commit acb364df89862758d592ab23a20d018795fff5e9
parent 7edf2ee5e60e096d457b441466e7a3d5ed7f67b2
Author: Andrew Laack <andrew@laack.co>
Date: Sat, 27 Sep 2025 21:08:04 -0500
Got basic site stuff working
Diffstat:
10 files changed, 518 insertions(+), 210 deletions(-)
diff --git a/posts/entries/sustainability-of-youtube.md b/posts/entries/sustainability-of-youtube.md
@@ -0,0 +1,100 @@
+# The Sustainability of YouTube
+
+## Context
+
+I dislike using cloud services because they may discontinue my service [1] or they may do something stupid [2] that negatively impacts me. These concerns, along with concerns about privacy [3], have led me to keep information and content I care about away from cloud services. This does make me wonder, how many people would be distraught about the loss of their content if YouTube terminated their accounts? This is not the topic today, nor is it something I can easily answer, but it is something I wonder about and would like others to consider.
+
+Similarly, I am skeptical of 'free' services. It's incorrect to say "if something is free, you are the product" because charity does exist, but when it comes to Google, they aren't a charity. Their current model with YouTube is to have people upload videos to their site and show ads to some users when they watch said videos. There are also paid subscriptions, but their primary monetization comes from ads. An important point is they don't purge content on a regular basis, except in cases of ToS violations. As such, there is a (nearly) monotonically increasing function that describes the storage requirements of YouTube. This motivates my question below.
+
+## Question
+
+When will YouTube's storage costs exceed their profit if they don't start purging old content, assuming their revenue does not increase over time?
+
+## How to Answer This Question
+
+We need the following information to answer this question:
+
+- What is YouTube's annual net profit?
+- How much data does YouTube store?
+- How much does data storage cost?
+
+## YouTube's Profit
+
+According to Alphabet's 2025 Q2 earnings release [4], YouTube ads generated $9.769 billion in revenue that quarter. Annualized, this is $39.076 billion, but this is only revenue, not net profit. If we assume YouTube's operating margin matches the operating margin across Alphabet (32%), we find an approximate net profit of $12.50432 billion / year. Actual net profit could differ from this, but since we are concerned with how much data storage this could support, we don't need to factor in how this would be taxed.
+
+## Storage Needs
+
+### Total Videos
+
+YouTube states on their official blog there are over 20 million videos uploaded per day [5]. While I don't trust YouTube very much, and they don't have many incentives to be honest on this topic, they seem more trustworthy in this context than the slop factory sites as they are, in fact, the ones who are hosting the content. As such, I will accept this metric.
+
+### Average Video Size
+
+I wrote a Python script that uses a curated list of popular Google Trends searches over the past few decades [6] to search YouTube for recently uploaded videos. I ran this script and compiled a list of ~7.65 million YouTube videos.
+
+Before continuing, I will list a few limitations of this approach:
+
+- YouTube likely imposes some amount of algorithmic filtering when sorting by 'recently uploaded'
+- The videos in question are all public (not inclusive of private/unlisted videos)
+- Less popular search terms may have a different distribution of video sizes
+
+These are the main flaws in my methodology, but any approach will be imperfect without being able to get the data directly from YouTube.
+
+Of these 7.65 million videos, I sampled 615,222 of them and queried YouTube using `yt-dlp` [7] to find all video resolutions and formats YouTube will serve.
+It seems unlikely to me that YouTube stores each of these resolutions on their servers, but I think it is very likely that YouTube is storing the highest resolution version they are willing to serve to users.
+
+Based on my findings, I propose a lower bound of ~396.17 MB / video, which assumes they are only storing the highest resolution version and all other versions are generated in real time via transcoding (I am confident this isn't the case, but it provides a nice lower bound). I also propose an upper bound of ~1.44 GB / video, which assumes they are storing every resolution and format for each video they are serving.
+
+All of the code used for this is available on my git server [8].
+
+### Annual Storage Increase
+
+Using my findings above about video size and YouTube's stated video upload rate, we find:
+
+Lower bound:
+
+- 7.923 PB / Day
+- 2.89 EB / Year
+
+Upper bound:
+
+- 28.895 PB / Day
+- 10.547 EB / Year
+
+Note: These values may vary depending on rounding, but they should be similar to what anyone else would find.
+
+## Storage Cost by Volume
+
+GCP currently charges $26 / month for 1 TB of standard multi-region, US-based cloud storage [9]. If we assume the same 32% margin as before, Google's underlying cost would be ~$17.68 / TB / month or $212.16 / TB / year. I don't know if this is high or low relative to what they actually pay. YouTube requires quick access to many of their videos, but many of their videos are likely retrieved infrequently. It also seems likely Alphabet's cloud storage margins are higher than the average margins across the organization. Finally, these are only US storage prices, so the cost could vary depending on the regions this data is hosted in. In any case, I think this is a fair estimate.
+
+## Answer to the Question
+
+Given YouTube's approximated net profit of $12.50432 billion / year and an estimated cost of $212.16 / TB / year for cloud storage, we find their profits can support an additional ~58.94 EB of data.
+
+At the lower bound of 2.89 EB / year we find YouTube's storage costs will surpass their current profits in ~20.39 years.
+
+If we assume our upper bound of 10.547 EB / year we find YouTube's storage costs will surpass their current profits in ~5.59 years.
+
+## Conclusion
+
+These are very rough bounds, especially given how difficult it is to estimate the cost per TB / year of storing this data given their retrieval needs, but we find that in roughly 5.59 to 20.39 years, YouTube will be forced to start purging old content to remain profitable, assuming profit stays at its current level.
+
+## Citations
+
+[1] - [https://killedbygoogle.com/](https://killedbygoogle.com/)
+
+[2] - [https://arstechnica.com/gadgets/2024/05/google-cloud-accidentally-nukes-customer-account-causes-two-weeks-of-downtime/](https://arstechnica.com/gadgets/2024/05/google-cloud-accidentally-nukes-customer-account-causes-two-weeks-of-downtime/)
+
+[3] - [https://www.gnu.org/proprietary/proprietary-surveillance.html](https://www.gnu.org/proprietary/proprietary-surveillance.html)
+
+[4] - [https://abc.xyz/assets/cc/27/3ada14014efbadd7a58472f1f3f4/2025q2-alphabet-earnings-release.pdf](https://abc.xyz/assets/cc/27/3ada14014efbadd7a58472f1f3f4/2025q2-alphabet-earnings-release.pdf)
+
+[5] - [https://web.archive.org/web/20250911091711/https://blog.youtube/press/](https://web.archive.org/web/20250911091711/https://blog.youtube/press/)
+
+[6] - [https://www.kaggle.com/datasets/dhruvildave/google-trends-dataset](https://www.kaggle.com/datasets/dhruvildave/google-trends-dataset)
+
+[7] - [https://github.com/yt-dlp/yt-dlp](https://github.com/yt-dlp/yt-dlp)
+
+[8] - [http://git.laack.co/blog/log.html](http://git.laack.co/blog/log.html)
+
+[9] - [https://cloud.google.com/storage/pricing#multi-regions](https://cloud.google.com/storage/pricing#multi-regions)
diff --git a/posts/wikipedia-and-truth-on-the-internet.md b/posts/entries/wikipedia-and-truth-on-the-internet.md
diff --git a/posts/index.md b/posts/index.md
@@ -1,8 +0,0 @@
-
-
-
-## Most Recent Blog Posts
-
-=> https://blog.laack.co/posts/wikipedia-and-truth-on-the-internet.gmi Truth on the Internet
-=> https://blog.laack.co/posts/sustainability-of-youtube.gmi The Sustainability of YouTube
-
diff --git a/posts/site/style.css b/posts/site/style.css
@@ -0,0 +1,97 @@
+ .spacer{
+ height: 50px;
+ }
+ .pgp-container {
+ text-align: center;
+ padding: 20px;
+ margin-top: 20px;
+ background-color: #ffffff;
+ width: 100%;
+ max-width: 900px;
+ margin-left: auto;
+ margin-right: auto;
+ }
+
+ .pgp-key {
+ white-space: pre-wrap;
+ word-wrap: break-word;
+ }
+ body {
+ font-family:'Lucida Console', monospace;
+ font-size: 18px;
+ color: #111;
+ background-color: white;
+ max-width: 1280px;
+ margin: 0 auto;
+ padding: 0 50px;
+ box-sizing: border-box;
+ }
+ .landing {
+ display: flex;
+ flex-direction: column;
+ align-items: center;
+ justify-content: center;
+ min-height:100vh; min-height:100dvh;
+ text-align: center;
+ }
+ svg { width: 100%; height: auto; }
+ .svg-container { width: 100%; max-width: 400px; display: inline-block; }
+ .separator { letter-spacing: -9px; }
+ hr { border: none; height: 2px; background-color: lightgrey; width: 100%; margin: 30px auto; }
+ table { border-collapse: collapse; text-align: center; width: 100%; }
+ td {
+ padding: 5px 20px;
+ border: 1px solid #dddddd;
+ }
+ .quiet, .links-container a { color: inherit; }
+ .links-container { font-size: 24px; margin: 20px; }
+ .links-container a { text-decoration: none; margin: 0 10px; }
+ .product-photo { display: block; max-height: 640px; width: 100%; margin: 0 auto; }
+ .faqtable dt { margin-bottom: 0.75em; font-style: italic; }
+ .faqtable dd { margin-left: 0; margin-bottom: 2em; }
+
+h1 {
+ font-family:'Lucida Console', monospace;
+ font-size: 2.25rem;
+ margin-bottom: 20px;
+ text-align: left;
+}
+
+/* Styling for the navigation container */
+.navigation {
+ display: flex;
+ justify-content: left;
+ align-items: center;
+ gap: 15px;
+ font-size: 1.25rem;
+ font-family:'Lucida Console', monospace;
+}
+
+/* Styling for the links */
+.navigation a {
+ text-decoration: none;
+ color: #333; /* Dark gray for a clean, modern look */
+ transition: color 0.3s;
+}
+
+.navigation a:hover {
+ color: #007bff;
+}
+
+a {
+ color: #007bff;
+ text-decoration: none;
+}
+
+a:hover {
+ color: #0056b3;
+ text-decoration: underline;
+}
+
+
+/* Styling for the separator */
+.separator {
+ color: #ccc; /* Lighter gray for separator */
+ font-weight: normal;
+ margin: 0 5px;
+}
diff --git a/posts/site/sustainability-of-youtube.md.html b/posts/site/sustainability-of-youtube.md.html
@@ -0,0 +1,167 @@
+<!DOCTYPE html>
+<html xmlns="http://www.w3.org/1999/xhtml" lang="" xml:lang="">
+<head>
+ <meta charset="utf-8" />
+ <meta name="generator" content="pandoc" />
+ <meta name="viewport" content="width=device-width, initial-scale=1.0, user-scalable=yes" />
+ <title>sustainability-of-youtube</title>
+ <style>
+ code{white-space: pre-wrap;}
+ span.smallcaps{font-variant: small-caps;}
+ div.columns{display: flex; gap: min(4vw, 1.5em);}
+ div.column{flex: auto; overflow-x: auto;}
+ div.hanging-indent{margin-left: 1.5em; text-indent: -1.5em;}
+ /* The extra [class] is a hack that increases specificity enough to
+ override a similar rule in reveal.js */
+ ul.task-list[class]{list-style: none;}
+ ul.task-list li input[type="checkbox"] {
+ font-size: inherit;
+ width: 0.8em;
+ margin: 0 0.8em 0.2em -1.6em;
+ vertical-align: middle;
+ }
+ .display.math{display: block; text-align: center; margin: 0.5rem auto;}
+ </style>
+ <link rel="stylesheet" href="style.css" />
+</head>
+<body>
+<h1 id="the-sustainability-of-youtube">The Sustainability of
+YouTube</h1>
+<h2 id="context">Context</h2>
+<p>I dislike using cloud services because they may discontinue my
+service [1] or they may do something stupid [2] that negatively impacts
+me. These concerns, along with concerns about privacy [3], have led me
+to keep information and content I care about away from cloud services.
+This does make me wonder, how many people would be distraught about the
+loss of their content if YouTube terminated their accounts? This is not
+the topic today, nor is it something I can easily answer, but it is
+something I wonder about and would like others to consider.</p>
+<p>Similarly, I am skeptical of ‘free’ services. It’s incorrect to say
+“if something is free, you are the product” because charity does exist,
+but when it comes to Google, they aren’t a charity. Their current model
+with YouTube is to have people upload videos to their site and show ads
+to some users when they watch said videos. There are also paid
+subscriptions, but their primary monetization comes from ads. An
+important point is they don’t purge content on a regular basis, except
+in cases of ToS violations. As such, there is a (nearly) monotonically
+increasing function that describes the storage requirements of YouTube.
+This motivates my question below.</p>
+<h2 id="question">Question</h2>
+<p>When will YouTube’s storage costs exceed their profit if they don’t
+start purging old content, assuming their revenue does not increase over
+time?</p>
+<h2 id="how-to-answer-this-question">How to Answer This Question</h2>
+<p>We need the following information to answer this question:</p>
+<ul>
+<li>What is YouTube’s annual net profit?</li>
+<li>How much data does YouTube store?</li>
+<li>How much does data storage cost?</li>
+</ul>
+<h2 id="youtubes-profit">YouTube’s Profit</h2>
+<p>According to Alphabet’s 2025 Q2 earnings release [4], YouTube ads
+generated $9.769 billion in revenue that quarter. Annualized, this is
+$39.076 billion, but this is only revenue, not net profit. If we assume
+YouTube’s operating margin matches the operating margin across Alphabet
+(32%), we find an approximate net profit of $12.50432 billion / year.
+Actual net profit could differ from this, but since we are concerned
+with how much data storage this could support, we don’t need to factor
+in how this would be taxed.</p>
+<h2 id="storage-needs">Storage Needs</h2>
+<h3 id="total-videos">Total Videos</h3>
+<p>YouTube states on their official blog there are over 20 million
+videos uploaded per day [5]. While I don’t trust YouTube very much, and
+they don’t have many incentives to be honest on this topic, they seem
+more trustworthy in this context than the slop factory sites as they
+are, in fact, the ones who are hosting the content. As such, I will
+accept this metric.</p>
+<h3 id="average-video-size">Average Video Size</h3>
+<p>I wrote a Python script that uses a curated list of popular Google
+Trends searches over the past few decades [6] to search YouTube for
+recently uploaded videos. I ran this script and compiled a list of ~7.65
+million YouTube videos.</p>
+<p>Before continuing, I will list a few limitations of this
+approach:</p>
+<ul>
+<li>YouTube likely imposes some amount of algorithmic filtering when
+sorting by ‘recently uploaded’</li>
+<li>The videos in question are all public (not inclusive of
+private/unlisted videos)</li>
+<li>Less popular search terms may have a different distribution of video
+sizes</li>
+</ul>
+<p>These are the main flaws in my methodology, but any approach will be
+imperfect without being able to get the data directly from YouTube.</p>
+<p>Of these 7.65 million videos, I sampled 615,222 of them and queried
+YouTube using <code>yt-dlp</code> [7] to find all video resolutions and
+formats YouTube will serve. It seems unlikely to me that YouTube stores
+each of these resolutions on their servers, but I think it is very
+likely that YouTube is storing the highest resolution version they are
+willing to serve to users.</p>
+<p>Based on my findings, I propose a lower bound of ~396.17 MB / video,
+which assumes they are only storing the highest resolution version and
+all other versions are generated in real time via transcoding (I am
+confident this isn’t the case, but it provides a nice lower bound). I
+also propose an upper bound of ~1.44 GB / video, which assumes they are
+storing every resolution and format for each video they are serving.</p>
+<p>All of the code used for this is available on my git server [8].</p>
+<h3 id="annual-storage-increase">Annual Storage Increase</h3>
+<p>Using my findings above about video size and YouTube’s stated video
+upload rate, we find:</p>
+<p>Lower bound:</p>
+<ul>
+<li>7.923 PB / Day</li>
+<li>2.89 EB / Year</li>
+</ul>
+<p>Upper bound:</p>
+<ul>
+<li>28.895 PB / Day</li>
+<li>10.547 EB / Year</li>
+</ul>
+<p>Note: These values may vary depending on rounding, but they should be
+similar to what anyone else would find.</p>
+<h2 id="storage-cost-by-volume">Storage Cost by Volume</h2>
+<p>GCP currently charges $26 / month for 1 TB of standard multi-region,
+US-based cloud storage [9]. If we assume the same 32% margin as before,
+Google’s underlying cost would be ~$17.68 / TB / month or $212.16 / TB /
+year. I don’t know if this is high or low relative to what they actually
+pay. YouTube requires quick access to many of their videos, but many of
+their videos are likely retrieved infrequently. It also seems likely
+Alphabet’s cloud storage margins are higher than the average margins
+across the organization. Finally, these are only US storage prices, so
+the cost could vary depending on the regions this data is hosted in. In
+any case, I think this is a fair estimate.</p>
+<h2 id="answer-to-the-question">Answer to the Question</h2>
+<p>Given YouTube’s approximated net profit of $12.50432 billion / year
+and an estimated cost of $212.16 / TB / year for cloud storage, we find
+their profits can support an additional ~58.94 EB of data.</p>
+<p>At the lower bound of 2.89 EB / year we find YouTube’s storage costs
+will surpass their current profits in ~20.39 years.</p>
+<p>If we assume our upper bound of 10.547 EB / year we find YouTube’s
+storage costs will surpass their current profits in ~5.59 years.</p>
+<h2 id="conclusion">Conclusion</h2>
+<p>These are very rough bounds, especially given how difficult it is to
+estimate the cost per TB / year of storing this data given their
+retrieval needs, but we find that in roughly 5.59 to 20.39 years,
+YouTube will be forced to start purging old content to remain
+profitable, assuming profit stays at its current level.</p>
+<h2 id="citations">Citations</h2>
+<p>[1] - <a
+href="https://killedbygoogle.com/">https://killedbygoogle.com/</a></p>
+<p>[2] - <a
+href="https://arstechnica.com/gadgets/2024/05/google-cloud-accidentally-nukes-customer-account-causes-two-weeks-of-downtime/">https://arstechnica.com/gadgets/2024/05/google-cloud-accidentally-nukes-customer-account-causes-two-weeks-of-downtime/</a></p>
+<p>[3] - <a
+href="https://www.gnu.org/proprietary/proprietary-surveillance.html">https://www.gnu.org/proprietary/proprietary-surveillance.html</a></p>
+<p>[4] - <a
+href="https://abc.xyz/assets/cc/27/3ada14014efbadd7a58472f1f3f4/2025q2-alphabet-earnings-release.pdf">https://abc.xyz/assets/cc/27/3ada14014efbadd7a58472f1f3f4/2025q2-alphabet-earnings-release.pdf</a></p>
+<p>[5] - <a
+href="https://web.archive.org/web/20250911091711/https://blog.youtube/press/">https://web.archive.org/web/20250911091711/https://blog.youtube/press/</a></p>
+<p>[6] - <a
+href="https://www.kaggle.com/datasets/dhruvildave/google-trends-dataset">https://www.kaggle.com/datasets/dhruvildave/google-trends-dataset</a></p>
+<p>[7] - <a
+href="https://github.com/yt-dlp/yt-dlp">https://github.com/yt-dlp/yt-dlp</a></p>
+<p>[8] - <a
+href="http://git.laack.co/blog/log.html">http://git.laack.co/blog/log.html</a></p>
+<p>[9] - <a
+href="https://cloud.google.com/storage/pricing#multi-regions">https://cloud.google.com/storage/pricing#multi-regions</a></p>
+</body>
+</html>
diff --git a/posts/site/wikipedia-and-truth-on-the-internet.md.html b/posts/site/wikipedia-and-truth-on-the-internet.md.html
@@ -0,0 +1,139 @@
+<!DOCTYPE html>
+<html xmlns="http://www.w3.org/1999/xhtml" lang="" xml:lang="">
+<head>
+ <meta charset="utf-8" />
+ <meta name="generator" content="pandoc" />
+ <meta name="viewport" content="width=device-width, initial-scale=1.0, user-scalable=yes" />
+ <title>wikipedia-and-truth-on-the-internet</title>
+ <style>
+ code{white-space: pre-wrap;}
+ span.smallcaps{font-variant: small-caps;}
+ div.columns{display: flex; gap: min(4vw, 1.5em);}
+ div.column{flex: auto; overflow-x: auto;}
+ div.hanging-indent{margin-left: 1.5em; text-indent: -1.5em;}
+ /* The extra [class] is a hack that increases specificity enough to
+ override a similar rule in reveal.js */
+ ul.task-list[class]{list-style: none;}
+ ul.task-list li input[type="checkbox"] {
+ font-size: inherit;
+ width: 0.8em;
+ margin: 0 0.8em 0.2em -1.6em;
+ vertical-align: middle;
+ }
+ .display.math{display: block; text-align: center; margin: 0.5rem auto;}
+ </style>
+ <link rel="stylesheet" href="style.css" />
+</head>
+<body>
+<h1 id="truth-on-the-internet">Truth on the Internet</h1>
+<h2 id="claim">Claim</h2>
+<p>Wikipedia is an okay source for purely factual information like
+mathematical topics, CS topics, etc., but it is not a good source for
+other kinds of information. Similarly, there are other internet sources
+that are good for academic information, but there is a lack of rigor in
+most other topics.</p>
+<h2 id="reasoning">Reasoning</h2>
+<p>Wikipedia has been a good source of information on computer science
+and math topics for me. In general, articles are correct and I
+appreciate the formatting of the site working in my preferred web
+browser, lynx. When it comes to more subjective topics, it has not been
+a very good source for me. This applies to the broader internet as
+well, although it is not consistently good even on academic topics.</p>
+<h3 id="gemini">Gemini</h3>
+<p>Consider the Gemini protocol article on Wikipedia [1]. Most of the
+information in the article is acceptable, discussing actual information
+about the protocol, its history, and other related concepts, but the
+reception section is quite problematic. I will allow you to read it
+yourselves.</p>
+<blockquote>
+<p>Gemini is praised for its simplicity but criticized for “excluding
+people who use ordinary web browsers”. Gemini’s usefulness has been said
+to be “dependent on the kinds of content available on Gemini and whether
+it appeals or not”. Stéphane Bortzmeyer has said Gemini is retro but
+with modern features. Daniel Stenberg reviewed the 0.16.1 protocol spec,
+and criticized it as weak on security (Trust on first use) and slow in
+performance (short-lived bursty TCP connections) if it was ever used to
+transfer resource heavy HTML pages.[13] Gemini pages are usually
+downloaded as gemtext only without requesting fonts or linked resources
+such as images.</p>
+</blockquote>
+<p>Let’s break down a few things.</p>
+<h4 id="gemini-excludes-people">Gemini Excludes People</h4>
+<blockquote>
+<p>Gemini is praised for its simplicity but criticized for “excluding
+people who use ordinary web browsers”.</p>
+</blockquote>
+<p>A correction to this may be “it was once criticized by an individual
+for being exclusive, but said individual was misinformed”. The only
+exclusion that is happening is browser makers do not support the
+protocol. There is no reason a normal web browser can’t support Gemini,
+and no one who is developing the Gemini protocol pushes back against its
+adoption in major web browsers.</p>
+<p>Such a criticism is similar to criticizing HTTP because none of the
+Gemini browsers support it. Gemini is an exceptionally easy protocol to
+implement and to criticize it for exclusivity is a bit silly.</p>
+<h4 id="security-concerns">Security Concerns</h4>
+<blockquote>
+<p>Daniel Stenberg reviewed the 0.16.1 protocol spec, and criticized it
+as weak on security (Trust on first use)</p>
+</blockquote>
+<p>I view his first criticism as a selling point of Gemini. TOFU is how
+I believe the internet should work, or maybe DANE, but certainly not
+CAs. CAs are antithetical to the spirit of the internet. The internet is
+supposed to be free of authorities, but when CAs are considered the
+authority on who is able to host a website with HTTPS, which is
+functionally a necessity to have a voice on the internet, we do have
+arbiters of who can speak, and we lose freedom.</p>
+<p>I think the term CA is a misnomer because there is nothing
+authoritative about CAs, and there is no authority on the internet.
+Furthermore, there have been many CA incidents in the past [2] [3],
+which is to be expected when authorities are appointed.</p>
+<h4 id="slowness">Slowness</h4>
+<blockquote>
+<p>slow in performance (short-lived bursty TCP connections) if it was
+ever used to transfer resource heavy HTML pages.[13] Gemini pages are
+usually downloaded as gemtext only without requesting fonts or linked
+resources such as images.</p>
+</blockquote>
+<p>I seem to be missing something here. A sportscar is slow when it is
+towing a tree up a hill, but that doesn’t make a sportscar slow. If the
+Gemini protocol were saddled with the burden of HTML then yes, it would
+be slow, but that’s the thing, it’s not. It’s made for gemtext not
+HTML/JS soydevery.</p>
+<h3 id="youtube-video-uploading-metrics">YouTube Video Uploading
+Metrics</h3>
+<p>YouTube video upload metrics are what prompted me to write this post.
+I was doing research for a project where I wanted to determine how long
+YouTube’s business model of not deleting any videos on the platform,
+apart from those breaking ToS, could be sustained [4]. I started with a
+DuckDuckGo search and the top search result claimed that in February
+2025, 2.6 million videos were uploaded every day [5]. This site then
+linked to another site [6] which made no such assertion. I returned to
+the original site to realize I’d been tricked. I should’ve known the
+domain SEO.ai was an AI slop SEO farming site, but it was the top one. I
+then went to Google to see what results I would get there. The same
+result showed at the top, but this time I got an AI summary which stated
+the exact same hallucinated claim.</p>
+<p>Truth can’t be found on the open web anymore. There are too many
+layers of nonsense. We have an AI summarizing an AI that summarized an
+unsubstantiated blog post that didn’t even make the claim the first AI
+summary thought it did.</p>
+<h2 id="what-can-be-done">What Can Be Done</h2>
+<p>At present, we can stop, or limit, our usage of search engines. I
+plan to replace search engines with going directly to sites that I know
+are useful, using RSS, and trying to reference books as much as
+possible. It’s modestly disappointing, but good options are lacking
+right now. If you have any thoughts on this, feel free to email me!</p>
+<h2 id="sources">Sources</h2>
+<p>[1] - <a
+href="https://en.wikipedia.org/wiki/Gemini_(protocol)">Gemini Protocol Wikipedia Article</a></p>
+<p>[2] - <a href="https://en.wikipedia.org/wiki/DigiNotar">CA Hacked</a></p>
+<p>[3] - <a
+href="https://sslmate.com/resources/certificate_authority_failures">Even More CA Issues</a></p>
+<p>[4] - <a href="gemini://blog.laack.co/sustainability-of-youtube.gmi">The Sustainability of YouTube</a></p>
+<p>[5] - <a
+href="https://web.archive.org/web/20250814122654/https://seo.ai/blog/how-many-videos-are-on-youtube">AI Summary of an AI Summary</a></p>
+<p>[6] - <a
+href="https://web.archive.org/web/20250304100048/https://photutorial.com/how-many-videos-on-youtube/">AI Summary of a Poorly Researched Blog Post</a></p>
+</body>
+</html>
diff --git a/posts/style.css b/posts/style.css
@@ -1,99 +0,0 @@
- <style>
- .spacer{
- height: 50px;
- }
- .pgp-container {
- text-align: center;
- padding: 20px;
- margin-top: 20px;
- background-color: #ffffff;
- width: 100%;
- max-width: 900px;
- margin-left: auto;
- margin-right: auto;
- }
-
- .pgp-key {
- white-space: pre-wrap;
- word-wrap: break-word;
- }
- body {
- font-family:'Lucida Console', monospace;
- font-size: 18px;
- color: #111;
- background-color: white;
- max-width: 1280px;
- margin: 0 auto;
- padding: 0 50px;
- box-sizing: border-box;
- }
- .landing {
- display: flex;
- flex-direction: column;
- align-items: center;
- justify-content: center;
- min-height:100vh; min-height:100dvh;
- text-align: center;
- }
- svg { width: 100%; height: auto; }
- .svg-container { width: 100%; max-width: 400px; display: inline-block; }
- .separator { letter-spacing: -9px; }
- hr { border: none; height: 2px; background-color: lightgrey; width: 100%; margin: 30px auto; }
- table { border-collapse: collapse; text-align: center; width: 100%; }
- td {
- padding: 5px 20px;
- border: 1px solid #dddddd;
- }
- .quiet, .links-container a { color: inherit; }
- .links-container { font-size: 24px; margin: 20px; }
- .links-container a { text-decoration: none; margin: 0 10px; }
- .product-photo { display: block; max-height: 640px; width: 100%; margin: 0 auto; }
- .faqtable dt { margin-bottom: 0.75em; font-style: italic; }
- .faqtable dd { margin-left: 0; margin-bottom: 2em; }
-
-h1 {
- font-family:'Lucida Console', monospace;
- font-size: 2.25rem;
- margin-bottom: 20px;
- text-align: left;
-}
-
-/* Styling for the navigation container */
-.navigation {
- display: flex;
- justify-content: left;
- align-items: center;
- gap: 15px;
- font-size: 1.25rem;
- font-family:'Lucida Console', monospace;
-}
-
-/* Styling for the links */
-.navigation a {
- text-decoration: none;
- color: #333; /* Dark gray for a clean, modern look */
- transition: color 0.3s;
-}
-
-.navigation a:hover {
- color: #007bff;
-}
-
-a {
- color: #007bff;
- text-decoration: none;
-}
-
-a:hover {
- color: #0056b3;
- text-decoration: underline;
-}
-
-
-/* Styling for the separator */
-.separator {
- color: #ccc; /* Lighter gray for separator */
- font-weight: normal;
- margin: 0 5px;
-}
- </style>
diff --git a/posts/sustainability-of-youtube.md b/posts/sustainability-of-youtube.md
@@ -1,100 +0,0 @@
-# The Sustainability of YouTube
-
-## Context
-
-I dislike using cloud services because they may discontinue my service [1] or they may do something stupid [2] that negatively impacts me. These concerns, along with concerns about privacy [3], have led me to keep information and content I care about away from cloud services. This does make me wonder, how many people would be distraught about the loss of their content if YouTube terminated their accounts? This is not the topic today, nor is it something I can easily answer, but it is something I wonder about and would like others to consider.
-
-Similarly, I am skeptical of 'free' services. It's incorrect to say "if something is free, you are the product" because charity does exist, but when it comes to Google, they aren't a charity. Their current model with YouTube is to have people upload videos to their site and show ads to some users when they watch said videos. There are also paid subscriptions, but their primary monetization comes from ads. An important point is they don't purge content on a regular basis, except in cases of ToS violations. As such, there is a (nearly) monotonically increasing function that describes the storage requirements of YouTube. This motivates my question below.
-
-## Question
-
-When will YouTube's storage costs exceed their revenue if they don't start purging old content, assuming their revenue does not increase over time?
-
-## How to Answer This Question
-
-We need the following information to answer this question:
-
-- What is YouTube's annual net profit?
-- How much data does YouTube store?
-- How much does data storage cost?
-
-## YouTube's Profit
-
-According to Alphabet's 2025 Q2 earnings release [4], YouTube ads made a revenue of $9.769 billion. Annualized, this is $39.076 billion, but this is only revenue, not net profit. If we assume the operating margin across Alphabet matches the operating margin of YouTube (32%), we find an approximate net profit of $12.50432 billion / year. Actual net profit could differ from this, but since we are concerned with how much data storage this could support, we don't need to factor in how this would be taxed.
-
-## Storage Needs
-
-### Total Videos
-
-YouTube states on their official blog there are over 20 million videos uploaded per day [5]. While I don't trust YouTube very much, and they don't have many incentives to be honest on this topic, they seem more trustworthy in this context than the slop factory sites as they are, in fact, the ones who are hosting the content. As such, I will accept this metric.
-
-### Average Video Size
-
-I wrote a Python script that uses a curated list of popular Google Trends searches over the past few decades [6] to search YouTube for recently uploaded videos. I ran this script and compiled a list of ~7.65 million YouTube videos.
-
-Before continuing, I will list a few limitations of this approach:
-
-- YouTube likely imposes some amount of algorithmic filtering when sorting by 'recently uploaded'
-- The videos in question are all public (not inclusive of private/unlisted videos)
-- Less popular search terms may have a different distribution of video sizes
-
-These are the main flaws in my methodology, but any approach will be imperfect without being able to get the data directly from YouTube.
-
-Of these 7.65 million videos, I sampled 615,222 of them and queried YouTube using `yt-dlp` [7] to find all video resolutions and formats YouTube will serve.
-It seems unlikely to me that YouTube stores each of these resolutions on their servers, but I think it is very likely that YouTube is storing the highest resolution version they are willing to serve to users.
-
-Based on my findings, I propose a lower bound of ~396.17 MB / video, which assumes they are only storing the highest resolution version and all other versions are generated in real time via transcoding (I am confident this isn't the case, but it provides a nice lower bound). I also propose an upper bound of ~1.44 GB / video, which assumes they are storing every resolution and format for each video they are serving.
-
-All of the code used for this is available on my git server [8].
-
-### Annual Storage Increase
-
-Using my findings above about video size and YouTube's stated video upload rate, we find:
-
-Lower bound:
-
-- 7.923 PB / Day
-- 2.89 EB / Year
-
-Upper bound:
-
-- 28.895 PB / Day
-- 10.547 EB / Year
-
-Note: These values may vary depending on rounding, but they should be similar to what anyone else would find.
-
-## Storage Cost by Volume
-
-GCP currently charges $26 / month for 1 TB of standard, multi-region, US-based cloud storage [9]. If we assume the same 32% profit margin as before, the underlying cost would be 68% of that price: ~$17.68 / TB / month or $212.16 / TB / year. I don't know if this is high or low relative to what they actually pay. YouTube requires quick access to many of their videos, but many others are likely retrieved infrequently. It also seems likely that Alphabet's cloud storage margins are higher than the average margin across the organization, and these are only US storage prices, so the cost could vary depending on the regions where the data is hosted. In any case, I think this is a fair estimate.
-
-## Answer to the Question
-
-Given YouTube's approximated net profit of $12.50432 billion / year and an estimated cost of $212.16 / TB / year for cloud storage, we find their profits can support an additional ~58.94 EB of data.
-
-At the lower bound of 2.89 EB / year, we find YouTube's storage costs will surpass their current profits in ~20.39 years.
-
-At the upper bound of 10.547 EB / year, we find YouTube's storage costs will surpass their current profits in ~5.59 years.
-
-## Conclusion
-
-These are very rough bounds, especially given how difficult it is to estimate the storage cost per TB per year given YouTube's retrieval needs, but we find that in roughly 5.59 to 20.39 years, YouTube will be forced to start purging old content to remain profitable at their current profit rate.
-
-## Citations
-
-[1] [https://killedbygoogle.com/](https://killedbygoogle.com/)
-
-[2] [https://arstechnica.com/gadgets/2024/05/google-cloud-accidentally-nukes-customer-account-causes-two-weeks-of-downtime/](https://arstechnica.com/gadgets/2024/05/google-cloud-accidentally-nukes-customer-account-causes-two-weeks-of-downtime/)
-
-[3] [https://www.gnu.org/proprietary/proprietary-surveillance.html](https://www.gnu.org/proprietary/proprietary-surveillance.html)
-
-[4] [https://abc.xyz/assets/cc/27/3ada14014efbadd7a58472f1f3f4/2025q2-alphabet-earnings-release.pdf](https://abc.xyz/assets/cc/27/3ada14014efbadd7a58472f1f3f4/2025q2-alphabet-earnings-release.pdf)
-
-[5] [https://web.archive.org/web/20250911091711/https://blog.youtube/press/](https://web.archive.org/web/20250911091711/https://blog.youtube/press/)
-
-[6] [https://www.kaggle.com/datasets/dhruvildave/google-trends-dataset](https://www.kaggle.com/datasets/dhruvildave/google-trends-dataset)
-
-[7] [https://github.com/yt-dlp/yt-dlp](https://github.com/yt-dlp/yt-dlp)
-
-[8] [http://git.laack.co/blog/log.html](http://git.laack.co/blog/log.html)
-
-[9] [https://cloud.google.com/storage/pricing#multi-regions](https://cloud.google.com/storage/pricing#multi-regions)
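The arithmetic in the post above can be reproduced with a short Python sketch. This is an illustration rather than the author's own script: the function and constant names are made up for this example, the inputs are the rounded figures quoted in the post, and it assumes decimal units (1 EB = 1,000,000 TB).

```python
# Hypothetical re-derivation of the post's estimate; names are illustrative only.

QUARTERLY_AD_REVENUE_USD = 9.769e9   # YouTube ads revenue, Alphabet 2025 Q2
ASSUMED_MARGIN = 0.32                # Alphabet-wide operating margin, assumed for YouTube
GCP_PRICE_USD_PER_TB_MONTH = 26.0    # list price quoted in the post
TB_PER_EB = 1_000_000                # decimal units (assumption)


def annual_profit_usd() -> float:
    """Annualize the quarterly revenue and apply the assumed margin."""
    return QUARTERLY_AD_REVENUE_USD * 4 * ASSUMED_MARGIN


def storage_cost_usd_per_tb_year() -> float:
    """Strip the assumed margin from the list price, then annualize."""
    return GCP_PRICE_USD_PER_TB_MONTH * (1 - ASSUMED_MARGIN) * 12


def supportable_storage_eb() -> float:
    """Storage (in EB) whose yearly cost equals the yearly profit."""
    return annual_profit_usd() / storage_cost_usd_per_tb_year() / TB_PER_EB


def years_until_costs_exceed_profit(growth_eb_per_year: float) -> float:
    """Years of uploads at a constant growth rate before cost reaches profit."""
    return supportable_storage_eb() / growth_eb_per_year


if __name__ == "__main__":
    # Growth rates are the post's rounded lower and upper bounds.
    for label, growth in (("lower", 2.89), ("upper", 10.547)):
        years = years_until_costs_exceed_profit(growth)
        print(f"{label} bound: {years:.2f} years")
```

Feeding in unrounded growth rates (for example 396.17 MB × 20 million videos × 365 days) shifts the results slightly, which is in line with the post's note about rounding.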
diff --git a/scripts/convert.sh b/scripts/convert.sh
@@ -0,0 +1,12 @@
+#!/bin/bash
+
+rm -f posts/site/*.html
+FILES=$(ls posts/entries)
+
+for FILE in $FILES
+do
+ if [[ $FILE =~ \.md$ ]] ; then
+ echo "$FILE"
+ pandoc "posts/entries/$FILE" -o "posts/site/$FILE.html" -s --css=style.css
+ fi
+done
diff --git a/scripts/indexer.py b/scripts/indexer.py
@@ -22,7 +22,7 @@ class File:
return self.time < other.time
def __str__(self):
- return f"=> https://blog.laack.co/{str(self.file_name)} {self.title}"
+ return f"- [{self.title.strip()}](https://blog.laack.co/{str(self.file_name)})\n"
files = []
@@ -37,7 +37,7 @@ files.sort()
file_out = ""
for file in files:
- if file.extension == "gmi" and file.file_name != "posts/index.gmi":
+ if file.extension == "md" and file.file_name != "posts/index.md":
file_out += str(file)
@@ -50,6 +50,6 @@ site = f"""
{file_out}
"""
-with open("posts/index.gmi", "w") as file:
+with open("posts/index.md", "w") as file:
file.write(site)