blog

commit ea9c85bee09565e32cfa16ef0226627064f5e12e
parent 1fcb4b3d7800e2d92b3f75e3c06f8f4bbf243d07
Author: Andrew Laack <andrew@laack.co>
Date:   Sun, 12 Oct 2025 09:11:25 -0500

Fixed links

Diffstat:
M posts/entries/stop-collecting-user-data.md | 16 +++++++---------
M posts/site/feed.xml | 4 ++--
M posts/site/stop-collecting-user-data.html | 15 +++++++--------
3 files changed, 16 insertions(+), 19 deletions(-)

diff --git a/posts/entries/stop-collecting-user-data.md b/posts/entries/stop-collecting-user-data.md
@@ -8,7 +8,7 @@ Sending the data of people who use applications you built, by default, for any p

 ## Why Does This Matter

-This matters because humans are trusting. It abuses this trust by tracking unnecessary data about application usage because most humans implicitly assume this is not being done, and they often don't understand what the consequences of this tracking can be [1][2]. Additionally, it is unreasonable to expect users to look through your source code, all of your settings, and your docs to understand what data is being collected. If data is being collected, it should be obvious based on the purpose of the application, and if it is not obvious that it must be collected for the application to work, this should be made explicitly clear to users in the most obvious way possible.
+This matters because humans are trusting. It abuses this trust by tracking unnecessary data about application usage because most humans implicitly assume this is not being done, and they often don't understand what the consequences of this tracking can be [1]. Additionally, it is unreasonable to expect users to look through your source code, all of your settings, and your docs to understand what data is being collected. If data is being collected, it should be obvious based on the purpose of the application, and if it is not obvious that it must be collected for the application to work, this should be made explicitly clear to users in the most obvious way possible.

 ## Counter Arguments

@@ -22,18 +22,16 @@ No, it isn't. GitHub (bleh) issues exists, Discord (ick) exists, Matrix exists,

 ## Towards a Solution

-Use applications that respect your privacy. If an application you are using collects your data and is not proprietary, it is quite likely there is a fork of it that strips out the data collection, see ungoogled-chromium [4] and LibreWolf [5] as examples. If one doesn't exist, consider making one.
+Use applications that respect your privacy. If an application you are using collects your data and is not proprietary, it is quite likely there is a fork of it that strips out the data collection, see ungoogled-chromium [2] and LibreWolf [3] as examples. If one doesn't exist, consider making one.

-If user-respecting alternatives don't exist and the application is proprietary, consider using WireShark [3] to see what domains the application is resolving. Once you find the data collection domains, add these domains to your /etc/hosts file or self-hosted DNS server (like a Pi-hole), and have them resolve to 0.0.0.0. This doesn't always work because the domain that is collecting data is sometimes used for to support the core functionallity of the application, but in an ideal world this should not be necessary as you shouldn't be using proprietary software to begin with.
+If user-respecting alternatives don't exist and the application is proprietary, consider using WireShark [4] to see what domains the application is resolving. Once you find the data collection domains, add these domains to your /etc/hosts file or self-hosted DNS server (like a Pi-hole), and have them resolve to 0.0.0.0. This doesn't always work because the domain that is collecting data is sometimes used for to support the core functionallity of the application, but in an ideal world this should not be necessary as you shouldn't be using proprietary software to begin with.

 ## Citations

-[1] - https://web.archive.org/web/20250929235200/https://www.forbes.com/sites/kashmirhill/2012/02/16/how-target-figured-out-a-teen-girl-was-pregnant-before-her-father-did/
+[1] - https://en.wikipedia.org/wiki/Cambridge_Analytica

-[2] - https://en.wikipedia.org/wiki/Cambridge_Analytica
+[2] - https://github.com/ungoogled-software/ungoogled-chromium

-[3] - https://www.wireshark.org/download.html
+[3] - https://librewolf.net/

-[4] - https://github.com/ungoogled-software/ungoogled-chromium
-
-[5] - https://librewolf.net/
+[4] - https://www.wireshark.org/download.html
diff --git a/posts/site/feed.xml b/posts/site/feed.xml
@@ -7,12 +7,12 @@
 <language>en-us</language>
 <managingEditor>andrew@laack.co</managingEditor>
 <webMaster>andrew@laack.co</webMaster>
-<lastBuildDate>Sun, 12 Oct 2025 01:44:35 -0500</lastBuildDate>
+<lastBuildDate>Sun, 12 Oct 2025 09:11:05 -0500</lastBuildDate>
 <atom:link href="https://blog.laack.co/feed.xml" rel="self" type="application/rss+xml"/>
 <item>
 <title><![CDATA[Stop Collecting User Data]]></title>
 <link>https://blog.laack.co/stop-collecting-user-data.html</link>
-<description><![CDATA[<h2 id="problem-statement">Problem Statement</h2><p>Sending the data of people who use applications you built, by default, for any purpose that is not strictly required for the application to function is morally wrong.</p><h2 id="why-does-this-matter">Why Does This Matter</h2><p>This matters because humans are trusting. It abuses this trust by tracking unnecessary data about application usage because most humans implicitly assume this is not being done, and they often don’t understand what the consequences of this tracking can be [1][2]. Additionally, it is unreasonable to expect users to look through your source code, all of your settings, and your docs to understand what data is being collected. If data is being collected, it should be obvious based on the purpose of the application, and if it is not obvious that it must be collected for the application to work, this should be made explicitly clear to users in the most obvious way possible.</p><h2 id="counter-arguments">Counter Arguments</h2><p><strong>But it is necessary to track errors so we can fix bugs and improve UX</strong></p><p>Yes, this is often the case. Does the Linux kernel collect logs? Yes! Do they upload them to a server for aggregation? No! This is how error logging should be done. Write your logs to a log file, but don’t automatically upload them to your servers. If a user has an issue that they would like addressed, they will let you know about it. If they don’t notice or don’t mind the issue, it’s their right to not report it. Some users may not want to deal with the hassle of uploading logs when things break, so they may prefer to have an option to automatically upload their logs. This is totally fine, but only if they are informed about what is being logged and it is an opt-in.</p><p><strong>But it is necessary to track usage to understand what users want</strong></p><p>No, it isn’t. GitHub (bleh) issues exists, Discord (ick) exists, Matrix exists, email exists, there are countless ways software projects crowd source improvements to their applications, but it should not be done using mass surveillance. I would argue it is acceptable to have an opt-in option to collect usage data, but I do wonder about the soundness of the minds of people who choose to opt-in to such surveillance.</p><h2 id="towards-a-solution">Towards a Solution</h2><p>Use applications that respect your privacy. If an application you are using collects your data and is not proprietary, it is quite likely there is a fork of it that strips out the data collection, see ungoogled-chromium [4] and LibreWolf [5] as examples. If one doesn’t exist, consider making one.</p><p>If user-respecting alternatives don’t exist and the application is proprietary, consider using WireShark [3] to see what domains the application is resolving. Once you find the data collection domains, add these domains to your /etc/hosts file or self-hosted DNS server (like a Pi-hole), and have them resolve to 0.0.0.0. This doesn’t always work because the domain that is collecting data is sometimes used for to support the core functionallity of the application, but in an ideal world this should not be necessary as you shouldn’t be using proprietary software to begin with.</p><h2 id="citations">Citations</h2><p>[1] - https://web.archive.org/web/20250929235200/https://www.forbes.com/sites/kashmirhill/2012/02/16/how-target-figured-out-a-teen-girl-was-pregnant-before-her-father-did/</p><p>[2] - https://en.wikipedia.org/wiki/Cambridge_Analytica</p><p>[3] - https://www.wireshark.org/download.html</p><p>[4] - https://github.com/ungoogled-software/ungoogled-chromium</p><p>[5] - https://librewolf.net/</p>]]></description>
+<description><![CDATA[<h2 id="problem-statement">Problem Statement</h2><p>Sending the data of people who use applications you built, by default, for any purpose that is not strictly required for the application to function is morally wrong.</p><h2 id="why-does-this-matter">Why Does This Matter</h2><p>This matters because humans are trusting. It abuses this trust by tracking unnecessary data about application usage because most humans implicitly assume this is not being done, and they often don’t understand what the consequences of this tracking can be [1]. Additionally, it is unreasonable to expect users to look through your source code, all of your settings, and your docs to understand what data is being collected. If data is being collected, it should be obvious based on the purpose of the application, and if it is not obvious that it must be collected for the application to work, this should be made explicitly clear to users in the most obvious way possible.</p><h2 id="counter-arguments">Counter Arguments</h2><p><strong>But it is necessary to track errors so we can fix bugs and improve UX</strong></p><p>Yes, this is often the case. Does the Linux kernel collect logs? Yes! Do they upload them to a server for aggregation? No! This is how error logging should be done. Write your logs to a log file, but don’t automatically upload them to your servers. If a user has an issue that they would like addressed, they will let you know about it. If they don’t notice or don’t mind the issue, it’s their right to not report it. Some users may not want to deal with the hassle of uploading logs when things break, so they may prefer to have an option to automatically upload their logs. This is totally fine, but only if they are informed about what is being logged and it is an opt-in.</p><p><strong>But it is necessary to track usage to understand what users want</strong></p><p>No, it isn’t. GitHub (bleh) issues exists, Discord (ick) exists, Matrix exists, email exists, there are countless ways software projects crowd source improvements to their applications, but it should not be done using mass surveillance. I would argue it is acceptable to have an opt-in option to collect usage data, but I do wonder about the soundness of the minds of people who choose to opt-in to such surveillance.</p><h2 id="towards-a-solution">Towards a Solution</h2><p>Use applications that respect your privacy. If an application you are using collects your data and is not proprietary, it is quite likely there is a fork of it that strips out the data collection, see ungoogled-chromium [2] and LibreWolf [3] as examples. If one doesn’t exist, consider making one.</p><p>If user-respecting alternatives don’t exist and the application is proprietary, consider using WireShark [4] to see what domains the application is resolving. Once you find the data collection domains, add these domains to your /etc/hosts file or self-hosted DNS server (like a Pi-hole), and have them resolve to 0.0.0.0. This doesn’t always work because the domain that is collecting data is sometimes used for to support the core functionallity of the application, but in an ideal world this should not be necessary as you shouldn’t be using proprietary software to begin with.</p><h2 id="citations">Citations</h2><p>[1] - https://en.wikipedia.org/wiki/Cambridge_Analytica</p><p>[2] - https://github.com/ungoogled-software/ungoogled-chromium</p><p>[3] - https://librewolf.net/</p><p>[4] - https://www.wireshark.org/download.html</p>]]></description>
 <pubDate>Sun, 12 Oct 2025 00:00:00 -0500</pubDate>
 <guid>https://blog.laack.co/stop-collecting-user-data.html</guid>
 </item>
diff --git a/posts/site/stop-collecting-user-data.html b/posts/site/stop-collecting-user-data.html
@@ -30,20 +30,19 @@
 <h2 id="problem-statement">Problem Statement</h2>
 <p>Sending the data of people who use applications you built, by default, for any purpose that is not strictly required for the application to function is morally wrong.</p>
 <h2 id="why-does-this-matter">Why Does This Matter</h2>
-<p>This matters because humans are trusting. It abuses this trust by tracking unnecessary data about application usage because most humans implicitly assume this is not being done, and they often don’t understand what the consequences of this tracking can be [1][2]. Additionally, it is unreasonable to expect users to look through your source code, all of your settings, and your docs to understand what data is being collected. If data is being collected, it should be obvious based on the purpose of the application, and if it is not obvious that it must be collected for the application to work, this should be made explicitly clear to users in the most obvious way possible.</p>
+<p>This matters because humans are trusting. It abuses this trust by tracking unnecessary data about application usage because most humans implicitly assume this is not being done, and they often don’t understand what the consequences of this tracking can be [1]. Additionally, it is unreasonable to expect users to look through your source code, all of your settings, and your docs to understand what data is being collected. If data is being collected, it should be obvious based on the purpose of the application, and if it is not obvious that it must be collected for the application to work, this should be made explicitly clear to users in the most obvious way possible.</p>
 <h2 id="counter-arguments">Counter Arguments</h2>
 <p><strong>But it is necessary to track errors so we can fix bugs and improve UX</strong></p>
 <p>Yes, this is often the case. Does the Linux kernel collect logs? Yes! Do they upload them to a server for aggregation? No! This is how error logging should be done. Write your logs to a log file, but don’t automatically upload them to your servers. If a user has an issue that they would like addressed, they will let you know about it. If they don’t notice or don’t mind the issue, it’s their right to not report it. Some users may not want to deal with the hassle of uploading logs when things break, so they may prefer to have an option to automatically upload their logs. This is totally fine, but only if they are informed about what is being logged and it is an opt-in.</p>
 <p><strong>But it is necessary to track usage to understand what users want</strong></p>
 <p>No, it isn’t. GitHub (bleh) issues exists, Discord (ick) exists, Matrix exists, email exists, there are countless ways software projects crowd source improvements to their applications, but it should not be done using mass surveillance. I would argue it is acceptable to have an opt-in option to collect usage data, but I do wonder about the soundness of the minds of people who choose to opt-in to such surveillance.</p>
 <h2 id="towards-a-solution">Towards a Solution</h2>
-<p>Use applications that respect your privacy. If an application you are using collects your data and is not proprietary, it is quite likely there is a fork of it that strips out the data collection, see ungoogled-chromium [4] and LibreWolf [5] as examples. If one doesn’t exist, consider making one.</p>
-<p>If user-respecting alternatives don’t exist and the application is proprietary, consider using WireShark [3] to see what domains the application is resolving. Once you find the data collection domains, add these domains to your /etc/hosts file or self-hosted DNS server (like a Pi-hole), and have them resolve to 0.0.0.0. This doesn’t always work because the domain that is collecting data is sometimes used for to support the core functionallity of the application, but in an ideal world this should not be necessary as you shouldn’t be using proprietary software to begin with.</p>
+<p>Use applications that respect your privacy. If an application you are using collects your data and is not proprietary, it is quite likely there is a fork of it that strips out the data collection, see ungoogled-chromium [2] and LibreWolf [3] as examples. If one doesn’t exist, consider making one.</p>
+<p>If user-respecting alternatives don’t exist and the application is proprietary, consider using WireShark [4] to see what domains the application is resolving. Once you find the data collection domains, add these domains to your /etc/hosts file or self-hosted DNS server (like a Pi-hole), and have them resolve to 0.0.0.0. This doesn’t always work because the domain that is collecting data is sometimes used for to support the core functionallity of the application, but in an ideal world this should not be necessary as you shouldn’t be using proprietary software to begin with.</p>
 <h2 id="citations">Citations</h2>
-<p>[1] - https://web.archive.org/web/20250929235200/https://www.forbes.com/sites/kashmirhill/2012/02/16/how-target-figured-out-a-teen-girl-was-pregnant-before-her-father-did/</p>
-<p>[2] - https://en.wikipedia.org/wiki/Cambridge_Analytica</p>
-<p>[3] - https://www.wireshark.org/download.html</p>
-<p>[4] - https://github.com/ungoogled-software/ungoogled-chromium</p>
-<p>[5] - https://librewolf.net/</p>
+<p>[1] - https://en.wikipedia.org/wiki/Cambridge_Analytica</p>
+<p>[2] - https://github.com/ungoogled-software/ungoogled-chromium</p>
+<p>[3] - https://librewolf.net/</p>
+<p>[4] - https://www.wireshark.org/download.html</p>
 </body>
 </html>