FWIW:

fabien@debian2080ti:/media/fabien/slowdisk$ ls -lhS offline_prep/
total 341G
-rw-r--r-- 1 fabien fabien 103G Jul 6 2024 wikipedia_en_all_maxi_2024-01.zim
-rw-r--r-- 1 fabien fabien 81G Apr 22 2023 gutenberg_mul_all_2023-04.zim
-rw-r--r-- 1 fabien fabien 75G Jul 7 2024 stackoverflow.com_en_all_2023-11.zim
-rw-r--r-- 1 fabien fabien 74G Mar 10 2024 planet-240304.osm.pbf
-rw-r--r-- 1 fabien fabien 3.8G Oct 18 06:55 debian-13.1.0-amd64-DVD-1.iso
-rw-r--r-- 1 fabien fabien 2.6G May 7 2023 ifixit_en_all_2023-04.zim
-rw-r--r-- 1 fabien fabien 1.6G May 7 2023 developer.mozilla.org_en_all_2023-02.zim
-rw-r--r-- 1 fabien fabien 931M May 7 2023 diy.stackexchange.com_en_all_2023-03.zim
-rw-r--r-- 1 fabien fabien 808M Jun 5 2023 wikivoyage_en_all_maxi_2023-05.zim
-rw-r--r-- 1 fabien fabien 296M Apr 30 2023 raspberrypi.stackexchange.com_en_all_2022-11.zim
-rw-r--r-- 1 fabien fabien 131M May 7 2023 rapsberry_pi_docs_2023-01.zim
-rw-r--r-- 1 fabien fabien 100M May 7 2023 100r-off-the-grid_en_2022-06.zim
-rw-r--r-- 1 fabien fabien 61M May 7 2023 quantumcomputing.stackexchange.com_en_all_2022-11.zim
-rw-r--r-- 1 fabien fabien 45M May 7 2023 computergraphics.stackexchange.com_en_all_2022-11.zim
-rw-r--r-- 1 fabien fabien 37M May 7 2023 wordnet_en_all_2023-04.zim
-rw-r--r-- 1 fabien fabien 23M Jul 17 2023 kiwix-tools_linux-armv6-3.5.0-1.tar.gz
-rw-r--r-- 1 fabien fabien 16M Oct 6 21:32 be-stib-gtfs.zip
-rw-r--r-- 1 fabien fabien 3.8M Oct 6 21:32 be-sncb-gtfs.zip
-rw-r--r-- 1 fabien fabien 2.3M May 7 2023 termux_en_all_maxi_2022-12.zim
-rw-r--r-- 1 fabien fabien 1.9M May 7 2023 kiwix-firefox_3.8.0.xpi
but if you want the easier version, just get Kiwix on whatever device is in front of you right now (yes, even a mobile phone, assuming you have the space), then get whatever content you need.
If you need a bit of help, I recorded "TechSovereignty at home, episode 11 - Offline Wikipedia, Kiwix and checksums" with a friend just 3 weeks ago.
I also wrote, and randomly update, https://fabien.benetou.fr/Content/Vademecum and coded https://git.benetou.fr/utopiah/offline-octopus but tbh KDE-Connect is much better now.
The point though is that building such a repository takes minutes. If you don’t have the space, buy a 512GB microSD for 50EUR, put that on it, stuff it in a drawer, then move on. If you want to, update it every 3 months or whenever you feel like it.
TL;DR: it takes longer to write such a meme than to actually do it.
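For the "update it every 3 months" part, here is a minimal sketch of how one could spot stale archives. It assumes the `_YYYY-MM.zim` suffix seen in the listing above is a reliable date stamp; the function names and the 3-month default are illustrative, not something Kiwix provides.

```python
import re
from datetime import date
from typing import Iterable, Iterator, Optional, Tuple

# Assumption: dated archives end in _YYYY-MM.zim, as in the listing above,
# e.g. wikipedia_en_all_maxi_2024-01.zim
ZIM_DATE = re.compile(r"_(\d{4})-(\d{2})\.zim$")


def months_old(filename: str, today: date) -> Optional[int]:
    """Return the age of a ZIM file in whole months, or None if the name has no date."""
    match = ZIM_DATE.search(filename)
    if match is None:
        return None
    year, month = int(match.group(1)), int(match.group(2))
    return (today.year - year) * 12 + (today.month - month)


def stale_files(
    filenames: Iterable[str], today: date, max_months: int = 3
) -> Iterator[Tuple[str, int]]:
    """Yield (name, age_in_months) for every dated file older than max_months."""
    for name in filenames:
        age = months_old(name, today)
        if age is not None and age > max_months:
            yield name, age


if __name__ == "__main__":
    names = [
        "wikipedia_en_all_maxi_2024-01.zim",
        "stackoverflow.com_en_all_2023-11.zim",
        "planet-240304.osm.pbf",  # no _YYYY-MM.zim suffix, so it is skipped
    ]
    for name, age in stale_files(names, date.today()):
        print(f"{name}: {age} months old")
```

Files without that suffix (the OSM extract, the ISO, the GTFS zips) are simply ignored, so they would still need a manual check.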
Watch out for flash data corruption. Lots of cheap flash (USB sticks, SD cards, SSDs) lose data after just a few years of offline storage. Something something quantum tunnel bullshit, iirc.
So either look for media that guarantee long cold storage retention (lots of businesses need to keep shit for 10 years for tax reasons), or occasionally plug it in and let it do the housekeeping.
Using older flash tech can be useful here. You might not always need the highest density storage if you want to maintain files for a long time. Getting stuff built on a much larger process node makes for a much more stable form of storage.
It’s more that NAND flash stores data as a small electric charge trapped in each cell. Over time, that charge leaks away. If you power the storage device every once in a while, the controller gets a chance to refresh the cells, which lowers the risk of losing data.
Here’s a video explaining why it happens to Wii U’s after being powered off for a while. https://youtu.be/JHME4zLs6Qs
Thanks but even though it’s on a plugged HDD I don’t even care for any of that data. What I mean is that none of that data is sensitive. It might be useful, potentially, but it’s not unique. What I mean is that if somehow my .zim file for Wikipedia was corrupted I could download it again from https://library.kiwix.org/#lang=eng&category=wikipedia or elsewhere in ~30min (just checked).
What I’m trying to highlight here is more the process than the actual outcome.
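As a rough sanity check on the "~30min" figure, the arithmetic below estimates the sustained bandwidth needed to fetch the 103G Wikipedia file in that time. It assumes the size comes from `ls -h`, which counts in powers of 1024; the function name is only illustrative.

```python
# ls -h reports sizes in powers of 1024, so "103G" means 103 GiB.
GIB = 1024 ** 3


def required_throughput(size_gib, minutes):
    """Return (MB/s, Mbit/s) needed to transfer size_gib within the given minutes."""
    bytes_per_second = size_gib * GIB / (minutes * 60)
    return bytes_per_second / 1e6, bytes_per_second * 8 / 1e6


mb_s, mbit_s = required_throughput(103, 30)
print(f"{mb_s:.1f} MB/s, {mbit_s:.0f} Mbit/s")  # 61.4 MB/s, 492 Mbit/s
```

So the 30-minute estimate implies a connection of roughly half a gigabit per second; on a slower line the re-download takes proportionally longer.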
TL;DR: yes, if one is actually serious about getting and storing data, they should periodically verify that it is still fine. What I do want to highlight though is to first know how to do it at all. Anyway, you are right that for a proper long-term solution one must understand how (cold) storage actually works. My heuristic is that it’s like canned food (which I don’t use much): it might last a while, but not forever.
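One way to do that periodic verification is with checksums. This is a minimal sketch rather than a definitive tool: it records a SHA-256 hash per file once, then re-hashes later to detect silent corruption. The `SHA256SUMS` manifest name and the function names are assumptions; the line format (hash, two spaces, filename) is meant to match what `sha256sum -c` reads.

```python
import hashlib
from pathlib import Path
from typing import List


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in 1 MiB chunks so a 100 GB archive never has to fit in RAM."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def write_manifest(directory: Path, manifest: Path) -> None:
    """Write one 'hash  name' line per regular file in directory (not recursive)."""
    lines = [
        f"{sha256_of(entry)}  {entry.name}"
        for entry in sorted(directory.iterdir())
        if entry.is_file() and entry != manifest
    ]
    manifest.write_text("\n".join(lines) + "\n")


def verify_manifest(directory: Path, manifest: Path) -> List[str]:
    """Return the names of files that are missing or whose hash no longer matches."""
    damaged = []
    for line in manifest.read_text().splitlines():
        if not line.strip():
            continue  # tolerate blank lines
        expected, name = line.split("  ", 1)
        target = directory / name
        if not target.is_file() or sha256_of(target) != expected:
            damaged.append(name)
    return damaged


if __name__ == "__main__":
    folder = Path("offline_prep")      # directory name taken from the listing
    manifest = folder / "SHA256SUMS"   # hypothetical manifest file name
    if folder.is_dir():
        if not manifest.exists():
            write_manifest(folder, manifest)
        print(verify_manifest(folder, manifest) or "all files match")
```

Run it once right after downloading to create the manifest, then again whenever the drive or card comes out of the drawer. Anything it reports can simply be downloaded again.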
I thought the point of backing stuff up was to have things in case just downloading it again isn’t a viable option?
It can be but not to me. To me the point is to test what’s actually feasible and usable. It can be Wikipedia on my HDD but it could also be SO on a microSD or a RPi … or it could be something totally different on another piece of hardware with another piece of storage. It will depend on the context.
So again, sure, having the data itself feels nice but in practice I never really needed it. If tomorrow my HDD would die I would shrug. If tomorrow Kiwix library wouldn’t work anymore, I’d be disappointed but I could rely on .zim files elsewhere, e.g. on torrent trackers.
IMHO the point isn’t files, the point is usable knowledge.
Edit: to be clear this isn’t philosophy, you can see exactly what I mean and even HOW I do it (and even when) in the edits of my public wiki or my git repositories.
Whoa, what are all those things you have?
Commenting inline:

-rw-r--r-- 1 fabien fabien 103G Jul 6 2024 wikipedia_en_all_maxi_2024-01.zim # encyclopedia Wikipedia English with images and more
-rw-r--r-- 1 fabien fabien 81G Apr 22 2023 gutenberg_mul_all_2023-04.zim # Project Gutenberg, book collection in multiple languages
-rw-r--r-- 1 fabien fabien 75G Jul 7 2024 stackoverflow.com_en_all_2023-11.zim # StackOverflow, programming questions and answers
-rw-r--r-- 1 fabien fabien 74G Mar 10 2024 planet-240304.osm.pbf # OpenStreetMap low resolution for the whole World
-rw-r--r-- 1 fabien fabien 3.8G Oct 18 06:55 debian-13.1.0-amd64-DVD-1.iso # Debian base ISO
-rw-r--r-- 1 fabien fabien 2.6G May 7 2023 ifixit_en_all_2023-04.zim # iFixit collection of guides to fix appliances
-rw-r--r-- 1 fabien fabien 1.6G May 7 2023 developer.mozilla.org_en_all_2023-02.zim # Web development documentation
-rw-r--r-- 1 fabien fabien 931M May 7 2023 diy.stackexchange.com_en_all_2023-03.zim # Do It Yourself Q&A
-rw-r--r-- 1 fabien fabien 808M Jun 5 2023 wikivoyage_en_all_maxi_2023-05.zim # WikiVoyage, the version of Wikipedia for traveling
-rw-r--r-- 1 fabien fabien 296M Apr 30 2023 raspberrypi.stackexchange.com_en_all_2022-11.zim # Raspberry Pi Q&A
-rw-r--r-- 1 fabien fabien 131M May 7 2023 rapsberry_pi_docs_2023-01.zim # Raspberry Pi documentation
-rw-r--r-- 1 fabien fabien 100M May 7 2023 100r-off-the-grid_en_2022-06.zim # Off the grid documents
-rw-r--r-- 1 fabien fabien 61M May 7 2023 quantumcomputing.stackexchange.com_en_all_2022-11.zim # Quantum computer Q&A
-rw-r--r-- 1 fabien fabien 45M May 7 2023 computergraphics.stackexchange.com_en_all_2022-11.zim # Computer graphics Q&A
-rw-r--r-- 1 fabien fabien 37M May 7 2023 wordnet_en_all_2023-04.zim # Graph of words in English
-rw-r--r-- 1 fabien fabien 23M Jul 17 2023 kiwix-tools_linux-armv6-3.5.0-1.tar.gz # Kiwix to read .zim files
-rw-r--r-- 1 fabien fabien 16M Oct 6 21:32 be-stib-gtfs.zip # public transport database in Brussels, Belgium
-rw-r--r-- 1 fabien fabien 3.8M Oct 6 21:32 be-sncb-gtfs.zip # train transport database in Belgium
-rw-r--r-- 1 fabien fabien 2.3M May 7 2023 termux_en_all_maxi_2022-12.zim # Termux, Linux tooling on Android, documentation in English
-rw-r--r-- 1 fabien fabien 1.9M May 7 2023 kiwix-firefox_3.8.0.xpi # Kiwix Web Extension for the Firefox browser