It's A Digital Disease!


This is a sub that aims to bring data hoarders together to share their passion with like-minded people.

1
 
 
The original post: /r/datahoarder by /u/masturbaiter696969 on 2025-07-12 04:44:12.

I am currently using two 1.5TB hard drives from 2009. They are very slow and their bad sector counts are starting to grow. Running sha256sum on them takes forever, and I feel like running it also wears them out quicker.

I am thinking about buying some BX500 SSDs or Western Digital Blue hard drives.

The computers will be on almost all the time. While the drives are not going to be mounted all the time, I think the power supply will still supply them with power, so I don't have to worry about SSD data retention. Is that correct?

I will be running sha256sum frequently, and from my understanding SSDs are not really worn out by reads while hard drives are. Is that correct?

I don't have money issues, so I'm leaning towards SSDs. Are there any downsides to SSDs in my situation?
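
For what it's worth, frequent integrity checks are usually easier with a stored checksum manifest than with one-off sha256sum runs; a minimal sketch, with the paths purely illustrative:

# Build a manifest once, after the data is in place (stored outside the tree being hashed)
find /mnt/archive -type f -print0 | xargs -0 sha256sum > ~/archive-manifest.sha256

# Later, verify the drive against the manifest; only failures are reported
sha256sum --check --quiet ~/archive-manifest.sha256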

2
 
 
The original post: /r/datahoarder by /u/SUCK-PIT on 2025-07-12 00:21:07.
3
 
 
The original post: /r/datahoarder by /u/First_Musician6260 on 2025-07-12 00:13:59.
4
 
 
The original post: /r/datahoarder by /u/zilexa on 2025-07-11 23:59:00.

Just wondering: I love Prowlarr + Sonarr + Radarr + QB, but is there a more simplified, potentially all-in-one app? One where you can simply add the shows/movies you want to watch, without first having to find public trackers in Prowlarr, integrate the apps with each other through their API keys and local IP addresses, etc.

I love the NZB360 app for Android (a very friendly umbrella GUI over all the *arr apps + QB) and I was just wondering why an app like that, one that does it all, doesn't exist.

5
 
 
The original post: /r/datahoarder by /u/Ciapekq on 2025-07-11 22:27:59.

.

6
 
 
The original post: /r/datahoarder by /u/jaycenprogress on 2025-07-11 20:52:03.

It is Amazon Prime Day. I have the Synology in my cart for $354 and two 16TB IronWolf Pro HDDs at $269 each. I'd like advice on whether or not to pull the trigger. I'd love it if all my data scattered across multiple 1TB microSD cards were just a central library in one place. I have around 10TB of data that isn't backed up, but I would make sure to back it up to cloud services once I decide whether or not to purchase the NAS. I have lots of media files ranging from movies, TV, and anime to photos and music. Has anyone ever regretted buying one? Did it improve data hoarding for you?

7
 
 
The original post: /r/datahoarder by /u/DiodeInc on 2025-07-11 20:46:05.
8
 
 
The original post: /r/datahoarder by /u/tryingtobecheeky on 2025-07-11 20:24:04.

I'm just genuinely curious if many people are gathering their data on paper.

9
 
 
The original post: /r/datahoarder by /u/abyssea on 2025-07-11 20:23:46.

One of the disks in my Unraid server is giving off these types of errors, meaning that portion of the drive is not accessible:

Jul 9 21:13:30 Tower kernel: sd 1:0:16:0: [sdr] tag#239 UNKNOWN(0x2003) Result: hostbyte=0x00 driverbyte=DRIVER_OK cmd_age=7s

Jul 9 21:13:30 Tower kernel: sd 1:0:16:0: [sdr] tag#239 Sense Key : 0x3 [current] [descriptor]

Jul 9 21:13:30 Tower kernel: sd 1:0:16:0: [sdr] tag#239 ASC=0x11 ASCQ=0x0

Jul 9 21:13:30 Tower kernel: sd 1:0:16:0: [sdr] tag#239 CDB: opcode=0x88 88 00 00 00 00 03 7f 2f 26 10 00 00 02 00 00 00

Jul 9 21:13:30 Tower kernel: critical medium error, dev sdr, sector 15018698736 op 0x0:(READ) flags 0x0 phys_seg 4 prio class 0

DiskSpeed reports back with:

Temperature Celsius: 26
Raw Read Error Rate: 0
Spin Up Time: 9833
Start Stop Count: 903
Reallocated Sector Ct: 0
Seek Error Rate: 0
Power On Hours: 27295 [3 Years, 42 Days, 7 hours]
Spin Retry Count: 0
Calibration Retry Count: 0
Power Cycle Count: 20
Power-Off Retract Count: 17
Load Cycle Count: 897
Reallocated Event Count: 0
Current Pending Sector: 0
Offline Uncorrectable: 0
UDMA CRC Error Count: 0
Multi Zone Error Rate: 1094

I know some data systems will mark bad sectors and avoid them, meaning less of the drive is usable but the drive isn't dead in the water. I've moved all the data from the drive onto another drive and performed an extended SMART test with Unraid, which came back without any issues.

Device Statistics (GP Log 0x04)

Page  Offset  Size          Value  Description
0x01  =====   =                 =  == General Statistics (rev 1) ==
0x01  0x008   4                20  Lifetime Power-On Resets
0x01  0x010   4             27291  Power-on Hours
0x01  0x018   6       40039298553  Logical Sectors Written
0x01  0x020   6          42197995  Number of Write Commands
0x01  0x028   6      296149816379  Logical Sectors Read
0x01  0x030   6         417669825  Number of Read Commands
0x01  0x038   6        3758319488  Date and Time TimeStamp
0x03  =====   =                 =  == Rotating Media Statistics (rev 1) ==
0x03  0x008   4             21972  Spindle Motor Power-on Hours
0x03  0x010   4             21933  Head Flying Hours
0x03  0x018   4               914  Head Load Events
0x03  0x020   4                 0  Number of Reallocated Logical Sectors
0x03  0x028   4             48008  Read Recovery Attempts
0x03  0x030   4                 0  Number of Mechanical Start Failures
0x03  0x038   4                 8  Number of Realloc. Candidate Logical Sectors
0x03  0x040   4                17  Number of High Priority Unload Events
0x04  =====   =                 =  == General Errors Statistics (rev 1) ==
0x04  0x008   4                 7  Number of Reported Uncorrectable Errors
0x04  0x010   4                 0  Resets Between Cmd Acceptance and Completion
0x05  =====   =                 =  == Temperature Statistics (rev 1) ==
0x05  0x008   1                34  Current Temperature
0x05  0x010   1                29  Average Short Term Temperature
0x05  0x018   1                24  Average Long Term Temperature
0x05  0x020   1                45  Highest Temperature
0x05  0x028   1                15  Lowest Temperature
0x05  0x030   1                40  Highest Average Short Term Temperature
0x05  0x038   1                18  Lowest Average Short Term Temperature
0x05  0x040   1                32  Highest Average Long Term Temperature
0x05  0x048   1                22  Lowest Average Long Term Temperature
0x05  0x050   4                 0  Time in Over-Temperature
0x05  0x058   1                65  Specified Maximum Operating Temperature
0x05  0x060   4                 0  Time in Under-Temperature
0x05  0x068   1                 0  Specified Minimum Operating Temperature
0x06  =====   =                 =  == Transport Statistics (rev 1) ==
0x06  0x008   4                35  Number of Hardware Resets
0x06  0x010   4                 0  Number of ASR Events
0x06  0x018   4                 0  Number of Interface CRC Errors

And

SATA Phy Event Counters (GP Log 0x11)

ID Size Value Description

0x0001 2 0 Command failed due to ICRC error

0x0002 2 0 R_ERR response for data FIS

0x0003 2 0 R_ERR response for device-to-host data FIS

0x0004 2 0 R_ERR response for host-to-device data FIS

0x0005 2 0 R_ERR response for non-data FIS

0x0006 2 0 R_ERR response for device-to-host non-data FIS

0x0007 2 0 R_ERR response for host-to-device non-data FIS

0x0008 2 0 Device-to-host non-data FIS retries

0x0009 2 0 Transition from drive PhyRdy to drive PhyNRdy

0x000a 2 1 Device-to-host register FISes sent due to a COMRESET

0x000b 2 0 CRC errors within host-to-device FIS

0x000d 2 0 Non-CRC errors within host-to-device FIS

0x000f 2 0 R_ERR response for host-to-device data FIS, CRC

0x0012 2 0 R_ERR response for host-to-device non-data FIS, CRC

0x8000 4 908014 Vendor specific

Because the array is reporting 288 errors from the device, I'm not sure if the drive should be replaced, considering the other results. Looking for advice, thanks.
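
A possible next step, offered tentatively (the device name and LBA are taken from the kernel log above, so double-check them, and remember a pending sector is normally only reallocated when it is next written): re-read the reported sector directly, then re-check the SMART counters.

# Try to read the exact LBA from the log; an I/O error here means the sector is still unreadable
hdparm --read-sector 15018698736 /dev/sdr

# Full SMART / device-statistics dump to watch pending and reallocated counts afterwards
smartctl -x /dev/sdr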

10
 
 
The original post: /r/datahoarder by /u/CarcajadaArtificial on 2025-07-11 20:03:57.

Hello, I just started using ArchiveBox to store local copies of my bookmarks and articles. I frequently save two different pages from the same site that contain the same images, and of course it would be better not to keep these kinds of duplicates. I assume this is a relatively common concern, but I couldn't find anything about it in the docs. I also assume that not all download formats handle this situation the same way; I was using SingleFile, which I suddenly realized is probably not well optimized for this. What would be your recommendation?

Thank you
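
One rough way to gauge how much duplication is actually there is to hash the archived files and look for repeated hashes; note this only catches whole duplicate files (assets saved separately by other extractors), not images that SingleFile has inlined into its single HTML output. A sketch, with the archive path being an assumption about the setup:

# List groups of files inside the archive that share an identical SHA-256 hash
find ~/archivebox/archive -type f -print0 | xargs -0 sha256sum | sort | uniq -w64 --all-repeated=separate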

11
 
 
The original post: /r/datahoarder by /u/SmokingHensADAN on 2025-07-11 18:14:38.

Hey guys, I need to give a quote for 500x 5TB hard drives. I am already in the process of getting reseller status with a large manufacturer. This is falling into my lap, and I'm in a different industry. What are my options for getting that many hard drives at the best price? I know some, but I would prefer to act like I'm clueless. I'm sure I'll learn some things.

12
 
 
The original post: /r/datahoarder by /u/Even-Mechanic-7182 on 2025-07-11 13:10:32.
13
 
 
The original post: /r/datahoarder by /u/manzurfahim on 2025-07-11 09:43:52.
14
 
 
The original post: /r/datahoarder by /u/cyrbevos on 2025-07-11 09:40:40.

After 10+ years of data hoarding (currently sitting on ~80TB across multiple systems), had a wake-up call about backup encryption key protection that might interest this community.

The Problem: Most of us encrypt our backup drives - whether it's borg/restic repositories, encrypted external drives, or cloud backups. But we're creating a single point of failure with the encryption keys/passphrases. Lose that key = lose everything. House fire, hardware wallet failure, forgotten password location = decades of collected data gone forever.


Context: My Data Hoarding Setup

What I'm protecting:

  • 25TB Borg repository (daily backups going back 8 years)
  • 15TB of media archives (family photos/videos, rare documentaries, music)
  • 20TB miscellaneous data hoard (software archives, technical documentation, research papers)
  • 18TB cloud backup encrypted with duplicity
  • Multiple encrypted external drives for offsite storage

The encryption key problem: Each repository is protected by a strong passphrase, but those passphrases were stored in a password manager + written on paper in a fire safe. Single points of failure everywhere.

Mathematical Solution: Shamir's Secret Sharing

Our team built a tool that mathematically splits encryption keys so you need K out of N pieces to reconstruct them, but fewer pieces reveal nothing:

bash
# Split your borg repo passphrase into 5 pieces, need any 3 to recover
fractum encrypt borg-repo-passphrase.txt --threshold 3 --shares 5 --label "borg-main"

# Same for other critical passphrases
fractum encrypt duplicity-key.txt --threshold 3 --shares 5 --label "cloud-backup"
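
For readers who want to experiment with the same K-of-N idea using a commonly packaged tool, plain Shamir splitting can also be sketched with ssss (this is not the author's fractum; it only does the share math and does not bundle recovery software with each share; the passphrase below is a placeholder):

# Split a passphrase into 5 shares, any 3 of which reconstruct it
echo -n "correct horse battery staple" | ssss-split -t 3 -n 5 -w borg-main

# Recover by pasting any 3 shares when prompted
ssss-combine -t 3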

Why this matters for data hoarders:

  • Disaster resilience: House fire destroys your safe + computer, but shares stored with family/friends/bank let you recover
  • No single point of failure: Can't lose access because one storage location fails
  • Inheritance planning: Family can pool shares to access your data collection after you're gone
  • Geographic distribution: Spread shares across different locations/people

Real-World Data Hoarder Scenarios

Scenario 1: The Borg Repository Your 25TB borg repository spans 8 years of incremental backups. Passphrase gets corrupted on your password manager + house fire destroys the paper backup = everything gone.

With secret sharing: Passphrase split across 5 locations (bank safe, family members, cloud storage, work, attorney). Need any 3 to recover. Fire only affects 1-2 locations.

Scenario 2: The Media Archive Decades of family photos/videos on encrypted drives. You forget where you wrote down the LUKS passphrase, main storage fails.

With secret sharing: Drive encryption key split so family members can coordinate recovery even if you're not available.

Scenario 3: The Cloud Backup Your duplicity-encrypted cloud backup protects everything, but the encryption key is only in one place. Lose it = lose access to cloud copies of your entire hoard.

With secret sharing: Cloud backup key distributed so you can always recover, even if primary systems fail.

Implementation for Data Hoarders

What gets protected:

  • Borg/restic repository passphrases
  • LUKS/BitLocker volume keys for archive drives
  • Cloud backup encryption keys (rclone crypt, duplicity, etc.)
  • Password manager master passwords/recovery keys
  • Any other "master keys" that protect your data hoard

Distribution strategy for hoarders:

bash
# Example: 3-of-5 scheme for main backup key
# Share 1: Bank safety deposit box
# Share 2: Parents/family in different state  
# Share 3: Best friend (encrypted USB)
# Share 4: Work safe/locker
# Share 5: Attorney/professional storage

Each share is self-contained - includes the recovery software, so even if GitHub disappears, you can still decrypt your data.

Technical Details

Pure Python implementation:

  • Runs completely offline (air-gapped security)
  • No network dependencies during key operations
  • Cross-platform (Windows/macOS/Linux)
  • Uses industry-standard AES-256-GCM + Shamir's Secret Sharing

Memory protection:

  • Secure deletion of sensitive data from RAM
  • No temporary files containing keys
  • Designed for paranoid security requirements

File support:

  • Protects any file type/size
  • Works with text files containing passphrases
  • Can encrypt entire keyfiles, recovery seeds, etc.

Questions for r/DataHoarder:

  1. Backup strategies: How do you currently protect your backup encryption keys?
  2. Long-term thinking: What's your plan if you're not available and family needs to access archives?
  3. Geographic distribution: Anyone else worry about correlated failures (natural disasters, etc.)?
  4. Other use cases: What other "single point of failure" problems do data hoarders face?

Why I'm Sharing This

We almost lost access to 8 years of borg backups when our main password manager got corrupted and we couldn't remember where we'd written the paper backup. We spent a terrifying week trying to recover it.

Realized that as data hoarders, we spend so much effort on redundant storage but often ignore redundant access to that storage. Mathematical secret sharing fixes this gap.

The tool is open source because losing decades of collected data is a problem too important to depend on any company staying in business.

As a sysadmin/SRE who manages backup systems professionally, I've seen too many cases where people lose access to years of data because of encryption key failures. Figured this community would appreciate a solution our team built that addresses the "single point of failure" problem with backup encryption keys.


Context: What I've Seen in Backup Management

Professional experience with backup failures:

  • Companies losing access to encrypted backup repositories when key custodian leaves
  • Families unable to access deceased relative's encrypted photo/video collections
  • Data recovery scenarios where encryption keys were the missing piece
  • Personal friends who lost decades of digital memories due to forgotten passphrases

Common data hoarder setups I've helped with:

  • Large borg/restic repositories (10-100TB+)
  • Encrypted external drive collections
  • Cloud backup encryption keys (duplicity, rclone crypt)
  • Media archives with LUKS/BitLocker encryption
  • Password manager master passwords protecting everything else



Content cut off. Read original on https://old.reddit.com/r/DataHoarder/comments/1lx2my0/protecting_backup_encryption_keys_for_your_data/

15
 
 
The original post: /r/datahoarder by /u/Gamesarefun97 on 2025-07-11 04:29:58.

I have been backing up all of my CDs recently, using fre:ac to convert them to .flac files. I've encountered Enhanced Audio CDs, which contain both data and audio tracks. For my DVD collection I used DVD Decrypter to convert everything into .iso files, as I want to be able to emulate inserting the disc, but I'm not sure how to back up this type of CD. I would prefer for it to end up as something like an ISO file, but I haven't been able to find much about the best way to rip these.

Any help is greatly appreciated.

16
 
 
The original post: /r/datahoarder by /u/5meohd on 2025-07-11 00:21:45.

Hello!

I am working on a project to combine the collections of myself and a local IRL friend. Between the two of us we have over 14,000 discs. Accounting for overlapping titles, it's likely closer to 12,000.

So far I have just been testing Plex, MakeMKV, a 20TB external drive, an old 2015 MBP, and various playback devices including my Shield Pro 2019.

We have been successful with ripping and playback of our discs, including UHD discs. We are keeping everything lossless, including supplements, commentaries, etc. I'm a videophile with a nice Sony OLED and he's a film geek who actually works in the industry of disc bonus feature production, so between the two of us, we just can't budge on file size. In fact, we are most excited about the project giving us convenient access to compiling various versions and imports of the same film into one folder. So exciting!

My question for you experts:

If I'm willing to start with a budget of $2K, can I build something quality that can just be expanded every year as more funds become available? Maybe start with some kind of DIY NAS with 8 bays and PCIe expansion capabilities? I haven't built a PC since Windows 7 in 2010 and I've never built a server.

Outside of "you're in over your head, give up", I appreciate any and all thoughts or ideas!

With gratitude!

17
 
 
The original post: /r/datahoarder by /u/Intelligent_Series46 on 2025-07-10 20:27:06.
18
 
 
The original post: /r/datahoarder by /u/Phil_Goud on 2025-07-10 15:29:51.

Hi everyone !

Mostly lurker and little data hoarder here

I was fed up with the complexity of Tdarr and other software for keeping the size of my (legal) videos in check.

So I wrote what started as a small script but is now a 600-line, kind-of-turn-key solution for everyone with basic notions of bash... and an NVIDIA card.

You can find it on my GitHub. It was tested on my 12TB collection of (family) videos, so it should have patched the most common holes (and if not, I have timeout fallbacks).

Hope it will be useful to any of you ! No particular licence, do what you want with it :)

https://github.com/PhilGoud/H265-batch-encoder/

(If this is not the right subreddit, please be kind^^)
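
For anyone who only wants the core operation rather than the full script, the heart of this kind of batch encoder is one ffmpeg call per file using the NVENC HEVC encoder; a minimal sketch with illustrative quality settings (the linked script adds the batching, sanity checks, and timeout fallbacks):

# Re-encode a single file to H.265 on an NVIDIA GPU, copying audio and subtitle streams untouched
ffmpeg -i input.mkv -map 0 -c:v hevc_nvenc -preset p5 -cq 26 -c:a copy -c:s copy output.mkv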

19
 
 
The original post: /r/datahoarder by /u/tinpanalleypics on 2025-07-09 00:42:30.

I remember years ago I'd buy an external HDD and, when I needed it internal, I would just strip the case. Can you do that with external 2.5" SATA SSDs? I need a 2TB (ideally 4TB) drive to travel around with for a few months, and I was just going to put it in a USB 3.0 enclosure I have. But I'm seeing such great deals on externals today. I just don't want to get stuck only having it as an external in the future.

And an add-on question: does it really matter anymore what brand of 2.5" SATA SSD one buys? Must it be Samsung?

Thank you!

20
 
 
The original post: /r/datahoarder by /u/SonicAwareness on 2025-07-09 00:38:56.

I need to add every video in a YouTube channel into a playlist. What's the best way to achieve this?
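
If the goal is to enumerate everything on the channel first (so the videos can then be added to a playlist by hand or via the YouTube API), yt-dlp can print every video URL without downloading anything; a sketch with a placeholder channel URL:

# Print a watch URL for every video on the channel, without downloading
yt-dlp --flat-playlist --print "https://www.youtube.com/watch?v=%(id)s" "https://www.youtube.com/@SomeChannel/videos"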

21
 
 
The original post: /r/datahoarder by /u/Eskel5 on 2025-07-08 22:59:48.

How do you guys keep up with going through storage really fast?

I love data hoarding and archiving so much, but sometimes it can get rough with how fast storage gets used! At the beginning of the year I had 40TB in my Unraid server; as of yesterday I have 94TB.

I bought three 18TB drives off and on this year from Serverpartdeals since I kept running low on space from filling them up.

Sometimes I can use up 10-15TB or so in a month on my server.

I love this hobby but damn it's an addiction but I love it....

Edit: It's the hoarding

22
 
 
The original post: /r/datahoarder by /u/blueadv1 on 2025-07-08 22:40:34.

We’ve had 2 Verbatim drives fail within the last month. Before that, multiple Kingston drives have failed us. I had a Sandisk that lasted a while but also failed after <6 months of usage. The drives are not abused in any way. They go back into a drawer after every backup.

We need something reliable going forward! We’ve been getting recommendations from the employees at Staples/Best Buy, but obviously they don’t know what they’re talking about.

We back up <1GB of data twice a day, every day. We select the option to overwrite the destination drive with the newer version from the hard drive. Not sure if this matters, but thought I would throw it out there in case it does.

What would be the absolute best and most reliable usb drive for us to get? Please include the make, model, and ideally a link if you can. Thank you so much in advance!

Edit to add that we only have regular USB slots on the computers, not the newer USB-C type.

23
 
 
The original post: /r/datahoarder by /u/therealjoemontana on 2025-07-08 22:21:58.

I was looking at DIY NAS solutions and hear a lot of people mention Raspberry Pi and mini PC solutions, but for $20 the Onn 4K Android TV box seems like it could have potential, right? Has anyone here tried this? Any suggestions or opinions?

24
 
 
The original post: /r/datahoarder by /u/coljung on 2025-07-08 22:18:34.

Not sure this is the right place for this, but here it goes.

So, I have an EX4100 system with 2x 4TB drives, and I have them in a JBOD configuration.

I'm honestly confused by several things.

  • When I go to the homepage of my local MyCloud page, it says '3.65TB free'.
  • At the same time, when I look at the actual unit, it tells me one drive is 95% full.
  • Nowhere on the MyCloud page is there any mention of either drive being full.
  • I ran the TreeSize app and it tells me my local folder is 4TB in size.

So, I honestly don't know where to begin. I thought a JBOD configuration would spread files between both drives. How is it possible that, while I have 2x 4TB, one seems to be 95% full? And the device's local interface does not mention this anywhere; I can't see any tool that could help me solve this problem.

Maybe only one drive is being used so far? If that is the case, how can I confirm it?

Thanks!
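
For what it's worth, on WD My Cloud units JBOD normally means each disk is its own independent volume rather than one spanned pool, which would explain one disk filling up while the other stays mostly empty. If SSH access is enabled on the EX4100 (an assumption about the setup), per-volume usage can be checked directly:

# Show usage per mounted volume; in JBOD the two disks normally appear as separate mounts
df -h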

25
 
 
The original post: /r/datahoarder by /u/trytoholdon on 2025-07-08 21:46:48.

Hey all,

I want to rip my large collection of Blu-rays and store them on a large HDD connected to a Ugoos Am6b+ for lossless playback.

I found a deal for a 22 TB Seagate Expansion for $215. The model number is STKP22000400 and from what I’ve read online, it’s a bit of a crapshoot as to which drive you’ll find inside, but most are Barracudas.

Should I go with that or with a recertified Exos off eBay? I see some 20 TB ones for around the same price.

What worries me about the Barracuda is that the spec sheet shows it’s rated for about half the number of hours per year as the Exos, but I don’t know if that reflects reality.

Any suggestions would be appreciated!

Edit: *not shucked (at least not until I fill up the drive and move it to a NAS; for now it'll be directly connected to the Ugoos via USB)
