It's A Digital Disease!


This is a sub that aims to bring data hoarders together to share their passion with like-minded people.

founded 2 years ago
1726
 
 
The original post: /r/datahoarder by /u/WorldCitiz3n on 2025-04-25 15:08:56.

I'm super curious. I'd like to get rid of Spotify payments and keep my music on my server, but honestly I'm not sure where to get it from.

1727
 
 
The original post: /r/datahoarder by /u/LowFinal6794 on 2025-04-25 14:44:24.
1728
 
 
The original post: /r/datahoarder by /u/Simplixt on 2025-04-25 14:42:27.

Hi all,

I've the following challenge:

  • I have 2TB of photos

  • Sometimes the same photo is available as RAW, .dng (converted by Lightroom) and JPEG

  • I cannot sort by date (I was too lazy to set the camera date every time), and EXIF data is not a 100% reliable indicator either

  • The same file can exist multiple times under different file names

How can I handle this mess?

I need a tool that:

  • removes all duplicated files (identified via hash/fingerprint, independent of file name / EXIF)

  • compares pixels & EXIF and keeps the file with the highest quality

  • respects the folder structure, as this is the only way to keep images that belong together in the same place (since the date does not help)

Any idea? (software can be for MacOS, Windows or Linux)
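
For the first requirement (byte-identical files under different names), a minimal sketch in Python could look like the following. The function names are made up for illustration and SHA-256 is just one reasonable choice of fingerprint. It only reports duplicate groups and deletes nothing, and it does not address the RAW/JPEG "same picture, different quality" case, which would need a perceptual comparison instead of a content hash.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def file_digest(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    h = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def find_duplicates(root: Path) -> dict[str, list[Path]]:
    """Map each content hash to the files sharing it (groups of 2+ only)."""
    # Group by size first: a file with a unique size cannot have an identical twin,
    # so most of a large photo library never needs to be hashed at all.
    by_size = defaultdict(list)
    for p in root.rglob("*"):
        if p.is_file():
            by_size[p.stat().st_size].append(p)

    by_hash = defaultdict(list)
    for same_size in by_size.values():
        if len(same_size) < 2:
            continue
        for p in same_size:
            by_hash[file_digest(p)].append(p)

    # File names and EXIF play no role here; only the raw bytes are compared.
    return {h: sorted(ps) for h, ps in by_hash.items() if len(ps) > 1}
```

Calling `find_duplicates(Path("/path/to/photos"))` returns the groups with their full paths, so the folder structure stays visible when deciding which copy to keep.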

1729
 
 
The original post: /r/datahoarder by /u/Robert_A2D0FF on 2025-04-25 14:34:49.

I made a little script to download some podcasts, it works fine so far, but one site is using Cloudflare.

I get HTTP 403 errors on the RSS feed and the media files. It thinks I'm not a human, BUT IT'S A FUCKING PODCAST!! It's not for humans, it's meant to be downloaded automatically.

I tried some tricks with the HTTP header (copying the request that is sent by a regular browser), but it didn't work.

My phone's podcast app can handle the feed, so maybe there is some trick to get past the CDN.

Ideally there would be some parameter in the HTTP header (user agent?) or the URL to make my script look like a regular podcast app. Or a service that gives me a cached version of the feed and the media file.

Even a slow download with long waiting periods in between would not be a problem.

The podcast host is https://www.buzzsprout.com/

In case any of you wants to test something, here is one podcast with only a few episodes: https://mycatthepodcast.buzzsprout.com/, feed URL: https://feeds.buzzsprout.com/2209636.rss

1730
 
 
The original post: /r/datahoarder by /u/shorterround on 2025-04-25 13:35:36.
1731
 
 
The original post: /r/datahoarder by /u/MadDogFenby on 2025-04-25 11:44:41.
1732
 
 
The original post: /r/datahoarder by /u/WorldEnd2024 on 2025-04-25 11:23:28.
1733
 
 
The original post: /r/datahoarder by /u/Itchy-Individual3536 on 2025-04-25 10:33:48.

TL;DR: I inherited a sh*tload of DVDs with bad quality TV recordings and need to get rid of them. Also: VOB or MP4 to keep?

I don't really know what I expect from this post and what I need from you, maybe a strategy, shared experiences, or just the absolution of the data hoarding community to let go.

When my dad passed away in 2023, he left behind a lot of German TV recordings. There are two batches of DVDs:

The first batch is organized by main genre (e.g. fantasy, animation, thriller), though the discs most often contain videos of other genres too, to make the most of each DVD's space (e.g., an action movie DVD would end with two episodes of a kids' show because there wasn't enough space left for another full-length movie).

These were also stored on external HDDs (I think it was ~12 TB), mostly in VOB as well as MP4 format. This first batch makes up a couple hundred DVDs.

The second (more recent) batch consists of six thousand DVDs (all in VOB format) that are just numbered and not stored on any HDD.

He had an Excel file listing all the contents with some metadata, and according to this table the two batches add up to almost four years of continuous (24/7) watch time with more than 30,000 entries (i.e. individual movies, shows etc.). Physically, the DVDs would take up the space of a wardrobe (the majority is in 100-disc cake boxes, the by-genre ones in slim cases, so not much room to downsize it by repackaging).

Everything that was on the HDDs (except one that failed to read) I copied to my NAS, and with a duplicate finder I could already eliminate a couple of TB, but it's still too much to keep.

I have a tendency for collecting/hoarding data myself (guess who I got that from), but I also realize that trying to keep and organize all of it will lead nowhere. I need to at least get rid of the physical disks rather soon, because unlike my parents I'm living in a rather small apartment with no storage room whatsoever, and the cardboard boxes with DVDs are stacked in the living room right now, which especially for my girlfriend isn't acceptable in the long term (and I agree with her, because I don't see myself touching the boxes in the next decades once we accept them staying).

I have my problems with just throwing it away:

  • There might be that one movie/TV show I always wanted to rewatch but is too old and/or obscure to be found anywhere to stream or to buy.
  • I have no list of movies/shows I'm missing in my collection, so I can't just do a quick search and match against the Excel file to get the interesting ones; rather, I might one day remember an obscure show from the past and find out that it had been in my father's collection.
  • Some TV recordings that are not just movies you find everywhere might be an interesting piece of history (at least to me) or nostalgia someday, like a political comedy from the 2000s or even a commercial break that's in the recording (though they're mostly cut out I think).
  • Seeing how much time I would need to even just go through all of the DVDs "quickly", I can only guess how many years of his life my father put into his hobby (of course I noticed that he always seemed to be recording or editing stuff when I visited, but I only learned now that it accumulates to such an amount of data) - probably also thinking he would do it for the future generations rather than for himself. It feels like this is his legacy or lifetime achievement and I need to respect it and treat it as such.
  • And well, to quote this subreddit's header: "What do you mean DELETE?!" - It just feels wrong.

On the other hand:

  • All videos seem to be in 352x288 resolution (the MP4 versions even in 320x240), which is really blocky; text cannot be read at that resolution. I'm at a loss why he thought that would be acceptable quality... that said, I'm fine with watching a blocky video if it's the only version of something I cannot find anywhere.
  • Much of it is utter trash I know I would never watch. Like unimportant sports games, concerts, almost every episode of a weekly stand-up comedy show or TV crime film, sentimental romantic TV productions... even my parents never watched the latter, nor anyone in my family, I really don't know for whom he recorded that.
  • Much of the rest I assume is mediocre at best (e.g. movies produced by German TV stations), and if not, can be found on Netflix or elsewhere (except maybe for some movies from the 70s or 80s that weren't blockbusters then).
  • It feels like a big burden to have to go through this in detail because I wouldn't know where to start. It's affecting my mental health having to deal with it and seeing the boxes every day.
  • Seeing how early my parents passed away, I'm thinking about the shortness of life a lot since then, and that I should use my time for more fruitful things than that.

I guess I will now go through the list of 30k entries as quickly as possible, especially for the numbered DVDs, and only if by chance I see a title I'm interested in will I fetch the respective DVD from the boxes and copy that one file. Everything else, including titles I haven't heard of, I will just throw away.

So yeah, just putting that out there with no real question to you.

Or well, one very concrete question I have: For the videos from the HDDs I already know I want to keep, would you choose to keep the VOB (as said, 352x288) or MP4 (320x240) version, or do another (probably always lossy?) conversion from VOB to MP4? The resolution difference is in some cases noticeable, but VOB I think is sometimes not as well-supported by media players (e.g. in VLC player, sometimes the wrong total time is shown) and long movies are cut into separate parts at the 1GB mark with VOB, which is rather annoying e.g. when I load them into Jellyfin and they show up as two movie versions there. Any other considerations?
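
On the split-VOB annoyance: one option that avoids another lossy conversion is to join the parts and copy the streams into a single file without re-encoding. Below is a minimal sketch that only builds the ffmpeg command line. The helper name and the file names are hypothetical, and whether these particular recordings remux cleanly (timestamps, audio tracks) is an assumption that would need testing on a few discs first.

```python
from pathlib import Path


def build_remux_command(parts: list[Path], output: Path) -> list[str]:
    """Build an ffmpeg command that joins VOB parts and copies streams as-is."""
    if not parts:
        raise ValueError("need at least one VOB part")
    # ffmpeg's concat protocol reads the listed files back to back as one input.
    source = "concat:" + "|".join(str(p) for p in parts)
    return [
        "ffmpeg",
        "-fflags", "+genpts",  # regenerate presentation timestamps if missing
        "-i", source,
        "-map", "0:v",         # keep the video stream(s)
        "-map", "0:a?",        # keep audio stream(s) if present
        "-c", "copy",          # stream copy: no re-encode, so no extra quality loss
        str(output),
    ]


# Hypothetical file names for a movie split at the 1GB mark.
cmd = build_remux_command(
    [Path("VTS_01_1.VOB"), Path("VTS_01_2.VOB")], Path("movie.mkv")
)
print(" ".join(cmd))
```

Passing the resulting list to `subprocess.run(cmd, check=True)` would run it. The output keeps the original 352x288 MPEG-2 video untouched, just in one file that a media server can treat as a single movie.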

1734
 
 
The original post: /r/datahoarder by /u/Plebius-Maximus on 2025-04-25 10:02:34.

Has anyone purchased used HDDs from CEX? They have some OK prices and have recently upped their warranty on everything bar consumables to 5 years.

For my new NAS I've got a couple of new drives in RAID 1 and was considering buying a couple of used NAS/enterprise ones to fill out the additional bays.

1735
 
 
The original post: /r/datahoarder by /u/TheUnknownOne315 on 2025-04-25 09:20:15.

Original Title: Do you think this is a scam? I was hoping to get something close to the $70 Amazon price for a 2TB HDD, but without the $50 shipping fee. I know the classic scam (like "16TB" drives for €16), but here it's more like half price, so it got me thinking: is it a really good deal or a more subtle scam?

1736
 
 
The original post: /r/datahoarder by /u/InTrust3 on 2025-04-25 09:09:33.

I am looking for a new 8-12TB HDD for my NAS, which stands in my living room.

At first I was looking for new non-SMR HDDs, since that's what everyone suggested. But I don't really care if the HDD dies some day. It's just for movies and TV shows for Plex/Jellyfin, which I can get again if the drive fails.

I already have a 4TB IronWolf, which is nice but was expensive.

All I want is a cheap, silent drive. Any suggestions? Is there only IronWolf and WD Red, or are there cheaper options? Recertified would also be fine, I guess?

1737
 
 
The original post: /r/datahoarder by /u/appwizcpl on 2025-04-25 09:05:26.

https://www.amazon.de/-/en/Docking-Station-Offline-Tool-Free-DD28C3-C-black/dp/B0C2GV7BWD

There is also a 3.0 variant which is not USB, for a bit cheaper on Ali, but I guess it doesn't matter in terms of speed. How is the chip inside them?

1738
 
 
The original post: /r/datahoarder by /u/manosvk on 2025-04-24 05:55:10.

Hi everyone,

I'm building a budget-conscious storage and virtualization setup and can’t decide between two paths. I’m running a business focused on client backups (small businesses to medium offices), so my focus is capacity and reliability, not raw performance.

Here’s my current situation:

  • I own a Cisco C220 M5 server (plenty of CPU/RAM headroom, dual CPU capable)
  • Already have one Samsung PM1643 (SAS SSD) for cache
  • I plan to use TrueNAS (Core or Scale) and run about 4 lightweight VMs (Proxmox)
  • I want to use 3.5" SATA drives (like Seagate IronWolf 22TB) for large backup storage
  • The setup will be behind a firewall, WAN-accessible only through VPN for remote clients

Now, I have an offer for a NetApp DS4243 with IOM3, cables, and Dell H200e HBA, all for ~$200

My dilemma:

  1. Should I go for this NetApp shelf (cheap, but old tech – SAS-1, 3Gbps limit)?
  2. Or should I skip it and just build a custom NAS box with a proper case supporting 3.5" disks (e.g. Fractal Define, Node 804, etc.) and re-use CPU/RAM from other hardware?
  3. Alternatively, should I look into other used server options (Dell, HP, Supermicro) that support more 3.5” bays natively?

My goals:

  • ZFS-based TrueNAS storage for backups (with some caching if needed)
  • Quiet and reasonably power-efficient
  • Expandable long-term to 60–100TB of usable capacity
  • Budget is a factor, but I can invest smartly

What would you do?

Is it still worth using these old NetApp shelves in 2025, or should I go a different route?

Thanks in advance for any advice or shared experience – I really appreciate the help from this awesome community!

1739
 
 
The original post: /r/datahoarder by /u/local-host on 2025-04-24 05:37:56.
1740
 
 
The original post: /r/datahoarder by /u/Conscious-Rope7515 on 2025-04-25 07:43:30.

I have 100 or so VHS tapes to copy into a digital archive on PC. I also have a DVD recorder with hard drive (Panasonic DMR-EX769). This is a machine which has built-in capabilities for copying from a VCR, and the results I have achieved so far (by recording to the machine's HDD and then burning a DVD) are pretty good. Specifically, there don't seem to be any TBC issues arising. However, obviously, I then get a DVD, which I then need to copy to my PC.

This is a cumbersome way of going about copying 100 tapes. I'm happy to carry on doing it, if that is the best way of getting good copies of my VHS material within the limitations of the equipment I have. However, if I am going to get the same results - or possibly better - by using a decent video capture card and bypassing the burning-a-DVD stage, I'd like to go down that route. I cannot, however, afford a dedicated TBC and am well aware of the potential issues around TBC.

So, my specific questions are -

  1. Generally, is it likely that I will achieve a better result with a VCC than by burning DVDs, given that I can't afford a dedicated TBC?
  2. Is there anything in the burn to DVD > copy to PC workflow that intrinsically degrades the eventual result below the result you could theoretically achieve going direct to PC (assuming no dedicated TBC)?
  3. Does anybody have any info on whether using the DMR-EX769 as a passthrough helps with TBC in the way an ES10 or ES15 is supposed to?

Many TIA. If it makes any difference, I'm in the UK and so this is a PAL setup.

1741
 
 
The original post: /r/datahoarder by /u/mssing-the-table on 2025-04-25 07:36:56.

Just inherited a broken server (the previous admin retired - leaving no documentation). Upon analysis it seems the OS Debian bookworm was installed on a hardware RAID1 (using MegaRAID 9560-8i 4GB). Root partition had no backup/clone elsewhere. Data is stored on different disks RAID6 and is healthy.

  • Since I have always used software RAID and ZFS, I have no idea how to approach hardware RAID.
  • Is it possible to run fsck or some other tool to revive or clone the root file system? We need the user accounts, i.e. /etc/{shadow,passwd} etc. for all the users. Nothing else is needed.
  • The following is the output from storcli:

DG/VD  TYPE   State  Access  Consist  Cache  Cac  sCC  Size        Name
1/238  RAID1  OfLn   RW      No       RWBD   -    ON   446.625 GB

EID:Slt  DID  State   DG  Size        Intf  Med  SED  PI  SeSz  Model                       Sp  Type
:12      12   Offln   1   446.625 GB  SATA  SSD  Y    N   512B  SAMSUNG MZ7L3480HCHQ-00A07  U   -
:13      13   Failed  1   446.625 GB  SATA  SSD  Y    N   512B  SAMSUNG MZ7L3480            U   -

I am aware this is not r/techsupport, but I know people here have significant skills with RAID etc.

I am grateful for any suggestions.

1742
 
 
The original post: /r/datahoarder by /u/nghtr on 2025-04-25 05:40:10.

Not sure if this is just bad luck or a bigger issue, but I figured I’d post in case it helps someone else.

I recently picked up a brand new Sabrent Thunderbolt 3 NVMe enclosure and installed a WD SN750 Gen 3 (4TB) in it. Everything seemed fine — good speeds, temps were okay, no immediate red flags.

But after plugging it in a few times, the SSD just… died. Completely. It’s no longer detected on any system, no response at all. I’ve tried several different methods to revive it, but no luck — it’s just dead.

The drive was working perfectly before — I had about 2TB of data on it that’s probably gone for good now. Super frustrating, and honestly kind of scary that an enclosure could brick a drive like that.

Just wanted to put this out there as a warning. Be careful if you’re planning to use one of these enclosures with a high-capacity NVMe drive.

1743
 
 
The original post: /r/datahoarder by /u/productiveaccount3 on 2025-04-25 05:05:05.

I'm trying to add some features to my server and I'm getting a little scared that I don't have any sort of version control. Do any of you have a good methodology for version control, be it OS snapshots or whatever? All I know is that I can't "git add ." my entire OS, and that's basically all I know how to do, honestly.

1744
 
 
The original post: /r/datahoarder by /u/GoupherWood on 2025-04-25 02:56:43.
1745
 
 
The original post: /r/datahoarder by /u/incrediblediy on 2025-04-25 02:51:09.

I was backing up some of my photos when I accidentally burned a 50 GB BD-R with CDFS. I made an ISO with "BurnAware" before burning it. It was my error not to check when I made the ISO, as I selected BD DL 50 as the format.

I can't remember when I last used CDFS for a disc; it was probably about 20 years ago. The disc verified correctly and works well. Would it matter later? I am thinking about error correction capabilities, etc.

Disc ID: VERBAT-IMf-000 (HTL). Is this tier 1?

1746
 
 
The original post: /r/datahoarder by /u/IAmARobot on 2025-04-25 02:45:01.

Intraday doesn't matter but that'd be a bonus. Actually while I've got you here do you folks have/know of any good sources for bulk (historical) weather data?

1747
 
 
The original post: /r/datahoarder by /u/YoiMono87 on 2025-04-25 02:15:23.
1748
 
 
The original post: /r/datahoarder by /u/ferminolaiz on 2025-04-25 01:39:37.

So here's the situation: due to electricity costs in my area I'm going to downsize my home server and go from a ~24TB usable pool (raidz2: 6*4TB + raidz2: 6*2TB) to a 16TB usable (raidz2: 6*4TB). All with ZFS.

I mistakenly assumed I could shrink a ZFS pool (I've been following the raidz expansion feature for a while and I must've misunderstood one of the old video presentations), and now I need to create a new pool with the disks I'm already using.

I'm currently using around 6TiB and I have a decent internet connection (currently 300/300 Mbps symmetric, could bump it to 1 Gbps), so my plan is to find a provider to upload everything to, recreate the pool in the new server and then download everything.

In the best case scenario (saturating 1 Gbps) it should take less than 2 days (round trip). In the worst case (not saturating 300 Mbps, only getting 100 Mbps), the whole ordeal would take around 2 weeks.
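
Those estimates can be checked with a few lines of arithmetic. This is a rough sketch that assumes the link is fully saturated at the stated rate and ignores protocol overhead and provider-side throttling, so real transfers will be somewhat slower. The function name is made up for illustration.

```python
def transfer_days(size_tib: float, link_mbps: float, round_trip: bool = True) -> float:
    """Days needed to move size_tib tebibytes over a link of link_mbps megabits/s."""
    bits = size_tib * 2**40 * 8              # TiB -> bytes -> bits
    seconds = bits / (link_mbps * 1_000_000)
    if round_trip:
        seconds *= 2                         # upload everything, then download it again
    return seconds / 86_400                  # seconds per day


for label, mbps in [("1 Gbps", 1000), ("300 Mbps", 300), ("100 Mbps", 100)]:
    print(f"{label}: {transfer_days(6, mbps):.1f} days round trip")
# 1 Gbps: 1.2 days round trip
# 300 Mbps: 4.1 days round trip
# 100 Mbps: 12.2 days round trip
```

So 6TiB comes out at a bit over a day at 1 Gbps and roughly 12 days at 100 Mbps, consistent with the "less than 2 days" and "around 2 weeks" figures.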

I have used Backblaze and Jottacloud in the past, and although I don't remember the upload speeds for Backblaze, Jottacloud is definitely out of the question.

One option is going for DigitalOcean/Vultr or another big provider. They are more expensive, but I'll have complete control and can be sure of a decent uplink, and I can also minimize the time I'm using them as they bill hourly.

I'm also contemplating going for a small provider I've used in the past, with whom I have a good relationship. They offer some KVM boxes at around 7USD/TB.

Anyways, are there any providers you guys would vouch for?

Kind regards and thank you all! This subreddit has been a good source of info in the past :)

1749
 
 
The original post: /r/datahoarder by /u/CyberSkooma on 2025-04-25 01:16:00.

I would really like to download every available game on MyAbandonware from the years 1965 all the way up to 1999. I see a time coming up where I will not have internet access for a long time, and I want to have plenty of stuff I can play without using the insane amount of space that modern games would take up if I decided to download most of my Steam library. Is there an efficient or smarter way for me to do this? Or do I have a very long road ahead of me clicking on all of these individually?

1750
 
 
The original post: /r/datahoarder by /u/burnthew1tchh on 2025-04-25 00:41:12.

Pretty new to setups but have learned a lot of ZFS stuff. We have a Storinator XL60 Enhanced with an X10SRL-F motherboard. My boss wants to see if we can add caching, and we have 4 M.2 2280 Gen4x4 NVMe SSDs. But I'm not sure how I will go about adding them.

Right now I have:

4 PCIe 3.0 x8 slots are occupied by SAS HBAs (I assume this is for our HDDs)

The plan of my senior (who has since quit) was to:

Remove 2 HBAs. Add a SAS expander, maybe? Use the 2 remaining HBA cards to support all 60 drives through the expander.

This will free 2 PCIe slots for a quad-M.2 PCIe adapter.

But I'm not sure how that will affect our
