this post was submitted on 07 Jun 2025
The original post: /r/datahoarder by /u/JamesRitchey on 2025-06-06 20:16:48.

This tutorial shows how to compare the contents of two folders to confirm they contain the same files, even when the filenames or folder structure differ. It works by hashing the file contents.

Steps:

  • Download Ritchey Hash Directory i2 v2. It's an open-source PHP function I made for hashing directories by treating all the files as part of the input to be hashed.
git clone https://github.com/jamesdanielmarrsritchey/ritchey_hash_directory_i2.git

  • Make a PHP script which uses this function to hash both directories' files, and compare the checksums. To do this, paste the following into "ritchey_hash_directory_i2/custom_script.php" (the file doesn't exist, so you'll need to create it).
<?php
$location = realpath(dirname(__FILE__));

$dir1 = "{$location}/temporary/Example 1"; // Change this!
$dir2 = "{$location}/temporary/Example 1"; // Change this!
$algo = 'sha3-256'; // Optionally, change this. Only select algorithms are supported by the hashing function. For most users 'sha3-256' or 'sha256' should be fine.

require_once $location . '/ritchey_hash_directory_i2_v2.php';
$checksum1 = ritchey_hash_directory_i2_v2($dir1, $algo, FALSE, NULL, TRUE);
$checksum2 = ritchey_hash_directory_i2_v2($dir2, $algo, FALSE, NULL, TRUE);
// Only compare when both calls returned a checksum string.
if (is_string($checksum1) === TRUE && is_string($checksum2) === TRUE){
    if ($checksum1 === $checksum2){
        echo "Checksums match." . PHP_EOL;
    } else {
        echo "Checksums differ." . PHP_EOL;
    }
} else {
    echo "ERROR" . PHP_EOL;
}
?>

(You might need to clean up the formatting if it doesn't paste nicely.)

  • Edit the custom PHP script to have your values for the directories to hash, and the algorithm to use. To do this, change the values of $dir1, $dir2, and $algo.

  • Make any other changes you want to the script. For example, you might want it to display the checksums.

  • Run the script.

cd ritchey_hash_directory_i2 && php custom_script.php && cd -

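  • Examine the result. The output should be either "Checksums match." or "Checksums differ.". If it prints "ERROR" instead, the function did not return a checksum string for at least one of the directories.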
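
If you want to use the result in another script (a backup job, for example), it can help to turn the printed message into an exit status. Here's a minimal sketch of that. The helper name `result_to_status` and the status numbers are my own convention for illustration, not something the PHP function provides; the three messages are the ones printed by the custom script above.

```shell
#!/bin/sh
# Hypothetical helper: map the message printed by custom_script.php
# to an exit status. The numbers are this sketch's own convention.
#   0 = directories match, 1 = directories differ, 2 = error/unexpected
result_to_status() {
    case "$1" in
        "Checksums match.")  return 0 ;;
        "Checksums differ.") return 1 ;;
        *)                   return 2 ;;
    esac
}

# Example use (assumes the repository and custom_script.php from the steps):
#   output=$(cd ritchey_hash_directory_i2 && php custom_script.php)
#   result_to_status "$output"
#   echo "exit status: $?"
```

Running the `cd` inside `$( ... )` also means your own shell never leaves its current directory, so no `cd -` is needed afterwards.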

Note:

  • The hashing function relies on checksums to decide the order in which files are added to the input when hashing, and that order affects the checksum produced. A collision between checksums could therefore disrupt the order of the input and cause an incorrect result, so it's advisable to use a strong hashing algorithm to avoid collisions.
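
To picture the general idea of ordering by checksum rather than by filename, here is a rough shell sketch. It is an assumption about the technique in general, not the code inside Ritchey Hash Directory i2 v2, and its output will not equal the checksums that function produces. It assumes `sha256sum` from GNU coreutils is installed; `content_hash` is a name made up for this example.

```shell
#!/bin/sh
# Hypothetical sketch: hash a directory's contents while ignoring
# file names and folder layout. NOT the library's implementation.
# 1. Hash every file's contents (read from stdin, so names never appear).
# 2. Keep only the checksum column.
# 3. Sort the checksums, so the order depends on contents, not paths.
# 4. Hash the sorted list to get one checksum for the whole directory.
content_hash() {
    find "$1" -type f -exec sh -c 'for f in "$@"; do sha256sum < "$f"; done' sh {} + |
        awk '{print $1}' |
        LC_ALL=C sort |
        sha256sum |
        awk '{print $1}'
}

# Example use (directory names are placeholders):
#   if [ "$(content_hash "dir A")" = "$(content_hash "dir B")" ]; then
#       echo "Checksums match."
#   else
#       echo "Checksums differ."
#   fi
```

Because the per-file checksums are what get sorted, renaming or moving a file leaves the final checksum unchanged, while adding, removing, or altering a file changes it.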

--

There are obviously other ways to do this sort of thing, so please share other programs, scripts you've made, etc. Help save the next person some work :)

EDIT: fixed post formatting
