Like, just some tool where I select all the directories, it runs a checksum against everything, and then it tells me which files match.
The Windows tool Beyond Compare is exceptional for parsing huge directory trees and highlighting the conflicts, letting you drill into the file content and inspect line-level changes (assuming the content is understood; it does OK with images and documents, but it's certainly better with text and code/scripts).
Beyond Compare is also available on Linux
Is it?! That’s fantastic!!
There’s a Linux tool called cmp that compares two files byte for byte. On my distro it’s part of the diffutils package, which is required and installed by default.
Its better-known sibling is diff, which is used for finding differences between source code files, or any other text files for that matter.
You could build something fairly quickly that wrapped cmp and a list of files.
Alternatively you could look for a duplicate file detector, but then, those generally only pick up on the duplicates and won’t show non-matching files. You’d be blind to the changed ones unless you already knew where they were supposed to be.
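A wrapper like that could be sketched roughly as follows. This is only a minimal illustration in Python rather than a shell script around cmp: it uses the standard library's filecmp.cmp with shallow=False to get a byte-for-byte comparison, and the function name compare_trees and its three-list return value are made up for this example. Unlike a duplicate detector, it also reports the files that changed or are missing.

```python
import filecmp
import os


def compare_trees(left, right):
    """Compare every file under `left` with the same relative path under `right`.

    Returns three sorted lists of relative paths:
    identical, different, and missing (present in `left` only).
    The name and return shape are assumptions made for this sketch.
    """
    identical, different, missing = [], [], []
    for root, _dirs, files in os.walk(left):
        for name in files:
            left_path = os.path.join(root, name)
            rel = os.path.relpath(left_path, left)
            right_path = os.path.join(right, rel)
            if not os.path.isfile(right_path):
                missing.append(rel)
            # shallow=False makes filecmp read the contents instead of
            # trusting size and modification time, much like cmp does.
            elif filecmp.cmp(left_path, right_path, shallow=False):
                identical.append(rel)
            else:
                different.append(rel)
    return sorted(identical), sorted(different), sorted(missing)
```

Calling compare_trees("/path/to/a", "/path/to/b") (placeholder paths) only walks the first tree, so files that exist only in the second directory would need a second call with the arguments swapped.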
Also be aware that on modern filesystems, there’s such a thing as a hard link, where two or more filenames point at the same data on the disk. Those names will always compare as being the same because they literally are the same file. Some filesystems can also de-duplicate automatically by sharing the underlying data between files they detect as identical, though that is usually done with block-level sharing (reflinks) rather than actual hard links.
You might be able to leverage that as well, depending on what you need.
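If you want to know whether two names are hard links to the same data before bothering to compare them, one way is to look at the device and inode numbers. The snippet below is a small sketch of that idea in Python; same_underlying_file and hard_link_groups are names invented for this example. Note that os.stat follows symbolic links, so a symlink and its target will also be reported as the same file.

```python
import os


def same_underlying_file(path_a, path_b):
    """Return True if both names refer to the same inode on the same device."""
    a, b = os.stat(path_a), os.stat(path_b)
    return (a.st_dev, a.st_ino) == (b.st_dev, b.st_ino)


def hard_link_groups(paths):
    """Group the given paths by (device, inode); keep groups with 2+ names."""
    groups = {}
    for path in paths:
        st = os.stat(path)
        groups.setdefault((st.st_dev, st.st_ino), []).append(path)
    return [names for names in groups.values() if len(names) > 1]
```

Anything that ends up in the same group can be skipped when checksumming, since there is only one copy of the data to read.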
Finally, many files have various dates and times associated with them, again depending on the filesystem. The Linux stat command is aware of four of these: file birth (original creation), last access, last modification and last status change. Some or all of these may be combined depending on the underlying filesystem.
It’ll depend on what OS you’re using. On Linux you’d probably want to use sha1sum to generate a list of checksums of the files in one directory, then use it to check the other directories and it’ll tell you if any files don’t match.
You could do a dry run with rsync and see what it reports as needing an update.
I think you’d have to force it to use checksums instead of timestamps (the --checksum option), since by default it only looks at size and modification time.
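For the checksum-list idea, here is a rough Python sketch of the same workflow: build a list of SHA-1 checksums from one directory, then check another directory against it. It uses hashlib instead of calling sha1sum, and the helper names (sha1_of, build_manifest, check_against) are invented for this example, so treat it as a starting point and not a finished tool.

```python
import hashlib
import os


def sha1_of(path, chunk_size=1 << 20):
    """Return the hex SHA-1 of a file, reading it in chunks to limit memory use."""
    digest = hashlib.sha1()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(directory):
    """Map each file's path (relative to `directory`) to its checksum."""
    manifest = {}
    for root, _dirs, files in os.walk(directory):
        for name in files:
            full = os.path.join(root, name)
            manifest[os.path.relpath(full, directory)] = sha1_of(full)
    return manifest


def check_against(manifest, directory):
    """Return relative paths that are missing from `directory` or differ there."""
    mismatches = []
    for rel, expected in sorted(manifest.items()):
        candidate = os.path.join(directory, rel)
        if not os.path.isfile(candidate) or sha1_of(candidate) != expected:
            mismatches.append(rel)
    return mismatches
```

You would build the manifest once from the directory you trust and then run check_against on each of the others; an empty list means everything listed in the manifest matched.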



