The following will search the current dir (and subdirs) for files that have identical content and identical size, regardless of how they are named. Open a terminal, 'cd' to the dir you want to search, then type:
find . -size +20 \! -type d -exec cksum {} \; | sort | tee /tmp/f.tmp |
cut -f 1,2 -d ' ' | uniq -d | grep -hif - /tmp/f.tmp > dup.txt
[Editor's note: I inserted a carriage return for readability -- type the command on one line when entering it!]
This will produce a list of duplicate files (if any) in dup.txt. True, there are some nicely written apps that will do the same thing, but ain't it great that you can do this right from within your OS?
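For anyone who wants to see what each stage of that pipeline does, here is a rough sketch of the same idea as a shell function. The name list_dups, the mktemp scratch file, the optional block-count argument, '-type f', 'LC_ALL=C' and the sed step that anchors each pattern to the start of a line are my own additions, not part of the original command, so treat it as an illustration rather than a drop-in replacement.

```shell
#!/bin/sh
# Sketch only: the one-liner split into commented stages.
# list_dups DIR [MIN_BLOCKS] prints every "CRC SIZE PATH" line whose
# CRC/size pair occurs more than once under DIR.
# (list_dups and its arguments are hypothetical names for this example.)
list_dups() {
    dir=${1:-.}
    min=${2:-20}                # 512-byte blocks; +20 means larger than 10k
    tmp=$(mktemp) || return 1   # scratch file instead of a fixed /tmp/f.tmp

    # 1. Checksum every regular file; cksum prints "CRC SIZE PATH".
    # 2. Sort byte-wise so identical CRC/size pairs end up adjacent.
    find "$dir" -type f -size +"$min" -exec cksum {} \; | LC_ALL=C sort > "$tmp"

    # 3. Keep only the "CRC SIZE" columns, and only the pairs that repeat.
    # 4. Turn each pair into a pattern anchored at the start of the line
    #    ("^CRC SIZE "), then pull the matching full lines back out.
    cut -f 1,2 -d ' ' "$tmp" | uniq -d | sed 's/^/^/; s/$/ /' | grep -f - "$tmp"

    rm -f "$tmp"
}
```

Running 'list_dups . > dup.txt' should give output in the same "CRC SIZE PATH" format as the one-liner. The anchoring is there so that a pair such as "123 456" cannot accidentally match in the middle of some other line.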
Notes:
- This will ignore files that are smaller than 10k, since 'find' counts size in 512-byte blocks (remove or alter the '-size' test to change this). But a warning: really small files may produce identical CRCs, i.e. show up as duplicates even if they really aren't.
- If you want to search a filesystem you don't own (e.g. /) you'll need to use sudo or su, or 'find' will complain about directories it can't read.
- The built-in cksum cmd only uses CRC32. MD5 would be better. Anyone know why it's not enabled under OSX?
- If you're gonna write a script to delete the duplicates from the produced dup.txt list, just remember that it contains ALL instances of the duplicate files.
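As a starting point for such a script, here is a minimal sketch that only prints the redundant copies and deletes nothing. It assumes dup.txt holds "CRC SIZE PATH" lines with each CRC/size group on adjacent lines, as the sorted output above should be. The function name extra_copies is made up for this example, and the cmp check is my addition to guard against CRC false positives.

```shell
#!/bin/sh
# Sketch only: extra_copies FILE reads "CRC SIZE PATH" lines (grouped by
# CRC and size, as in dup.txt) and prints every path that is a
# byte-for-byte copy of the first file in its group.
# The first file of each group is treated as the one to keep.
extra_copies() {
    prev_key=
    keeper=
    while read -r crc size path; do
        key="$crc $size"
        if [ "$key" != "$prev_key" ]; then
            # New CRC/size group: remember its first file and keep it.
            prev_key=$key
            keeper=$path
        elif cmp -s "$keeper" "$path"; then
            # Same CRC and size, and cmp confirms the content is identical.
            printf '%s\n' "$path"
        fi
    done < "${1:-dup.txt}"
}
```

Look over the output of 'extra_copies dup.txt' before feeding it to anything destructive. File names with embedded newlines or leading/trailing spaces are not handled, and a file that only matches a later member of its group (not the first one) is left out on purpose.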

