I wrote this program to deduplicate files on my backup file server. I would make a snapshot of a computer or filesystem and call it something like backup-of-foo-Apr-01-2010. Later on, I would make another snapshot and call it something like backup-of-foo-May-01-2010. In most cases, the two snapshots contained very similar files. I saw no need to keep multiple copies of the same file around, and I saved quite a bit of space by removing the duplicates, a process also known as deduplication.

When a file is duplicated, the program creates a hard link to the copy it keeps and then removes the duplicate. That way, if you want to recover some or all of the files from a snapshot, you still get every file that you put in the snapshot.

The program should work anywhere there is a Python 2.6 or newer interpreter, as long as the filesystem supports hard links. So don't try this on a Windows NT or newer system that uses a FAT or VFAT file system, because those file systems do not support hard links.

To use the program:

    de_dupe.py --total --one --size=1000000 foo-dir bar-dir

Contact info: jdeifik@jdeifik.com
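
For readers curious about how replacing duplicates with hard links can work, here is a minimal sketch of the idea described above. It is not the code in de_dupe.py; the function names, the `min_size` parameter, and the temporary file suffix are all hypothetical and chosen only for illustration. It assumes a POSIX system and hashes every file for simplicity.

```python
import hashlib
import os


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def replace_with_hard_link(keep, duplicate):
    """Replace `duplicate` with a hard link to `keep`.

    The link is created under a temporary name and then renamed over the
    duplicate, so the duplicate path never goes missing if linking fails.
    Assumes POSIX rename semantics (rename replaces an existing file).
    """
    tmp = duplicate + ".dedupe-tmp"  # hypothetical temporary name
    os.link(keep, tmp)
    os.rename(tmp, duplicate)


def dedupe(dirs, min_size=1):
    """Hard-link identical regular files found under `dirs`.

    Files smaller than `min_size` bytes are skipped (a hypothetical
    parameter of this sketch). Returns the number of files replaced.
    """
    seen = {}  # (size, digest) -> first path seen with that content
    replaced = 0
    for top in dirs:
        for root, _subdirs, names in os.walk(top):
            for name in sorted(names):
                path = os.path.join(root, name)
                # Skip symlinks and anything that is not a regular file.
                if os.path.islink(path) or not os.path.isfile(path):
                    continue
                st = os.stat(path)
                if st.st_size < min_size:
                    continue
                key = (st.st_size, file_digest(path))
                keep = seen.setdefault(key, path)
                if keep == path:
                    continue  # first file seen with this content
                keep_st = os.stat(keep)
                if keep_st.st_dev != st.st_dev:
                    continue  # hard links cannot cross filesystems
                if keep_st.st_ino == st.st_ino:
                    continue  # already hard-linked together
                replace_with_hard_link(keep, path)
                replaced += 1
    return replaced
```

With this sketch, calling `dedupe(["foo-dir", "bar-dir"])` would leave every path in both directories readable, while identical files share one copy on disk. Because hard-linked paths share a single inode, they also share permissions, ownership, and timestamps, which is worth keeping in mind before running anything like this on real backups.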