[RndTbl] Dupes

Sean Walberg sean at ertw.com
Fri Mar 25 15:11:11 CDT 2011


Assuming the file names can differ and that the hashes aren't already in a
db somewhere:

find / -type f -exec md5sum {} + | sort > md5sums   # -exec ... + copes with spaces and long arg lists
cut -c1-32 md5sums | uniq -d > dupes                # the hash is the first 32 characters; -d keeps only repeats
grep -f dupes md5sums

Not sure if a SHA sum would be faster, but md5sum is embedded in muscle
memory for me. There's probably a one-liner out of this.
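For what it's worth, GNU uniq can compare just the first N characters of each
line, which collapses the whole thing into one pipeline (a sketch, assuming
GNU coreutils; -w32 matches md5's 32 hex digits):

```shell
# group files whose md5 matches; -w32 compares only the hash column,
# and --all-repeated=separate prints each group of dupes with a blank line between groups
find / -type f -exec md5sum {} + | sort | uniq -w32 --all-repeated=separate
```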

Also you might want to scope that find to stay out of /proc, or stick to
/home. Exercise for the interested reader ;)
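If anyone wants to skip the exercise, something like this should do it (the
pruned paths are the usual suspects; adjust to taste):

```shell
# skip the pseudo-filesystems entirely
find / \( -path /proc -o -path /sys -o -path /dev \) -prune -o -type f -print

# or just stay on one filesystem, e.g. the one /home lives on
find /home -xdev -type f
```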

Sean

On Fri, Mar 25, 2011 at 3:01 PM, Kevin McGregor
<kevin.a.mcgregor at gmail.com>wrote:

> Would anyone like to volunteer suggestions for a utility that searches a
> filesystem for duplicate files? I'm running Ubuntu 10.04 on my server, and
> I'm sure I have lots of duplication, which I'd like to get rid of. I'm
> interested in both CLI and GUI solutions.
>
> Kevin
>
> _______________________________________________
> Roundtable mailing list
> Roundtable at muug.mb.ca
> http://www.muug.mb.ca/mailman/listinfo/roundtable
>
>


-- 
Sean Walberg <sean at ertw.com>    http://ertw.com/

