[[!comment format=mdwn username="http://adamspiers.myopenid.com/" nickname="Adam" subject="List the duplicate filenames, then let the user decide what to do" date="2011-12-22T12:31:29Z" content=""" I have the same use case as Asheesh but I want to be able to see which filenames point to the same objects and then decide which of the duplicates to drop myself. I think git annex drop --by-contents would be the wrong approach because how does git-annex know which ones to drop? There's too much potential for error. Instead it would be great to have something like git annex finddups While it's easy enough to knock up a bit of shell or Perl to achieve this, that relies on knowledge of the annex symlink structure, so I think really it belongs inside git-annex. If this command gave output similar to the excellent `fastdup` utility: Scanning for files... 672 files in 10.439 seconds Comparing 2 sets of files... 2 files (70.71 MB/ea) /home/adam/media/flat/tour/flat-tour.3gp /home/adam/videos/tour.3gp Found 1 duplicate of 1 file (70.71 MB wasted) Scanned 672 files (1.96 GB) in 11.415 seconds then you could do stuff like git annex finddups | grep /home/adam/media/flat | xargs rm """]]