diff options
Diffstat (limited to 'doc/todo/wishlist:_Provide_a___34__git_annex__34___command_that_will_skip_duplicates/comment_6_f24541ada1c86d755acba7e9fa7cff24._comment')
-rw-r--r-- | doc/todo/wishlist:_Provide_a___34__git_annex__34___command_that_will_skip_duplicates/comment_6_f24541ada1c86d755acba7e9fa7cff24._comment | 16 |
1 files changed, 0 insertions, 16 deletions
diff --git a/doc/todo/wishlist:_Provide_a___34__git_annex__34___command_that_will_skip_duplicates/comment_6_f24541ada1c86d755acba7e9fa7cff24._comment b/doc/todo/wishlist:_Provide_a___34__git_annex__34___command_that_will_skip_duplicates/comment_6_f24541ada1c86d755acba7e9fa7cff24._comment deleted file mode 100644 index 5d8ac8e61..000000000 --- a/doc/todo/wishlist:_Provide_a___34__git_annex__34___command_that_will_skip_duplicates/comment_6_f24541ada1c86d755acba7e9fa7cff24._comment +++ /dev/null @@ -1,16 +0,0 @@ -[[!comment format=mdwn - username="http://joey.kitenet.net/" - nickname="joey" - subject="comment 6" - date="2011-12-22T16:39:24Z" - content=""" -My main concern with putting this in git-annex is that finding duplicates necessarily involves storing a list of every key and file in the repository, and git-annex is very carefully built to avoid things that require non-constant memory use, so that it can scale to very big repositories. (The only exception is the `unused` command, and reducing its memory usage is a continuing goal.) - -So I would rather come at this from a different angle.. like providing a way to output a list of files and their associated keys, which the user can then use in their own shell pipelines to find duplicate keys: - - git annex find --include '*' --format='${file} ${key}\n' | sort --key 2 | uniq --all-repeated --skip-fields=1 - -Which is implemented now! - -(Making that pipeline properly handle filenames with spaces is left as an exercise for the reader..) -"""]] |