Diffstat (limited to 'doc/todo')
-rw-r--r-- doc/todo/Deduplicate_archive___40__i.e._zip__41___files.mdwn | 2
-rw-r--r-- doc/todo/Deduplicate_archive___40__i.e._zip__41___files/comment_1_619840f2337b018ff165565325cf1a61._comment | 10
-rw-r--r-- doc/todo/wishlist:_annex.largefiles_support_for_mimetypes/comment_3_b2774d265de303143523607053811d23._comment | 12
3 files changed, 24 insertions, 0 deletions
diff --git a/doc/todo/Deduplicate_archive___40__i.e._zip__41___files.mdwn b/doc/todo/Deduplicate_archive___40__i.e._zip__41___files.mdwn
index 314c283bf..c551b08d7 100644
--- a/doc/todo/Deduplicate_archive___40__i.e._zip__41___files.mdwn
+++ b/doc/todo/Deduplicate_archive___40__i.e._zip__41___files.mdwn
@@ -5,3 +5,5 @@ In this scenario, an online service (Bandcamp), automatically creates the archiv
Would it be possible for git-annex to be able to detect this scenario (in a manner similar to zipcmp) and redirect an add/import to the already existing copy?
I've found this due to trying to decommission an old annex by `git annex import --clean-duplicates ~/annex_old/.git/annex/objects` and finding these files being left.
+
+[[done]] --[[Joey]]
diff --git a/doc/todo/Deduplicate_archive___40__i.e._zip__41___files/comment_1_619840f2337b018ff165565325cf1a61._comment b/doc/todo/Deduplicate_archive___40__i.e._zip__41___files/comment_1_619840f2337b018ff165565325cf1a61._comment
new file mode 100644
index 000000000..19548edb3
--- /dev/null
+++ b/doc/todo/Deduplicate_archive___40__i.e._zip__41___files/comment_1_619840f2337b018ff165565325cf1a61._comment
@@ -0,0 +1,10 @@
+[[!comment format=mdwn
+ username="http://joeyh.name/"
+ ip="209.250.56.7"
+ subject="comment 1"
+ date="2014-08-12T19:51:38Z"
+ content="""
+All you need to do is unzip your zip file, and then `git annex add` or `git annex import` its contents. It will then automatically deduplicate.
+
+Due to the way compression works, two zip (or gz) files with identical contents but different checksums are unlikely to share many bytes. So git-annex cannot help with de-duplicating them unless you unzip them.
+"""]]
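
The point made in comment 1 can be illustrated with a small sketch. It is not git-annex code; it only assumes that deduplication works on whole-file checksums, and the member name, payload, timestamps, and helper functions (`build_zip`, `member_digests`) are hypothetical. Two archives are built from the same content with different timestamps and compression levels, then compared both as archives and member by member.

```python
import hashlib
import io
import zipfile


def build_zip(members, date_time, compresslevel):
    """Return the bytes of a zip archive holding `members` (name -> bytes)."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        for name, data in members.items():
            info = zipfile.ZipInfo(name, date_time=date_time)
            info.compress_type = zipfile.ZIP_DEFLATED
            zf.writestr(info, data, compresslevel=compresslevel)
    return buf.getvalue()


def member_digests(archive_bytes):
    """Map each member name to the SHA-256 of its uncompressed content."""
    with zipfile.ZipFile(io.BytesIO(archive_bytes)) as zf:
        return {n: hashlib.sha256(zf.read(n)).hexdigest() for n in zf.namelist()}


# Hypothetical album contents; the same files are archived twice.
members = {"track01.txt": b"identical payload\n" * 1000}
first = build_zip(members, (2014, 8, 12, 19, 51, 38), compresslevel=9)
second = build_zip(members, (2014, 8, 13, 0, 0, 0), compresslevel=1)

# The archives differ as byte streams, so a whole-file checksum differs too.
print(hashlib.sha256(first).hexdigest() == hashlib.sha256(second).hexdigest())
# The unpacked members are byte-identical, so their checksums match.
print(member_digests(first) == member_digests(second))
```

The first comparison is false and the second is true, which is why unpacking before `git annex add` or `git annex import` lets the identical members deduplicate while the archives themselves do not.
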
diff --git a/doc/todo/wishlist:_annex.largefiles_support_for_mimetypes/comment_3_b2774d265de303143523607053811d23._comment b/doc/todo/wishlist:_annex.largefiles_support_for_mimetypes/comment_3_b2774d265de303143523607053811d23._comment
new file mode 100644
index 000000000..b2bcd87eb
--- /dev/null
+++ b/doc/todo/wishlist:_annex.largefiles_support_for_mimetypes/comment_3_b2774d265de303143523607053811d23._comment
@@ -0,0 +1,12 @@
+[[!comment format=mdwn
+ username="http://joeyh.name/"
+ ip="209.250.56.7"
+ subject="comment 3"
+ date="2014-08-12T18:58:49Z"
+ content="""
+I think that to support this, the annex.largefiles preferred content expression would need to be supplemented with checks not available in the normal preferred content language.
+
+In general, it's important that preferred content expressions be able to be evaluated without having the file content locally available, and it needs to be possible for a repository to evaluate the preferred content of a sibling repository and know if its sibling wants a file. These things would be defeated by any mime-based expressions. So such expressions should only be available in annex.largefiles and not in other preferred content expressions.
+
+Calling out to `file` or some other external program could work, although speed can be important. If the assistant is seeing a file frequently change, it's not ideal for it to be repeatedly running `file` on it. There does not seem to be a pure haskell MIME type checking library available at present.
+"""]]
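
One way to bound the cost described in comment 3 is to throttle the external call per path. The following is a minimal sketch under stated assumptions, not anything git-annex implements: `ThrottledMimeProbe`, `probe_with_file`, and the `min_interval` parameter are hypothetical names, and the only real interface used is `file --brief --mime-type`. The trade-off is that a cached answer may be stale for up to `min_interval` seconds.

```python
import subprocess
import time


def probe_with_file(path):
    """Ask file(1) for the MIME type of `path` (requires `file` on PATH)."""
    result = subprocess.run(
        ["file", "--brief", "--mime-type", path],
        check=True, capture_output=True, text=True,
    )
    return result.stdout.strip()


class ThrottledMimeProbe:
    """Re-run the probe for a given path at most once per `min_interval` seconds."""

    def __init__(self, probe=probe_with_file, min_interval=5.0, clock=time.monotonic):
        self._probe = probe
        self._min_interval = min_interval
        self._clock = clock
        self._last = {}  # path -> (time of last probe, MIME type)

    def mime_type(self, path):
        now = self._clock()
        cached = self._last.get(path)
        if cached is not None and now - cached[0] < self._min_interval:
            # Probed recently: reuse the answer instead of spawning `file` again.
            return cached[1]
        mime = self._probe(path)
        self._last[path] = (now, mime)
        return mime
```

The probe and clock are injectable so the throttling logic can be exercised without spawning a process; a caller watching a busy file would construct one `ThrottledMimeProbe()` and call `mime_type(path)` on each change event.
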