author: https://launchpad.net/~stephane-gourichon-lpad <stephane-gourichon-lpad@web> 2016-10-28 20:40:54 +0000
committer: admin <admin@branchable.com> 2016-10-28 20:40:54 +0000
commit: 1bc085e3a7233fd9e333fdae59eb8c12ea07fe6f
tree: 9fdaa4852710221a5dc9edf3c7a297e04659000e
parent: 950115b67e22665c3ddf8b0d58f1bb8313335227
Added a comment: Like it's written: annex only
-rw-r--r-- doc/git-annex-reinject/comment_1_070a87e0cb1bbc49088989293334e1fb._comment | 48
1 files changed, 48 insertions, 0 deletions
diff --git a/doc/git-annex-reinject/comment_1_070a87e0cb1bbc49088989293334e1fb._comment b/doc/git-annex-reinject/comment_1_070a87e0cb1bbc49088989293334e1fb._comment
new file mode 100644
index 000000000..07f5ec381
--- /dev/null
+++ b/doc/git-annex-reinject/comment_1_070a87e0cb1bbc49088989293334e1fb._comment
@@ -0,0 +1,48 @@
+[[!comment format=mdwn
+ username="https://launchpad.net/~stephane-gourichon-lpad"
+ nickname="stephane-gourichon-lpad"
+ avatar="http://cdn.libravatar.org/avatar/02d4a0af59175f9123720b4481d55a769ba954e20f6dd9b2792217d9fa0c6089"
+ subject="Like it's written: annex only"
+ date="2016-10-28T20:40:54Z"
+ content="""
+# Summary
+
+Just to make it explicit: `--known` mode operates on the *annex only*. If you try to reinject a file whose content is stored in the regular git part of the repository, and is therefore known in practice, `git-annex-reinject` will consider it *not known*.
+
+# Context
+
+I'm currently using `git-annex reinject --known` to tidy up a storage area that predates git-annex. It is progressively emptied of its big files, so that unknown files stand out in the nearly deserted directory hierarchy.
+
+Yet only actually annexed files will get removed.
+
+In my case big files are pictures (NEF, JPG), and regular git files are `xmp` metadata files used by http://darktable.org/ to store processing parameters. So, all xmp files linger there, whether they were committed in git or not, needing separate handling.
+
+# How to detect if a file is known to the regular git repository (not the annex)
+
+There must be a number of ways. I just hacked one:
+
+```
+HASH=$( git hash-object \"$FILEPATH\" )
+if git cat-file -e \"$HASH\"
+then
+ echo \"Known $FILEPATH\"
+else
+ echo \"Unknown $FILEPATH\"
+fi
+```
+
+This can be wrapped into a helper function and used in a `find | ...` one-liner to remove any file already known to git.
+
+## Caveats
+
+`git cat-file` will probably consider known any file whose content is stored in a git object, even if that object only belongs to a deleted branch or is otherwise unreachable. As a result, removing files based on this test may well lose information, not immediately, but at some subsequent `git gc`.
+
+Such a caveat is not surprising, as regular git content and annexed content have different \"scopes\" and lifetimes.
+
+# Question
+
+Joey, is there an alternative to `git-annex-reinject --known` that considers regular git content, too? Perhaps it's a pure git issue and therefore not part of git-annex's job?
+
+A quick test of `git-annex-import --clean-duplicates` shows similar behavior.
+
+"""]]
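
The comment above suggests wrapping the object test in a helper function and driving it from `find`. A minimal sketch of that idea follows, assuming it is run from inside the git repository. The function names `known_to_git` and `in_head_tree` and the path `/path/to/old-storage` are hypothetical, not part of git or git-annex. The second function is a stricter variant prompted by the caveat in the comment: it accepts only content present in the tree of `HEAD`, so objects that are merely lingering in the object store do not count.

```shell
# known_to_git FILE
# Succeeds if a git object with FILE's content exists in the object store,
# reachable or not (the same test as in the comment above).
known_to_git() (
    blob=$(git hash-object -- "$1" 2>/dev/null) || exit 2
    git cat-file -e "$blob" 2>/dev/null
)

# in_head_tree FILE
# Stricter variant: succeeds only if a blob with FILE's content is listed
# in the tree of HEAD, under any path.
in_head_tree() (
    blob=$(git hash-object -- "$1" 2>/dev/null) || exit 2
    # ls-tree prints "<mode> <type> <object>TAB<path>"; field 3 is the object id.
    git ls-tree -r HEAD 2>/dev/null |
        awk -v h="$blob" '$3 == h { found = 1 } END { exit !found }'
)

# Hypothetical usage: list (not delete) xmp files whose content HEAD already has.
# find /path/to/old-storage -type f -name '*.xmp' | while IFS= read -r f; do
#     in_head_tree "$f" && printf 'known: %s\n' "$f"
# done
```

Printing the matches first, and deleting only after reviewing the list, keeps the information-loss risk described in the caveats under control.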