author | Joey Hess <joey@kitenet.net> | 2011-02-25 15:50:17 -0400 |
---|---|---|
committer | Joey Hess <joey@kitenet.net> | 2011-02-25 15:50:17 -0400 |
commit | 3f7c0b6970f58c1484615df7b3ec40f3ee44c9df (patch) | |
tree | 79abe22854aa1c99206dd6220f8c611b4f92940c /doc/todo | |
parent | 5cb24bebf4479f5fc83980e328bce511b36b05fd (diff) |
further investigation
Diffstat (limited to 'doc/todo')
-rw-r--r-- | doc/todo/smudge.mdwn | 32 |
1 files changed, 23 insertions, 9 deletions
diff --git a/doc/todo/smudge.mdwn b/doc/todo/smudge.mdwn
index f51f45a39..c51662b28 100644
--- a/doc/todo/smudge.mdwn
+++ b/doc/todo/smudge.mdwn
@@ -1,14 +1,34 @@
 git-annex should use smudge/clean filters.
 
-The trick is doing it efficiently. Since git a2b665d, 2011-01-05,
+The clean filter is run when files are staged for commit. So a user could copy
+any file into the annex, git add it, and git-annex's clean filter causes
+the file's key to be staged, while its value is added to the annex.
+
+The smudge filter is run when files are checked out. Since git annex
+repos have partial content, this would not git annex get the file content.
+Instead, if the content is not currently available, it would need to do
+something like return empty file content. (Sadly, it cannot create a
+symlink, as git still wants to write the file afterwards.
+
+So the nice current behavior of unavailable files being clearly missing due
+to dangling symlinks, would be lost when using smudge/clean filters.
+(Contact git developers to get an interface to do this?)
+
+Instead, we get the nice behavior of not having to remeber to `git annex
+add` files, and just being able to use `git add` or `git commit -a`,
+and have it use git-annex when .gitattributes says to. Also, annexed
+files can be directly modified without having to `git annex unlock`.
+
+### efficiency
+
+The trick is doing it efficiently. Since git a2b665d, v1.7.4.1,
 something like this works to provide a filename to the clean script:
 
 	git config --global filter.huge.clean huge-clean %f
 
 This avoids it needing to read all the current file content from stdin
 when doing eg, a git status or git commit. Instead it is passed the
-filename that git is operating on, I think that's from the working
-directory.
+filename that git is operating on, in the working directory.
 
 So, WORM could just look at that file and easily tell if it is one
 it already knows (same mtime and size). If so, it can short-circuit and
@@ -32,12 +52,6 @@ I've a demo implementation of this technique in the scripts below.
 
 ----
 
-It may further be possible to use the %f with the smudge filter
-(docs say it's supported), and instead of outputting the dummy content,
-it could create a dangling symlink, which would be more like git-annex's
-behavior now, and makes it easy to tell what content is not available
-with `ls`.
-
 ### test files
 
 huge-smudge:
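
The short-circuit idea in the patch above (recognise an already-known file from its size and mtime, without reading its content) can be sketched as a clean filter. This is only an illustration in Python, not the demo scripts the patch refers to. The key format is merely WORM-like, and the store directory `.git/huge-store`, the function names `worm_key` and `clean`, and the script name used below are all hypothetical. It also assumes the filter is started with the working tree as its current directory, so that the path passed via `%f` resolves.

```python
import os
import shutil
import sys


def worm_key(path):
    """Derive a WORM-like key from stat data only; the content is never read."""
    st = os.stat(path)
    return "WORM-s%d-m%d--%s" % (st.st_size, int(st.st_mtime),
                                 os.path.basename(path))


def clean(path, store_dir, stdin, stdout):
    """Emit the key for `path` on stdout.

    Returns True when the key was already in the store, in which case
    stdin is left untouched (the short-circuit). Otherwise the content
    arriving on stdin is copied into the store under that key.
    """
    key = worm_key(path)
    stored = os.path.join(store_dir, key)
    known = os.path.exists(stored)
    if not known:
        os.makedirs(store_dir, exist_ok=True)
        with open(stored, "wb") as out:
            shutil.copyfileobj(stdin, out)
    stdout.write(os.fsencode(key) + b"\n")
    return known


# Only runs when invoked with exactly one argument, the path git passes as %f.
if __name__ == "__main__" and len(sys.argv) == 2:
    clean(sys.argv[1], os.path.join(".git", "huge-store"),  # hypothetical store
          sys.stdin.buffer, sys.stdout.buffer)
```

Saved as, say, `huge-clean-sketch.py`, it could be wired up along the lines of the patch's own example, e.g. `git config filter.huge.clean "python3 huge-clean-sketch.py %f"`. Whether a given git version tolerates a filter that exits without draining stdin is not something this sketch checks.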