author | Joey Hess <joeyh@joeyh.name> | 2015-11-24 11:39:47 -0400 |
---|---|---|
committer | Joey Hess <joeyh@joeyh.name> | 2015-11-24 11:39:47 -0400 |
commit | 026d9f537a9f95d6752d6afc13d149cb594796ea (patch) | |
tree | 927126377c418afda4d49f073a705b4742b52160 /doc/todo | |
parent | e0319618f008f4838a0636a3289c2d489fda7cf5 (diff) |
file map analysis
Diffstat (limited to 'doc/todo')
-rw-r--r-- | doc/todo/smudge.mdwn | 81 |
1 files changed, 65 insertions, 16 deletions
diff --git a/doc/todo/smudge.mdwn b/doc/todo/smudge.mdwn
index 335f69be8..aea0c9b98 100644
--- a/doc/todo/smudge.mdwn
+++ b/doc/todo/smudge.mdwn
@@ -162,8 +162,8 @@ Data:
 be a message saying that the file's content is not currently available.
 An annex pointer file is checked into the git repository the same way
 that an annex symlink is checked in.
-* file2key maps are maintained by git-annex, to keep track of
- what files are pointers at keys.
+* A file map is maintained by git-annex, to keep track of the keys
+ that are used by files in the working tree.
 
 Configuration:
 
@@ -206,16 +206,16 @@ git-annex clean:
 also drop that copy once the object gets uploaded to another repo ...
 But that gets complicated quickly.
 
- Update file2key map.
+ Update file map.
 
 Output the pointer file content to stdout.
 
 git-annex smudge:
 
-* Run by eg `git checkout` and passed the filename, as well as fed
- the pointer file content on stdin.
+* Run by eg `git checkout`
+ and passed the filename, as well as fed the pointer file content on stdin.
 
- Updates file2key map.
+ Update file map.
 
 When an object is present in the annex, outputs its content to stdout.
 Otherwise, outputs the file pointer content.
@@ -242,16 +242,65 @@ git annex lock/unlock:
 itself to break such a hard link. Always finish by locking down
 the permissions of the annex object.
 
-All other git-annex commands that look at annex symlinks to get keys will
-need fall back to checking if a given work tree file is stored in git as
-pointer file. This can be done by checking the file2key map (or by looking
-it up in the index).
-
-Note that I have not verified if file2key maps can be maintained
-consistently using the smudge/clean filters. Seems likely to work,
-based on when I see smudge/clean filters being run. The file2key
-optimisation may not be needed though, looking at the index
-might be fast enough.
+#### file map
+
+The file map needs to map from `Key -> [File]`. `File -> Key`
+seems useful to have, but in practice is not worthwhile.
+
+Drop and get operations need to know what files in the work tree use a
+given key in order to update the work tree.
+
+git-annex commands that look at annex symlinks to get keys to act on will
+need fall back to either consulting the file map, or looking at the staged
+file to see if it's a pointer to a key. So a `File -> Key` map is a possible
+optimisation.
+
+Question: If the smudge/clean filters update the file map incrementally
+based on the pointer files they generate/see, will the result
+always be consistent with the content of the working tree?
+
+This depends on when git calls the smudge/clean filters and on what.
+In particular:
+
+* Does the clean filter always get called when adding a relevant
+ file to git? Yes.
+* Is the clean filter called at any other time? Yes, for example
+ git diff will clean relevant modified files to generate the diff.
+ So, the clean filter may see file versions that have not yet been staged
+ in git.
+* Is the clean filter ever passed content not in the work tree?
+ I don't think so, but not 100% sure.
+* Is the smudge filter always called when git updates a relevant file
+ in the work tree? Yes.
+* Is the smudge filter called at any other time? Seems unlikely but then
+ there could be situations with a detached work tree or such.
+* Does git call any useful hooks when removing a file from the work tree,
+ or converting it to not be annexed?
+ No!
+
+From this analysis, any file map generated by the smudge/clean filters
+is necessary potentially innaccurate. It may list deleted files.
+It may or may not reflect current unstaged changes from the work tree.
+
+Follows that any use of the file map needs to verify the info from it,
+and throw out bad cached info (updating the map to match reality).
+
+When downloading a key, check if the files listed in the file map are
+still pointer files in the work tree, and only replace them with the
+content if so.
+
+When dropping a key, check if the files listed for it in the file map are
+unmodified in the work tree, and are staged as pointers to the key,
+and only reset them to the pointers if so. Note that this means that
+a modified work tree file that has not yet been staged, but that
+corresponds to a key, won't be reset when the key is dropped.
+This is probably not a big deal; the user will either add the
+file, which will add the key back, or reset the file.
+
+Does the `File -> Key` map have any benefits given this innaccuracy?
+Answer seems to be no; any answer that map gives may be innaccurate and
+needs to be verified by looking at actual repo content, so might as well
+just look at the repo content in the first place..
 
 #### Upgrading
 
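Illustrative note on the "file map" text added by this commit: the rule "verify the info from the map, and throw out bad cached info" can be sketched roughly as below. This is a minimal Python sketch under stated assumptions, not git-annex's actual (Haskell) implementation. The `FileMap` class, the `read_pointer` helper and the `pointer:` file format are all hypothetical names invented for the example. Only the download case is covered, where a listed file is replaced with content only if it is still a pointer to the key.

```python
import os
import tempfile
from collections import defaultdict

# Hypothetical pointer-file format, for illustration only; the real
# git-annex pointer format is not assumed here.
POINTER_PREFIX = "pointer:"


def read_pointer(path):
    """Return the key a work tree file points at, or None if it is not a pointer."""
    try:
        with open(path, "r", encoding="utf-8") as f:
            content = f.read(4096)
    except (OSError, UnicodeDecodeError):
        return None  # deleted, unreadable, or non-text content
    if content.startswith(POINTER_PREFIX):
        return content[len(POINTER_PREFIX):].strip()
    return None


class FileMap:
    """Key -> set of work tree files, treated as a cache that may be stale."""

    def __init__(self):
        self._files = defaultdict(set)

    def record(self, key, path):
        # What a smudge/clean filter might do when it sees a pointer for `path`.
        self._files[key].add(path)

    def files_for(self, key):
        return set(self._files.get(key, ()))

    def verified_pointers(self, key):
        """Return files that are still pointers to `key`, pruning stale entries."""
        good = {p for p in self.files_for(key) if read_pointer(p) == key}
        if good:
            self._files[key] = good
        else:
            self._files.pop(key, None)
        return good


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tree:
        kept = os.path.join(tree, "kept")
        with open(kept, "w", encoding="utf-8") as f:
            f.write(POINTER_PREFIX + "KEY1\n")
        fm = FileMap()
        fm.record("KEY1", kept)
        fm.record("KEY1", os.path.join(tree, "deleted"))  # never created
        print(sorted(fm.verified_pointers("KEY1")))
```

The drop case would need an additional check against the staged content, which this sketch leaves out.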