author Joey Hess <joeyh@joeyh.name> 2015-12-24 19:23:18 -0400
committer Joey Hess <joeyh@joeyh.name> 2015-12-24 19:23:18 -0400
commit 819389465d4caedd10e905f0945c60e3fc67c8ea
tree 7c722fbf22d2461fd38fe1986d157de322c17922 /doc
parent 378417c70338418ce2fd42643cad5b2f31d7ed8e
parent a9b36eb958b2dec1cefefe92262965b0f7dceb27
Merge branch 'smudge'
Diffstat (limited to 'doc')
-rw-r--r-- doc/direct_mode.mdwn 7
-rw-r--r-- doc/git-annex-add.mdwn 18
-rw-r--r-- doc/git-annex-direct.mdwn 6
-rw-r--r-- doc/git-annex-indirect.mdwn 5
-rw-r--r-- doc/git-annex-init.mdwn 7
-rw-r--r-- doc/git-annex-lock.mdwn 2
-rw-r--r-- doc/git-annex-pre-commit.mdwn 8
-rw-r--r-- doc/git-annex-smudge.mdwn 43
-rw-r--r-- doc/git-annex-unlock.mdwn 12
-rw-r--r-- doc/git-annex.mdwn 8
-rw-r--r-- doc/todo/smudge.mdwn 87
-rw-r--r-- doc/upgrades.mdwn 40
12 files changed, 212 insertions, 31 deletions
diff --git a/doc/direct_mode.mdwn b/doc/direct_mode.mdwn
index 4c2cb2dd7..d3e1067f9 100644
--- a/doc/direct_mode.mdwn
+++ b/doc/direct_mode.mdwn
@@ -9,6 +9,13 @@ understand how to update its working tree.
[[!toc]]
+## deprecated
+
+Direct mode is deprecated! Instead, git-annex v6 repositories can simply
+have files that are unlocked and thus can be directly accessed and
+modified. See [[upgrades]] for details about the transition to v6
+repositories.
+
## enabling (and disabling) direct mode
Normally, git-annex repositories start off in indirect mode. With some
diff --git a/doc/git-annex-add.mdwn b/doc/git-annex-add.mdwn
index cfeb8a98e..7f796fec1 100644
--- a/doc/git-annex-add.mdwn
+++ b/doc/git-annex-add.mdwn
@@ -11,12 +11,18 @@ git annex add `[path ...]`
Adds files in the path to the annex. If no path is specified, adds
files from the current directory and below.
-Normally, files that are already checked into git, or that git has been
-configured to ignore will be silently skipped.
-
-If annex.largefiles is configured, and does not match a file that is being
-added, `git annex add` will behave the same as `git add` and add the
-non-large file directly to the git repository, instead of to the annex.
+Files that are already checked into git and are unmodified, or that
+git has been configured to ignore will be silently skipped.
+
+If annex.largefiles is configured, and does not match a file, `git annex
+add` will behave the same as `git add` and add the non-large file directly
+to the git repository, instead of to the annex.
+
+Large files are added to the annex in locked form, which prevents further
+modification of their content unless unlocked by [[git-annex-unlock]](1).
+(This is not the case however when a repository is in direct mode.)
+To add a file to the annex in unlocked form, `git add` can be used instead
+(that only works when the repository has annex.version 6 or higher).
This command can also be used to add symbolic links, both symlinks to
annexed content, and other symlinks.
diff --git a/doc/git-annex-direct.mdwn b/doc/git-annex-direct.mdwn
index 457ae3116..3cade1a8c 100644
--- a/doc/git-annex-direct.mdwn
+++ b/doc/git-annex-direct.mdwn
@@ -17,12 +17,18 @@ Note that git commands that operate on the work tree will refuse to
run in direct mode repositories. Use `git annex proxy` to safely run such
commands.
+Note that the direct mode/indirect mode distinction is removed in v6
+git-annex repositories. In such a repository, you can
+use [[git-annex-unlock]](1) to make a file's content be directly present.
+
# SEE ALSO
[[git-annex]](1)
[[git-annex-indirect]](1)
+[[git-annex-unlock]](1)
+
# AUTHOR
Joey Hess <id@joeyh.name>
diff --git a/doc/git-annex-indirect.mdwn b/doc/git-annex-indirect.mdwn
index 99def6144..321e0fb36 100644
--- a/doc/git-annex-indirect.mdwn
+++ b/doc/git-annex-indirect.mdwn
@@ -11,9 +11,8 @@ git annex indirect
Switches a repository back from direct mode to the default, indirect
mode.
-Some systems cannot support git-annex in indirect mode, because they
-do not support symbolic links. Repositories on such systems instead
-default to using direct mode.
+Note that the direct mode/indirect mode distinction is removed in v6
+git-annex repositories.
# SEE ALSO
diff --git a/doc/git-annex-init.mdwn b/doc/git-annex-init.mdwn
index 145705105..29522181d 100644
--- a/doc/git-annex-init.mdwn
+++ b/doc/git-annex-init.mdwn
@@ -24,6 +24,13 @@ mark it as dead (see [[git-annex-dead]](1)).
This command is entirely safe, although usually pointless, to run inside an
already initialized git-annex repository.
+# OPTIONS
+
+* `--version=N`
+
+ Force the repository to be initialized using a different annex.version
+ than the current default.
+
# SEE ALSO
[[git-annex]](1)
diff --git a/doc/git-annex-lock.mdwn b/doc/git-annex-lock.mdwn
index 4bf279fb2..b9e5d3450 100644
--- a/doc/git-annex-lock.mdwn
+++ b/doc/git-annex-lock.mdwn
@@ -9,7 +9,7 @@ git annex lock `[path ...]`
# DESCRIPTION
Use this to undo an unlock command if you don't want to modify
-the files, or have made modifications you want to discard.
+the files any longer, or have made modifications you want to discard.
# OPTIONS
diff --git a/doc/git-annex-pre-commit.mdwn b/doc/git-annex-pre-commit.mdwn
index bc1e86e18..21e5aef68 100644
--- a/doc/git-annex-pre-commit.mdwn
+++ b/doc/git-annex-pre-commit.mdwn
@@ -12,10 +12,14 @@ This is meant to be called from git's pre-commit hook. `git annex init`
automatically creates a pre-commit hook using this.
Fixes up symlinks that are staged as part of a commit, to ensure they
-point to annexed content. Also handles injecting changes to unlocked
-files into the annex. When in a view, updates metadata to reflect changes
+point to annexed content.
+
+When in a view, updates metadata to reflect changes
made to files in the view.
+When in a repository that has not been upgraded to annex.version 6,
+also handles injecting changes to unlocked files into the annex.
+
# SEE ALSO
[[git-annex]](1)
diff --git a/doc/git-annex-smudge.mdwn b/doc/git-annex-smudge.mdwn
new file mode 100644
index 000000000..7439c8784
--- /dev/null
+++ b/doc/git-annex-smudge.mdwn
@@ -0,0 +1,43 @@
+# NAME
+
+git-annex smudge - git filter driver for git-annex
+
+# SYNOPSIS
+
+git annex smudge [--clean] file
+
+# DESCRIPTION
+
+This command lets git-annex be used as a git filter driver, which allows
+annexed files in the git repository to be unlocked at all times, instead
+of being symlinks.
+
+When adding a file with `git add`, the annex.largefiles config is
+consulted to decide if a given file should be added to git as-is,
+or if its content is large enough to need to use git-annex.
+
+The git configuration to use this command as a filter driver is as follows.
+This is normally set up for you by git-annex init, so you should
+not need to configure it manually.
+
+ [filter "annex"]
+ smudge = git-annex smudge %f
+ clean = git-annex smudge --clean %f
+
+To make git use that filter driver, it needs to be configured in
+the .gitattributes file or in `.git/info/attributes`. The latter
+is normally configured when a repository is initialized, with the following
+contents:
+
+ * filter=annex
+ .* !filter
+
+# SEE ALSO
+
+[[git-annex]](1)
+
+# AUTHOR
+
+Joey Hess <id@joeyh.name>
+
+Warning: Automatically converted into a man page by mdwn2man. Edit with care.
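Illustration (not part of the patch): the two attribute lines in the new man page rely on git's rule that, for a given attribute, the last matching line wins. The sketch below models only that rule, under the simplifying assumption that a pattern without a slash is matched against the file's basename. `RULES` and `filter_for` are hypothetical names used for this example and are not git or git-annex APIs.

```python
from fnmatch import fnmatch

# The two attribute lines quoted in the man page, in file order.
RULES = [
    ("*", "annex"),  # "* filter=annex": every file goes through the filter
    (".*", None),    # ".* !filter": dotfiles have the filter unspecified
]


def filter_for(path):
    """Return the filter name that applies to path, or None.

    Simplified model: each pattern is matched against the basename,
    and a later matching line overrides an earlier one.
    """
    basename = path.rsplit("/", 1)[-1]
    result = None
    for pattern, value in RULES:
        if fnmatch(basename, pattern):
            result = value
    return result
```

Under this model an ordinary file such as `photo.jpg` is routed through the `annex` filter, while dotfiles such as `.gitignore` bypass it and are stored in git as-is.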
diff --git a/doc/git-annex-unlock.mdwn b/doc/git-annex-unlock.mdwn
index ac8c21185..123146836 100644
--- a/doc/git-annex-unlock.mdwn
+++ b/doc/git-annex-unlock.mdwn
@@ -11,8 +11,16 @@ git annex unlock `[path ...]`
Normally, the content of annexed files is protected from being changed.
Unlocking an annexed file allows it to be modified. This replaces the
symlink for each specified file with a copy of the file's content.
-You can then modify it and `git annex add` (or `git commit`) to inject
-it back into the annex.
+You can then modify it and `git annex add` (or `git commit`) to save your
+changes.
+
+In repositories with annex.version 5 or earlier, unlocking a file is local
+to the repository, and is temporary. With version 6, unlocking a file
+changes how it is stored in the git repository (from a symlink to a pointer
+file), so you can commit it like any other change. Also in version 6, you
+can use `git add` to add a file to the annex in unlocked form. This allows
+workflows where a file starts out unlocked, is modified as necessary, and
+is locked once it reaches its final version.
# OPTIONS
diff --git a/doc/git-annex.mdwn b/doc/git-annex.mdwn
index 2020ccf3f..1a2fd6e67 100644
--- a/doc/git-annex.mdwn
+++ b/doc/git-annex.mdwn
@@ -626,6 +626,14 @@ subdirectories).
See [[git-annex-diffdriver]](1) for details.
+* `smudge`
+
+ This command lets git-annex be used as a git filter driver, allowing
+ annexed files in the git repository to be unlocked at all times, instead
+ of being symlinks.
+
+ See [[git-annex-smudge]](1) for details.
+
* `remotedaemon`
Detects when network remotes have received git pushes and fetches from them.
diff --git a/doc/todo/smudge.mdwn b/doc/todo/smudge.mdwn
index aea0c9b98..a62e19f68 100644
--- a/doc/todo/smudge.mdwn
+++ b/doc/todo/smudge.mdwn
@@ -158,7 +158,8 @@ Using git-annex on a crippled filesystem that does not support symlinks.
Data:
* An annex pointer file has as its first line the git-annex key
- that it's standing in for. Subsequent lines of the file might
+ that it's standing in for (prefixed with "annex/objects/", similar to
+ an annex symlink target). Subsequent lines of the file might
be a message saying that the file's content is not currently available.
An annex pointer file is checked into the git repository the same way
that an annex symlink is checked in.
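Illustration (not part of the patch): a minimal sketch of reading a pointer file as the Data section describes it, assuming the first line holds the key after an "annex/objects/" prefix and any later lines are an informational message. `key_from_pointer` is a hypothetical helper, and the key in the usage note is made up.

```python
PREFIX = "annex/objects/"


def key_from_pointer(content):
    """Return the key named on the first line of a pointer file, or None."""
    lines = content.splitlines()
    if not lines:
        return None
    # Tolerate a leading path before the prefix, as in a symlink target.
    _, found, rest = lines[0].partition(PREFIX)
    if not found:
        return None
    # The key is the final path component of what follows the prefix.
    key = rest.rsplit("/", 1)[-1]
    return key or None
```

For example, `key_from_pointer("annex/objects/SHA256E-s12--abc.txt\n")` yields the made-up key `SHA256E-s12--abc.txt`, while an ordinary text file yields `None`.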
@@ -177,8 +178,8 @@ Configuration:
the annex. Other files are passed through the smudge/clean as-is and
have their contents stored in git.
-* annex.direct is repurposed to configure how the assistant adds files.
- When set to true, they're added unlocked.
+* annex.direct is repurposed to configure how git-annex adds files.
+ When set to false, it adds symlinks and when true it adds pointer files.
git-annex clean:
@@ -232,15 +233,11 @@ git annex lock/unlock:
transition repositories to using pointers, and a cleaner unlock/lock
for repos using symlinks.
- unlock will stage a pointer file, and will copy the content of the object
- out of .git/annex/objects to the work tree file. (Might want a --hardlink
- switch.)
+ unlock will stage a pointer file, and will link the content of the object
+ from .git/annex/objects to the work tree file.
- lock will replace the current work tree file with the symlink, and stage it.
- Note that multiple work tree files could point to the same object.
- So, if the link count is > 1, replace the annex object with a copy of
- itself to break such a hard link. Always finish by locking down the
- permissions of the annex object.
+ lock will replace the current work tree file with the symlink, and stage it,
+ and lock down the permissions of the annex object.
#### file map
@@ -248,7 +245,8 @@ The file map needs to map from `Key -> [File]`. `File -> Key`
seems useful to have, but in practice is not worthwhile.
Drop and get operations need to know what files in the work tree use a
-given key in order to update the work tree.
+given key in order to update the work tree. And, we don't want to
+overwrite a work tree file if it's been modified when dropping or getting.
git-annex commands that look at annex symlinks to get keys to act on will
need to fall back to either consulting the file map, or looking at the staged
@@ -275,13 +273,14 @@ In particular:
* Is the smudge filter called at any other time? Seems unlikely but then
there could be situations with a detached work tree or such.
* Does git call any useful hooks when removing a file from the work tree,
- or converting it to not be annexed?
+ or converting it to not be annexed, or for `git mv` of an annexed file?
No!
From this analysis, any file map generated by the smudge/clean filters
is necessarily potentially inaccurate. It may list deleted files.
It may or may not reflect current unstaged changes from the work tree.
+
It follows that any use of the file map needs to verify the info from it,
and throw out bad cached info (updating the map to match reality).
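Illustration (not part of the patch): one way to read the "verify, then throw out bad cached info" conclusion is a `Key -> [File]` cache that re-checks every entry when queried. The `FileMap` class and the `key_of_file` callback are hypothetical; how a work tree file is mapped back to its key is deliberately left to the caller.

```python
from collections import defaultdict


class FileMap:
    """Hypothetical Key -> [File] cache that is verified on every read."""

    def __init__(self, key_of_file):
        # key_of_file(path) returns the key that a work tree file
        # currently stands for, or None if it is not annexed or is gone.
        self._key_of_file = key_of_file
        self._files = defaultdict(set)

    def record(self, key, path):
        """Cache the claim that path uses key (it may later go stale)."""
        self._files[key].add(path)

    def files_for(self, key):
        """Return the files still using key, dropping stale entries."""
        verified = {p for p in self._files[key] if self._key_of_file(p) == key}
        self._files[key] = verified  # update the map to match reality
        return sorted(verified)
```

Because every read filters through `key_of_file`, entries for deleted or re-pointed files vanish the first time they are consulted, which is the behaviour the analysis above calls for.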
@@ -306,17 +305,71 @@ just look at the repo content in the first place..
annex.version changes to 6
-Upgrade should be handled automatically.
+git config for filter.annex.smudge and filter.annex.clean is set up.
-On upgrade, update .gitattributes with a stock configuration, unless
-it already mentions "filter=annex".
+.gitattributes is updated with a stock configuration,
+unless it already mentions "filter=annex".
Upgrading a direct mode repo needs to switch it out of bare mode, and
needs to run `git annex unlock` on all files (or reach the same result).
So will need to stage changes to all annexed files.
When a repo has some clones indirect and some direct, the upgraded repo
-will have all files unlocked, necessarily in all clones.
+will have all files unlocked, necessarily in all clones. This happens
+automatically, because when the direct repos are upgraded that causes the
+files to be unlocked, while the indirect upgrades don't touch the files.
+
+#### implementation todo list
+
+* Still a few test suite failures for v6 with locked files.
+* Test suite should make a pass for v6 with unlocked files.
+* Reconcile staged changes into the associated files database, whenever
+ the database is queried. This is needed to handle eg:
+ git add largefile
+ git mv largefile othername
+ git annex move othername --to foo
+ # fails to drop content from associated file othername,
+ # because it doesn't know it has that name
+ # git commit clears up this mess
+* Interaction with shared clones. Should avoid hard linking from/to an
+ object in a shared clone if either repository has the object unlocked.
+ (And should avoid unlocking an object if it's hard linked to a shared clone,
+ but that's already accomplished because it avoids unlocking an object if
+ it's hard linked at all)
+* Make automatic merge conflict resolution work for pointer files.
+ - Should probably automatically handle merge conflicts between annex
+ symlinks and pointer files too. Maybe by always resulting in a pointer
+ file, since the symlinks don't work everywhere.
+* Crippled filesystem should cause all files to be transparently unlocked.
+ Note that this presents problems when dealing with merge conflicts and
+ when pushing changes committed in such a repo. Ideally, should avoid
+ committing implicit unlocks, or should prevent such commits leaking out
+ in pushes.
+* Dropping a smudged file causes git status (and git annex status)
+ to show it as modified, because the timestamp has changed.
+ Getting a smudged file can also cause this.
+ Upgrading a direct mode repo also leaves files in this state.
+ User can use `git add` to clear it up, but better to avoid this,
+ by updating stat info in the index.
+ (May need to use libgit2 to do this, cannot find
+ any plumbing except git-update-index, which is very inefficient for
+ smudged files.)
+* Audit code for all uses of isDirect. These places almost always need
+ adjusting to support v6, if they haven't already.
+* Optimisation: See if the database schema can be improved to speed things
+ up. Are there enough indexes? getAssociatedKey in particular does a
+ reverse lookup and might benefit from an index.
+* Optimisation: Reads from the Keys database avoid doing anything if the
+ database doesn't exist. This makes v5 repos, or v6 with all locked files
+ faster. However, if a v6 repo unlocks and then re-locks a file, its
+ database will exist, and so this optimisation will no longer apply.
+ Could try to detect when the database is empty, and remove it or avoid reads.
+
+* Eventually (but not yet), make v6 the default for new repositories.
+ Note that the assistant forces repos into direct mode; that will need to
+ be changed then.
+* Later still, remove support for direct mode, and enable automatic
+ v5 to v6 upgrades.
----
diff --git a/doc/upgrades.mdwn b/doc/upgrades.mdwn
index f5e9cbc3a..9d30c2f14 100644
--- a/doc/upgrades.mdwn
+++ b/doc/upgrades.mdwn
@@ -43,6 +43,46 @@ conflicts first before upgrading git-annex.
The upgrade events, so far:
+## v5 -> v6 (git-annex version 6.x)
+
+The upgrade from v5 to v6 is handled manually. Run `git-annex upgrade`
+to perform the upgrade.
+
+Warning: All places that a direct mode repository is cloned to should be
+running git-annex version 6.x before you upgrade the repository.
+This is necessary because the contents of the repository are changed
+in the upgrade, and the old version of git-annex won't be able to
+access files after the repo is upgraded.
+
+This upgrade does away with the direct mode/indirect mode distinction.
+A v6 git-annex repository can have some files locked and other files
+unlocked, and all git and git-annex commands can be used on both locked and
+unlocked files. (Although for locked files to work, the filesystem
+must support symbolic links.)
+
+The behavior of some commands changes in an upgraded repository:
+
+* `git add` will add files to the annex, in unlocked mode, rather than
+ adding them directly to the git repository. To cause some files to be
+ added directly to git, you can configure `annex.largefiles`. For
+ example:
+
+ git config annex.largefiles "largerthan=100kb and not (include=*.c or include=*.h)"
+
+* `git annex unlock` and `git annex lock` change how the pointer to
+ the annexed content is stored in git.
+
+If a repository is only used in indirect mode, you can use git-annex
+v5 and v6 in different clones of the same indirect mode repository without
+problems.
+
+On upgrade, all files in a direct mode repository will be converted to
+unlocked files. The upgrade will stage changes to all annexed files in
+the git repository, which you can then commit.
+
+If a repository has some clones using direct mode and some using indirect
+mode, all the files will end up unlocked in all clones after the upgrade.
+
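Illustration (not part of the patch): the `annex.largefiles` example above can be read as a predicate over a file's path and size. The sketch below mirrors only that one expression. It assumes "100kb" means 100,000 bytes and that the `include=` globs may match across directory separators. `is_large_file` is a hypothetical name, not a git-annex function.

```python
from fnmatch import fnmatch


def is_large_file(path, size_bytes):
    """Mirror: largerthan=100kb and not (include=*.c or include=*.h)."""
    larger_than_100kb = size_bytes > 100 * 1000  # assumption: kb = 1000 bytes
    is_c_source = fnmatch(path, "*.c") or fnmatch(path, "*.h")
    return larger_than_100kb and not is_c_source
```

Files for which the predicate holds would be added to the annex by `git add`. All other files, including C sources and headers of any size, would be added directly to git.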
## v4 -> v5 (git-annex version 5.x)
The upgrade from v4 to v5 is handled