summaryrefslogtreecommitdiff
diff options
context:
space:
mode:
authorGravatar Joey Hess <joey@kitenet.net>2013-11-04 14:34:34 -0400
committerGravatar Joey Hess <joey@kitenet.net>2013-11-04 14:34:34 -0400
commit367ddd8e6cc7d39bf9002024d100847219c47977 (patch)
tree9a7ed8a9fc3b80113dd83c142c9911a4608a6db5
parent2e808304f546536277adf7611e19a0c4f7108dfe (diff)
some thoughts for madduck
-rw-r--r--doc/internals.mdwn2
-rw-r--r--doc/internals/lockdown.mdwn44
2 files changed, 45 insertions, 1 deletions
diff --git a/doc/internals.mdwn b/doc/internals.mdwn
index 84d9bcb50..b508c3348 100644
--- a/doc/internals.mdwn
+++ b/doc/internals.mdwn
@@ -16,7 +16,7 @@ Each subdirectory has the [[name_of_a_key|key_format]] in one of the
[[key-value_backends|backends]]. The file inside also has the name of the key.
This two-level structure is used because it allows the write bit to be removed
from the subdirectories as well as from the files. That prevents accidentially
-deleting or changing the file contents.
+deleting or changing the file contents. See [[permissions]] for details.
In [[direct_mode]], file contents are not stored in here, and instead
are stored directly in the file. However, the same symlinks are still
diff --git a/doc/internals/lockdown.mdwn b/doc/internals/lockdown.mdwn
new file mode 100644
index 000000000..a7775cd3a
--- /dev/null
+++ b/doc/internals/lockdown.mdwn
@@ -0,0 +1,44 @@
+Object files stored in `.git/annex/objects` are each put in their own directory.
+This allows the write bit to be removed from both the file, and its directory,
+which prevents accidentially deleting or changing the file contents.
+
+The reasoning for doing this follows:
+
+Normally with git, once you have committed a file, editing the file in the
+working tree cannot cause you to lose the committed version. This is an
+important property of git. Of course you can `rm -rf .git` and delete
+commits if you like (before you've pushed them). But you can't lose a
+committed version of the file because of something you do with the working
+tree version.
+
+It's easy for git to do this, because committing a file makes a copy of it.
+But git-annex does not make a local copy of a file added to it, because
+the file could be very large.
+
+So, it's important for git-annex to find another way to preserve the expected
+property that once committed, you cannot accidentially lose a file.
+The most important protection it makes is just to remove the write bit of
+the file. Thus preventing programs from modifying it.
+
+But, that does not prevent any program that might follow the symlink and
+delete the symlinked file. This might seem an unlikely thing for a program to
+do at first, but consider a command like:
+`tar cf foo.tar foo --remove-files --dereference`
+
+When I tested this, I didn't know if it would remove the file foo symlinked
+to or not! It turned out that my tar doesn't remove it. But it could
+have easily went the other way.
+
+Rather than needing to worry about every possible program that might
+decide to do something like this, git-annex removes the write bit from the
+directory containing the annexed object, as well as removing the write
+bit from the file. (The only bad consequence of this is that `rm -rf .git`
+doesn't work unless you first run `chmod -R +w .git`)
+
+----
+
+It's known that this lockdown mechanism is incomplete. The worst hole in
+it is that if you explicitly run `chmod +w` on an annexed file in the working
+tree, this follows the symlink and allows writing to the file. It would be
+better to make the files fully immutable. But most systems either don't
+support immutable attributes, or only let root make files immutable.