aboutsummaryrefslogtreecommitdiff
path: root/doc
diff options
context:
space:
mode:
authorGravatar Joey Hess <joeyh@joeyh.name>2015-10-09 18:02:35 -0400
committerGravatar Joey Hess <joeyh@joeyh.name>2015-10-09 18:02:35 -0400
commitb65e678e7b557520be4b63eb0e91d88682e1dd42 (patch)
tree49d4c8d40dd77085b113e006f49661c7a12992c9 /doc
parent7433e26d25a2b8b2e0eef9692d1f9ba741bd370c (diff)
massive devblog
Diffstat (limited to 'doc')
-rw-r--r--doc/devblog/day_322-326__concurrent_drop_safety.mdwn49
1 files changed, 49 insertions, 0 deletions
diff --git a/doc/devblog/day_322-326__concurrent_drop_safety.mdwn b/doc/devblog/day_322-326__concurrent_drop_safety.mdwn
new file mode 100644
index 000000000..6b8da87f7
--- /dev/null
+++ b/doc/devblog/day_322-326__concurrent_drop_safety.mdwn
@@ -0,0 +1,49 @@
+Well, I've spent all week making `git annex drop --from` safe.
+
+On Tuesday I got a sinking feeling in my stomach, as I realized that
+there was hole in git-annex's armor to prevent concurrent drops from
+violating numcopies or even losing the last copy of a file.
+[[The bug involved an unlikely race condition|bugs/concurrent_drop--from_presence_checking_failures]],
+and for all I know it's never happened in real life, but still this is not
+good.
+
+Since this is a potential data loss bug, expect a release pretty soon
+with the fix. And, there are 2 things to keep in mind about the fix:
+
+1. If a ssh remote is using an old version of git-annex, a drop may fail.
+ Solution will be to just upgrade the git-annex on the remote to the
+ fixed version.
+2. When a file is present in several special remotes, but not in any
+ accessible git repositories, dropping it from one of the special
+ remotes will now fail, where before it was allowed.
+
+ Instead, the file has to be moved from one of the special remotes to
+ the git repository, and can then safely be dropped from the git repository.
+
+ This is a worrysome behavior change, but unavoidable.
+
+Solving this clearly called for more locking, to prevent concurrency
+problems. But, at first I couldn't find a solution that would allow
+dropping content that was only located on special remotes. I didn't want to
+make special remotes need to involve locking; that would be a nightmare to
+implement, and probably some existing special remotes don't have any
+way to do locking anyway.
+
+Happily, after thinking about it all through Wednesday, I found a solution,
+that while imperfect (see above) is probably the best one feasible. If my
+analysis is correct (and it seems so, although I'd like to write a more
+formal proof than the ad-hoc one I have so far), no locking is needed on
+special remotes, as long as the locking is done just right on the git repos
+and remotes. While this is not able to guarantee that numcopies is always
+preserved, it is able to guarantee that the last copy of a file is never
+removed. And, numcopies *will* always be preserved except for when this
+rare race condition occurs.
+
+So, I've been implementing that all of yesterday and today. Getting it
+right involves building up 4 different kinds of evidence, which can be
+used to make sure that the last copy of a file can't possibly be being
+dropped, no matter what other concurrent drops could be happening.
+I ended up with a very clean and robust implementation of this, and
+a 2 thousand line diff.
+
+Whew!