From 20d729514adaac04a33f0fbf99016a1276782f81 Mon Sep 17 00:00:00 2001 From: Joey Hess Date: Mon, 5 Dec 2011 22:54:30 -0400 Subject: added a branch for the new monad-control 0.3 --- doc/todo/Please_add_support_for_monad-control_0.3.x.mdwn | 7 +++++-- 1 file changed, 5 insertions(+), 2 deletions(-) diff --git a/doc/todo/Please_add_support_for_monad-control_0.3.x.mdwn b/doc/todo/Please_add_support_for_monad-control_0.3.x.mdwn index 8b88f103e..ca68c2c91 100644 --- a/doc/todo/Please_add_support_for_monad-control_0.3.x.mdwn +++ b/doc/todo/Please_add_support_for_monad-control_0.3.x.mdwn @@ -1,4 +1,7 @@ Git-annex doesn't compile with the latest version of monad-control. Would it be hard to support that new version? -> I hope not. I have been waiting for it to land in Debian before trying to -> deal with its changes. --[[Joey]] +> I have been waiting for it to land in Debian before trying to +> deal with its changes. +> +> There is now a branch in git called `new-monad-control` that will build +> with the new monad-control. --[[Joey]] -- cgit v1.2.3 From 27ce7ba5b49cf2ec4c8a778f90320d714fd17155 Mon Sep 17 00:00:00 2001 From: "https://www.google.com/accounts/o8/id?id=AItOawnXybLxkPMYpP3yw4b_I6IdC3cKTD-xEdU" Date: Tue, 6 Dec 2011 12:43:56 +0000 Subject: --- ...it_annex_add_crash_and_subsequent_recovery.mdwn | 25 ++++++++++++++++++++++ 1 file changed, 25 insertions(+) create mode 100644 doc/forum/git_annex_add_crash_and_subsequent_recovery.mdwn diff --git a/doc/forum/git_annex_add_crash_and_subsequent_recovery.mdwn b/doc/forum/git_annex_add_crash_and_subsequent_recovery.mdwn new file mode 100644 index 000000000..3f3b943a0 --- /dev/null +++ b/doc/forum/git_annex_add_crash_and_subsequent_recovery.mdwn @@ -0,0 +1,25 @@ +Perhaps stupidly I added some very large bare git repos into a git-annex. + +This took a very long time, used lot's of memory, and then crashed. I didn't catch the error (which is annoying) - sorry about that. IIRC it is the same error if one Ctrl-c's the addition. + +I ran `git annex add .` a second time and eventually killed it (I perhaps should have waited - I now think it was working). + +A `git annex unannex` fixed up some files but somehow I managed to end up with tonnes of files all sym-linked into the git annex object directory but not somehow recognised as annexed files. I'm assuming that they somehow didn't make it into git annex's meta-data layer (or equivalent). + +Commands such as `git annex {fsck,whereis,unannex} weirdfile` immediately returned without error. + +I've now spent a lot of manual time copying the files back. Doing the following, not the cleverest but I was a little panicky about my data... + + find . -type l -exec mv \{} \{}.link \; #Move link names out of the way + find . -type l -exec cp \{} \{}.cp \; #Copy follows links so we can copy target back to link location + find . -type f -name "*.link.cp" | xargs -n 1 rename 's/\.link\.cp//' #Change to original name + find . -type l -exec rm \{} \; #Ditch the links + git annex unused + git annex dropunused `seq 9228` + +9228 files were found to be unused, this gives an idea of the scale of the number of "lost" files for want of a better term. + +A pretty poor bug report as these things go. Anyone any idea what might have happened (it didn't seem space or memory related)? Or how I might have fixed it a little more cleverly? + +For reference I am using stable Debian, git annex version 3.20111011. + -- cgit v1.2.3 From d62b060f0f8f765b8b7f79c453b5f1c363403f19 Mon Sep 17 00:00:00 2001 From: "https://www.google.com/accounts/o8/id?id=AItOawnXybLxkPMYpP3yw4b_I6IdC3cKTD-xEdU" Date: Tue, 6 Dec 2011 12:50:28 +0000 Subject: Added a comment --- ...comment_1_062d0153a379c1ba1df8585b90220d3d._comment | 18 ++++++++++++++++++ 1 file changed, 18 insertions(+) create mode 100644 doc/forum/git_annex_add_crash_and_subsequent_recovery/comment_1_062d0153a379c1ba1df8585b90220d3d._comment diff --git a/doc/forum/git_annex_add_crash_and_subsequent_recovery/comment_1_062d0153a379c1ba1df8585b90220d3d._comment b/doc/forum/git_annex_add_crash_and_subsequent_recovery/comment_1_062d0153a379c1ba1df8585b90220d3d._comment new file mode 100644 index 000000000..e879441ff --- /dev/null +++ b/doc/forum/git_annex_add_crash_and_subsequent_recovery/comment_1_062d0153a379c1ba1df8585b90220d3d._comment @@ -0,0 +1,18 @@ +[[!comment format=mdwn + username="https://www.google.com/accounts/o8/id?id=AItOawnXybLxkPMYpP3yw4b_I6IdC3cKTD-xEdU" + nickname="Matt" + subject="comment 1" + date="2011-12-06T12:50:27Z" + content=""" +Ah HA! Looks like I found the cause of this. + + [matt@rss01:~/files/matt_ford]0> git annex add mhs + add mhs/Accessing_Web_Manager_V10.pdf ok + .... + add mhs/MAHSC Costing Request Form Dual + Organisations - FINAL v20 Oct 2010.xls git-annex: unknown response from git cat-file refs/heads/git-annex:8d5/ed4/WORM-s568832-m1323164214--MAHSC Costing Request Form Dual missing + +Spot the file name with a newline character in it! This causes the error message above. It seems that the files proceeding this badly named file are sym-linked but not registered. + +Perhaps a bug? +"""]] -- cgit v1.2.3 From 01f7c74d1fb581ed79c46aa16841a36f5e23e945 Mon Sep 17 00:00:00 2001 From: "https://www.google.com/accounts/o8/id?id=AItOawnXybLxkPMYpP3yw4b_I6IdC3cKTD-xEdU" Date: Tue, 6 Dec 2011 13:05:11 +0000 Subject: --- doc/bugs/bad_behaviour_with_file_names_with_newline_in_them.mdwn | 3 +++ 1 file changed, 3 insertions(+) create mode 100644 doc/bugs/bad_behaviour_with_file_names_with_newline_in_them.mdwn diff --git a/doc/bugs/bad_behaviour_with_file_names_with_newline_in_them.mdwn b/doc/bugs/bad_behaviour_with_file_names_with_newline_in_them.mdwn new file mode 100644 index 000000000..5673d8060 --- /dev/null +++ b/doc/bugs/bad_behaviour_with_file_names_with_newline_in_them.mdwn @@ -0,0 +1,3 @@ +Found this out the hard way. See the comment in the below post for what happens. + +[[/forum/git_annex_add_crash_and_subsequent_recovery/]] -- cgit v1.2.3 From c3d3bf332952811b59a2404ee4c568799245472f Mon Sep 17 00:00:00 2001 From: "https://www.google.com/accounts/o8/id?id=AItOawnXybLxkPMYpP3yw4b_I6IdC3cKTD-xEdU" Date: Tue, 6 Dec 2011 13:36:35 +0000 Subject: --- doc/forum/git_pull_remote_git-annex.mdwn | 11 +++++++++++ 1 file changed, 11 insertions(+) create mode 100644 doc/forum/git_pull_remote_git-annex.mdwn diff --git a/doc/forum/git_pull_remote_git-annex.mdwn b/doc/forum/git_pull_remote_git-annex.mdwn new file mode 100644 index 000000000..349610693 --- /dev/null +++ b/doc/forum/git_pull_remote_git-annex.mdwn @@ -0,0 +1,11 @@ +I thought I'd followed the walk through when initially setting up my repos. + +However I find that I have to do the following to sync my annex's. + + git pull remote master + git checkout git-annex + git pull remote git-annex + git checkout master + git annex get . + +Has something gone wrong? I see no mention of syncing git-annex repos in the walk-through... -- cgit v1.2.3 From 340d206efee926f0b39fedddb6afe0dfec43682f Mon Sep 17 00:00:00 2001 From: "http://joey.kitenet.net/" Date: Tue, 6 Dec 2011 16:43:29 +0000 Subject: Added a comment --- ...ent_1_9c245db3518d8b889ecdf5115ad9e053._comment | 36 ++++++++++++++++++++++ 1 file changed, 36 insertions(+) create mode 100644 doc/forum/git_pull_remote_git-annex/comment_1_9c245db3518d8b889ecdf5115ad9e053._comment diff --git a/doc/forum/git_pull_remote_git-annex/comment_1_9c245db3518d8b889ecdf5115ad9e053._comment b/doc/forum/git_pull_remote_git-annex/comment_1_9c245db3518d8b889ecdf5115ad9e053._comment new file mode 100644 index 000000000..989ab9bcd --- /dev/null +++ b/doc/forum/git_pull_remote_git-annex/comment_1_9c245db3518d8b889ecdf5115ad9e053._comment @@ -0,0 +1,36 @@ +[[!comment format=mdwn + username="http://joey.kitenet.net/" + nickname="joey" + subject="comment 1" + date="2011-12-06T16:43:29Z" + content=""" +You're taking a very long and strange way to a place that you can reach as follows: + +
+git pull remote
+git annex get .
+
+ +Which is just as shown in [[walkthrough/getting_file_content]]. + +In particular, \"git pull remote\" first fetches all branches from the remote, including the git-annex branch. +When you say \"git pull remote master\", you're preventing it from fetching the git-annex branch. +If for some reason you want the slightly longer way around, it is: + +
+git pull remote master
+git fetch remote git-annex
+git annex get .
+
+ +Or, eqivilantly but with less network connections: + +
+git fetch remote
+git merge remote/master
+git annex get .
+
+ +BTW, notice that this is all bog-standard git branch pulling stuff, not specific to git-annex in the least. +Consult your extensive and friendly git documentation for details. :) +"""]] -- cgit v1.2.3 From cf5353acb4746cd0c2e736eecd066bd505555af3 Mon Sep 17 00:00:00 2001 From: "http://joey.kitenet.net/" Date: Tue, 6 Dec 2011 16:49:32 +0000 Subject: Added a comment --- .../comment_1_92dfe6e9089c79eb64e2177fb135ef55._comment | 10 ++++++++++ 1 file changed, 10 insertions(+) create mode 100644 doc/bugs/bad_behaviour_with_file_names_with_newline_in_them/comment_1_92dfe6e9089c79eb64e2177fb135ef55._comment diff --git a/doc/bugs/bad_behaviour_with_file_names_with_newline_in_them/comment_1_92dfe6e9089c79eb64e2177fb135ef55._comment b/doc/bugs/bad_behaviour_with_file_names_with_newline_in_them/comment_1_92dfe6e9089c79eb64e2177fb135ef55._comment new file mode 100644 index 000000000..7ff8f8e3d --- /dev/null +++ b/doc/bugs/bad_behaviour_with_file_names_with_newline_in_them/comment_1_92dfe6e9089c79eb64e2177fb135ef55._comment @@ -0,0 +1,10 @@ +[[!comment format=mdwn + username="http://joey.kitenet.net/" + nickname="joey" + subject="comment 1" + date="2011-12-06T16:49:32Z" + content=""" +This only happens with the WORM backend (or possibly with SHA1E if the file's extension has a newline). + +The problem is not the newline in the file, but the newline in the key generated for the file. It's probably best to just disallow such keys being created. +"""]] -- cgit v1.2.3 From 480495beb4a3422f006ee529df807a20cc944727 Mon Sep 17 00:00:00 2001 From: Joey Hess Date: Tue, 6 Dec 2011 13:02:50 -0400 Subject: Prevent key names from containing newlines. There are several places where it's assumed a key can be written on one line. One is in the format of the .git/annex/unused files. The difficult one is that filenames derived from keys are fed into git cat-file --batch, which has a line based input. (And no -z option.) So, for now it's best to block such keys being created. --- Backend.hs | 8 +++++++- Backend/SHA.hs | 10 ++++++---- debian/changelog | 1 + .../bad_behaviour_with_file_names_with_newline_in_them.mdwn | 2 ++ 4 files changed, 16 insertions(+), 5 deletions(-) diff --git a/Backend.hs b/Backend.hs index f7990c22c..136c2eb7a 100644 --- a/Backend.hs +++ b/Backend.hs @@ -64,7 +64,13 @@ genKey' (b:bs) file = do r <- (B.getKey b) file case r of Nothing -> genKey' bs file - Just k -> return $ Just (k, b) + Just k -> return $ Just (makesane k, b) + where + -- keyNames should not contain newline characters. + makesane k = k { keyName = map fixbadchar (keyName k) } + fixbadchar c + | c == '\n' = '_' + | otherwise = c {- Looks up the key and backend corresponding to an annexed file, - by examining what the file symlinks to. -} diff --git a/Backend/SHA.hs b/Backend/SHA.hs index 2ae0cfcf4..7935b6d26 100644 --- a/Backend/SHA.hs +++ b/Backend/SHA.hs @@ -90,10 +90,12 @@ keyValueE size file = keyValue size file >>= maybe (return Nothing) addE , keyBackendName = shaNameE size } naiveextension = takeExtension file - extension = - if length naiveextension > 6 - then "" -- probably not really an extension - else naiveextension + extension + -- long or newline containing extensions are + -- probably not really an extension + | length naiveextension > 6 || + '\n' `elem` naiveextension = "" + | otherwise = naiveextension {- A key's checksum is checked during fsck. -} checkKeyChecksum :: SHASize -> Key -> Annex Bool diff --git a/debian/changelog b/debian/changelog index 7feb8f8ca..a4d3a6492 100644 --- a/debian/changelog +++ b/debian/changelog @@ -2,6 +2,7 @@ git-annex (3.20111204) UNRELEASED; urgency=low * map: Fix a failure to detect a loop when both repositories are local and refer to each other with relative paths. + * Prevent key names from containing newlines. -- Joey Hess Sun, 04 Dec 2011 12:22:37 -0400 diff --git a/doc/bugs/bad_behaviour_with_file_names_with_newline_in_them.mdwn b/doc/bugs/bad_behaviour_with_file_names_with_newline_in_them.mdwn index 5673d8060..530a8da5d 100644 --- a/doc/bugs/bad_behaviour_with_file_names_with_newline_in_them.mdwn +++ b/doc/bugs/bad_behaviour_with_file_names_with_newline_in_them.mdwn @@ -1,3 +1,5 @@ Found this out the hard way. See the comment in the below post for what happens. [[/forum/git_annex_add_crash_and_subsequent_recovery/]] + +> [[fixed|done]] --[[Joey]] -- cgit v1.2.3 From ba74e2069cdf3c69a5c729c53abd73ce11330828 Mon Sep 17 00:00:00 2001 From: "http://joey.kitenet.net/" Date: Tue, 6 Dec 2011 17:08:37 +0000 Subject: Added a comment --- ...omment_2_6fc6be43c488c468a4811cd0a1360225._comment | 19 +++++++++++++++++++ 1 file changed, 19 insertions(+) create mode 100644 doc/forum/git_annex_add_crash_and_subsequent_recovery/comment_2_6fc6be43c488c468a4811cd0a1360225._comment diff --git a/doc/forum/git_annex_add_crash_and_subsequent_recovery/comment_2_6fc6be43c488c468a4811cd0a1360225._comment b/doc/forum/git_annex_add_crash_and_subsequent_recovery/comment_2_6fc6be43c488c468a4811cd0a1360225._comment new file mode 100644 index 000000000..38f2434f4 --- /dev/null +++ b/doc/forum/git_annex_add_crash_and_subsequent_recovery/comment_2_6fc6be43c488c468a4811cd0a1360225._comment @@ -0,0 +1,19 @@ +[[!comment format=mdwn + username="http://joey.kitenet.net/" + nickname="joey" + subject="comment 2" + date="2011-12-06T17:08:37Z" + content=""" +The bug with newlines is now fixed. + +Thought I'd mention how to clean up from interrupting `git annex add`. +When you do that, it doesn't get a chance to `git add` the files it's +added (this is normally done at the end, or sometimes at points in the middle when you're adding a *lot* of files). +Which is also why fsck, whereis, and unannex wouldn't operate on them, since they only deal with files in git. + +So the first step is to manually use `git add` on any symlinks. + +Then, `git commit` as usual. + +At that point, `git annex unannex` would get you back to your starting state. +"""]] -- cgit v1.2.3 From adb1dc65bc9342d76d641fa06ec187f3e2c7b783 Mon Sep 17 00:00:00 2001 From: "http://gebi.myopenid.com/" Date: Tue, 6 Dec 2011 20:25:06 +0000 Subject: --- ...sing_rsync_remotes_with_encryption_enabled.mdwn | 40 ++++++++++++++++++++++ 1 file changed, 40 insertions(+) create mode 100644 doc/bugs/git-annex_losing_rsync_remotes_with_encryption_enabled.mdwn diff --git a/doc/bugs/git-annex_losing_rsync_remotes_with_encryption_enabled.mdwn b/doc/bugs/git-annex_losing_rsync_remotes_with_encryption_enabled.mdwn new file mode 100644 index 000000000..105e24aac --- /dev/null +++ b/doc/bugs/git-annex_losing_rsync_remotes_with_encryption_enabled.mdwn @@ -0,0 +1,40 @@ +Somehow git-annex has again lost a complete rsync remote with encryption enabled... + +Both *remoteserver* and *localserver* are rsync remotes with enabled encryption. +All commands are executed on the git repository on my laptop. +Target of origin is a gitolite repository without annex support (thus the two rsync remotes). + +Is there a way in git-annex to verify that all files fullfill the numcopies, in my case +numcopies=2, and can be read from the remotes their are on? +I thought that *copy* would verify that, but seems not. + + % g a copy --to remoteserver tools + copy tools/md5_sha1_utility.exe (gpg) (checking remoteserver...) ok + copy tools/win32diskimager-RELEASE-0.2-r23-win32.zip (checking remoteserver...) ok + + % g a copy --to localserver tools + copy tools/md5_sha1_utility.exe (gpg) (checking localserver...) ok + copy tools/win32diskimager-RELEASE-0.2-r23-win32.zip (checking localserver...) ok + + % g a drop tools + drop tools/md5_sha1_utility.exe (gpg) (checking localserver...) (checking remoteserver...) (unsafe) + Could only verify the existence of 1 out of 2 necessary copies + + Try making some of these repositories available: + 718a9b5c-1b4a-11e1-8211-6f094f20e050 -- remoteserver (remote backupserver) + + (Use --force to override this check, or adjust annex.numcopies.) + failed + drop tools/win32diskimager-RELEASE-0.2-r23-win32.zip (checking localserver...) (checking remoteserver...) (unsafe) + Could only verify the existence of 1 out of 2 necessary copies + + Try making some of these repositories available: + 718a9b5c-1b4a-11e1-8211-6f094f20e050 -- remoteserver (remote backupserver) + + (Use --force to override this check, or adjust annex.numcopies.) + failed + git-annex: drop: 2 failed + + % g a fsck tools + fsck tools/md5_sha1_utility.exe (checksum...) ok + fsck tools/win32diskimager-RELEASE-0.2-r23-win32.zip (checksum...) ok -- cgit v1.2.3 From 6f221f1fc3647de5b28179574125f6caa211fe0c Mon Sep 17 00:00:00 2001 From: Joey Hess Date: Tue, 6 Dec 2011 17:06:08 -0400 Subject: response --- ...nnex_losing_rsync_remotes_with_encryption_enabled.mdwn | 15 +++++++++++++++ 1 file changed, 15 insertions(+) diff --git a/doc/bugs/git-annex_losing_rsync_remotes_with_encryption_enabled.mdwn b/doc/bugs/git-annex_losing_rsync_remotes_with_encryption_enabled.mdwn index 105e24aac..914b45d18 100644 --- a/doc/bugs/git-annex_losing_rsync_remotes_with_encryption_enabled.mdwn +++ b/doc/bugs/git-annex_losing_rsync_remotes_with_encryption_enabled.mdwn @@ -1,5 +1,9 @@ Somehow git-annex has again lost a complete rsync remote with encryption enabled... +> "once again" ? When did it do it before? + +> "lost" ? How is the remote lost? + Both *remoteserver* and *localserver* are rsync remotes with enabled encryption. All commands are executed on the git repository on my laptop. Target of origin is a gitolite repository without annex support (thus the two rsync remotes). @@ -38,3 +42,14 @@ I thought that *copy* would verify that, but seems not. % g a fsck tools fsck tools/md5_sha1_utility.exe (checksum...) ok fsck tools/win32diskimager-RELEASE-0.2-r23-win32.zip (checksum...) ok + +> Copy does do an explicit check that the content is present on remoteserver, +> and based on the above, the content was found to be already there, +> which is why it did not copy it again. +> +> Drop does an indentical check that the content is present, and +> since it failed to find it, I am left thinking something must have +> happened to the remove in between the copy and the drop to cause the +> content to go away. +> +> What happens if you copy the data to remoteserver again? --[[Joey]] -- cgit v1.2.3 From 42c81e5dc00988964c1ea264c2a13ca2ce95c78a Mon Sep 17 00:00:00 2001 From: "https://www.google.com/accounts/o8/id?id=AItOawnXybLxkPMYpP3yw4b_I6IdC3cKTD-xEdU" Date: Tue, 6 Dec 2011 23:23:31 +0000 Subject: Added a comment --- .../comment_2_0f7f4a311b0ec1d89613e80847e69b42._comment | 14 ++++++++++++++ 1 file changed, 14 insertions(+) create mode 100644 doc/forum/git_pull_remote_git-annex/comment_2_0f7f4a311b0ec1d89613e80847e69b42._comment diff --git a/doc/forum/git_pull_remote_git-annex/comment_2_0f7f4a311b0ec1d89613e80847e69b42._comment b/doc/forum/git_pull_remote_git-annex/comment_2_0f7f4a311b0ec1d89613e80847e69b42._comment new file mode 100644 index 000000000..198f95cee --- /dev/null +++ b/doc/forum/git_pull_remote_git-annex/comment_2_0f7f4a311b0ec1d89613e80847e69b42._comment @@ -0,0 +1,14 @@ +[[!comment format=mdwn + username="https://www.google.com/accounts/o8/id?id=AItOawnXybLxkPMYpP3yw4b_I6IdC3cKTD-xEdU" + nickname="Matt" + subject="comment 2" + date="2011-12-06T23:23:29Z" + content=""" +Doh! Total brain melt on my part. Thanks for the additional info. Not taking my time and reading things properly - kept assuming that the full remote pull failed due to the warning: + + You asked to pull from the remote 'rss', but did not specify + a branch. Because this is not the default configured remote + for your current branch, you must specify a branch on the command line. + +Rookie mistake indeed. +"""]] -- cgit v1.2.3 From ee26ebe6eb63bc58156bfc418d7692a912390a6a Mon Sep 17 00:00:00 2001 From: "http://gebi.myopenid.com/" Date: Wed, 7 Dec 2011 03:14:24 +0000 Subject: --- doc/users/gebi.mdwn | 1 + 1 file changed, 1 insertion(+) create mode 100644 doc/users/gebi.mdwn diff --git a/doc/users/gebi.mdwn b/doc/users/gebi.mdwn new file mode 100644 index 000000000..121bedbdd --- /dev/null +++ b/doc/users/gebi.mdwn @@ -0,0 +1 @@ +Michael Gebetsroither -- cgit v1.2.3 From 919d58667a65df2ae702ba438dd53c8f202dee8b Mon Sep 17 00:00:00 2001 From: "http://gebi.myopenid.com/" Date: Wed, 7 Dec 2011 03:20:43 +0000 Subject: --- .../git-annex_losing_rsync_remotes_with_encryption_enabled.mdwn | 9 +++++++++ 1 file changed, 9 insertions(+) diff --git a/doc/bugs/git-annex_losing_rsync_remotes_with_encryption_enabled.mdwn b/doc/bugs/git-annex_losing_rsync_remotes_with_encryption_enabled.mdwn index 914b45d18..968f296dd 100644 --- a/doc/bugs/git-annex_losing_rsync_remotes_with_encryption_enabled.mdwn +++ b/doc/bugs/git-annex_losing_rsync_remotes_with_encryption_enabled.mdwn @@ -2,8 +2,15 @@ Somehow git-annex has again lost a complete rsync remote with encryption enabled > "once again" ? When did it do it before? +It's the second time i uploaded all the files to an encrypted rsync remote and git-annex is not able to find it anymore. --[[gebi]] + > "lost" ? How is the remote lost? +git-annex is not able to find any files on the encrypted rsync remote anymore. +Copy does not copy the content again but drop doesn't find it, thus it's somehow "lost" and in an strange state. +I've also had the state where the content was already on the remote side but git-annex copy would copy it again, +ignoring all the data on the remote side. --[[gebi]] + Both *remoteserver* and *localserver* are rsync remotes with enabled encryption. All commands are executed on the git repository on my laptop. Target of origin is a gitolite repository without annex support (thus the two rsync remotes). @@ -53,3 +60,5 @@ I thought that *copy* would verify that, but seems not. > content to go away. > > What happens if you copy the data to remoteserver again? --[[Joey]] + +The commands above are executed within a few seconds and completely repeatable. --[[gebi]] -- cgit v1.2.3 From 51b7b828700438523d56f44d08e07679ef3e849f Mon Sep 17 00:00:00 2001 From: "http://gebi.myopenid.com/" Date: Wed, 7 Dec 2011 05:48:47 +0000 Subject: --- doc/bugs/git-annex_losing_rsync_remotes_with_encryption_enabled.mdwn | 2 ++ 1 file changed, 2 insertions(+) diff --git a/doc/bugs/git-annex_losing_rsync_remotes_with_encryption_enabled.mdwn b/doc/bugs/git-annex_losing_rsync_remotes_with_encryption_enabled.mdwn index 968f296dd..d727d6917 100644 --- a/doc/bugs/git-annex_losing_rsync_remotes_with_encryption_enabled.mdwn +++ b/doc/bugs/git-annex_losing_rsync_remotes_with_encryption_enabled.mdwn @@ -1,5 +1,7 @@ Somehow git-annex has again lost a complete rsync remote with encryption enabled... +git-annex version was 3.20111111 + > "once again" ? When did it do it before? It's the second time i uploaded all the files to an encrypted rsync remote and git-annex is not able to find it anymore. --[[gebi]] -- cgit v1.2.3 From c929f9df0f45db80445f9247f644bb87e8ca0a35 Mon Sep 17 00:00:00 2001 From: "https://www.google.com/accounts/o8/id?id=AItOawnXybLxkPMYpP3yw4b_I6IdC3cKTD-xEdU" Date: Wed, 7 Dec 2011 07:39:16 +0000 Subject: Added a comment --- .../comment_3_45efaaf27d9b580c4c75cbcdc4f65b64._comment | 10 ++++++++++ 1 file changed, 10 insertions(+) create mode 100644 doc/forum/git_annex_add_crash_and_subsequent_recovery/comment_3_45efaaf27d9b580c4c75cbcdc4f65b64._comment diff --git a/doc/forum/git_annex_add_crash_and_subsequent_recovery/comment_3_45efaaf27d9b580c4c75cbcdc4f65b64._comment b/doc/forum/git_annex_add_crash_and_subsequent_recovery/comment_3_45efaaf27d9b580c4c75cbcdc4f65b64._comment new file mode 100644 index 000000000..b58f81c5b --- /dev/null +++ b/doc/forum/git_annex_add_crash_and_subsequent_recovery/comment_3_45efaaf27d9b580c4c75cbcdc4f65b64._comment @@ -0,0 +1,10 @@ +[[!comment format=mdwn + username="https://www.google.com/accounts/o8/id?id=AItOawnXybLxkPMYpP3yw4b_I6IdC3cKTD-xEdU" + nickname="Matt" + subject="comment 3" + date="2011-12-07T07:39:15Z" + content=""" +Ah - very good to know that recovery is easier than the method I used. + +I wonder if it could be made a feature to automatically and safely recover/resume from an interrupted `git add`? +"""]] -- cgit v1.2.3 From 5926be6f307c03f5ed1a02dffff5dce0d987e689 Mon Sep 17 00:00:00 2001 From: Joey Hess Date: Wed, 7 Dec 2011 12:36:15 -0400 Subject: response --- ...sing_rsync_remotes_with_encryption_enabled.mdwn | 22 +++++++++++++++++----- 1 file changed, 17 insertions(+), 5 deletions(-) diff --git a/doc/bugs/git-annex_losing_rsync_remotes_with_encryption_enabled.mdwn b/doc/bugs/git-annex_losing_rsync_remotes_with_encryption_enabled.mdwn index d727d6917..93588b49c 100644 --- a/doc/bugs/git-annex_losing_rsync_remotes_with_encryption_enabled.mdwn +++ b/doc/bugs/git-annex_losing_rsync_remotes_with_encryption_enabled.mdwn @@ -4,14 +4,14 @@ git-annex version was 3.20111111 > "once again" ? When did it do it before? -It's the second time i uploaded all the files to an encrypted rsync remote and git-annex is not able to find it anymore. --[[gebi]] +>> It's the second time i uploaded all the files to an encrypted rsync remote and git-annex is not able to find it anymore. --[[gebi]] > "lost" ? How is the remote lost? -git-annex is not able to find any files on the encrypted rsync remote anymore. -Copy does not copy the content again but drop doesn't find it, thus it's somehow "lost" and in an strange state. -I've also had the state where the content was already on the remote side but git-annex copy would copy it again, -ignoring all the data on the remote side. --[[gebi]] +>> git-annex is not able to find any files on the encrypted rsync remote anymore. +>> Copy does not copy the content again but drop doesn't find it, thus it's somehow "lost" and in an strange state. +>> I've also had the state where the content was already on the remote side but git-annex copy would copy it again, +>> ignoring all the data on the remote side. --[[gebi]] Both *remoteserver* and *localserver* are rsync remotes with enabled encryption. All commands are executed on the git repository on my laptop. @@ -64,3 +64,15 @@ I thought that *copy* would verify that, but seems not. > What happens if you copy the data to remoteserver again? --[[Joey]] The commands above are executed within a few seconds and completely repeatable. --[[gebi]] + +> In that case, why don't you run the commands with `-d` to see the actual +> rsync command it's running to check if the content is present. +> Then you can try repeatedly running the command by hand and see why it +> sometimes succeeds and sometimes fail. +> +> The command will be something like this: +> `rsync --quiet hostname:/dir/file 2>/dev/null` +> +> The exit status is what's used to see if content is present -- and +> currently any failure even a failure to connect is taken to mean it's not +> present. --[[Joey]] -- cgit v1.2.3 From d2ff311a3496fc498ad540b194767853ffdc1fc0 Mon Sep 17 00:00:00 2001 From: Joey Hess Date: Wed, 7 Dec 2011 13:17:00 -0400 Subject: change footer --- doc/git-annex-shell.mdwn | 2 +- doc/git-annex.mdwn | 2 +- doc/git-union-merge.mdwn | 2 +- 3 files changed, 3 insertions(+), 3 deletions(-) diff --git a/doc/git-annex-shell.mdwn b/doc/git-annex-shell.mdwn index fc5bc6c2d..7a65f1077 100644 --- a/doc/git-annex-shell.mdwn +++ b/doc/git-annex-shell.mdwn @@ -78,4 +78,4 @@ Joey Hess -Warning: this page is automatically made into a man page via [mdwn2man](http://git.ikiwiki.info/?p=ikiwiki;a=blob;f=mdwn2man;hb=HEAD). Edit with care +Warning: Automatically converted into a man page by mdwn2man. Edit with care diff --git a/doc/git-annex.mdwn b/doc/git-annex.mdwn index 9df5c3c6d..08def0d62 100644 --- a/doc/git-annex.mdwn +++ b/doc/git-annex.mdwn @@ -623,4 +623,4 @@ Joey Hess -Warning: this page is automatically made into a man page via [mdwn2man](http://git.ikiwiki.info/?p=ikiwiki;a=blob;f=mdwn2man;hb=HEAD). Edit with care +Warning: Automatically converted into a man page by mdwn2man. Edit with care diff --git a/doc/git-union-merge.mdwn b/doc/git-union-merge.mdwn index ed1778910..8e3c34f8f 100644 --- a/doc/git-union-merge.mdwn +++ b/doc/git-union-merge.mdwn @@ -35,4 +35,4 @@ Joey Hess -Warning: this page is automatically made into a man page via [mdwn2man](http://git.ikiwiki.info/?p=ikiwiki;a=blob;f=mdwn2man;hb=HEAD). Edit with care +Warning: Automatically converted into a man page by mdwn2man. Edit with care -- cgit v1.2.3 From 8047bba5b92a6f77ef305c1a74e59b5dacbcc9a2 Mon Sep 17 00:00:00 2001 From: Joey Hess Date: Wed, 7 Dec 2011 16:53:53 -0400 Subject: add: If interrupted, add can leave files converted to symlinks but not yet added to git. Running the add again will now clean up this situtation. --- Command.hs | 8 ++++---- Command/Add.hs | 20 ++++++++++++++------ debian/changelog | 2 ++ 3 files changed, 20 insertions(+), 10 deletions(-) diff --git a/Command.hs b/Command.hs index 4d5bbeb36..0cd0bf491 100644 --- a/Command.hs +++ b/Command.hs @@ -13,7 +13,7 @@ module Command ( prepCommand, doCommand, whenAnnexed, - notAnnexed, + ifAnnexed, notBareRepo, isBareRepo, autoCopies, @@ -71,10 +71,10 @@ doCommand = start {- Modifies an action to only act on files that are already annexed, - and passes the key and backend on to it. -} whenAnnexed :: (FilePath -> (Key, Backend Annex) -> Annex (Maybe a)) -> FilePath -> Annex (Maybe a) -whenAnnexed a file = maybe (return Nothing) (a file) =<< Backend.lookupFile file +whenAnnexed a file = ifAnnexed file (a file) (return Nothing) -notAnnexed :: FilePath -> Annex (Maybe a) -> Annex (Maybe a) -notAnnexed file a = maybe a (const $ return Nothing) =<< Backend.lookupFile file +ifAnnexed :: FilePath -> ((Key, Backend Annex) -> Annex a) -> (Annex a) -> Annex a +ifAnnexed file yes no = maybe no yes =<< Backend.lookupFile file notBareRepo :: Annex a -> Annex a notBareRepo a = do diff --git a/Command/Add.hs b/Command/Add.hs index 9fdbdcaa6..9410601b8 100644 --- a/Command/Add.hs +++ b/Command/Add.hs @@ -29,13 +29,21 @@ seek = [withFilesNotInGit start, withFilesUnlocked start] - moving it into the annex directory and setting up the symlink pointing - to its content. -} start :: BackendFile -> CommandStart -start p@(_, file) = notBareRepo $ notAnnexed file $ do - s <- liftIO $ getSymbolicLinkStatus file - if isSymbolicLink s || not (isRegularFile s) - then stop - else do +start p@(_, file) = notBareRepo $ ifAnnexed file fixup add + where + add = do + s <- liftIO $ getSymbolicLinkStatus file + if isSymbolicLink s || not (isRegularFile s) + then stop + else do + showStart "add" file + next $ perform p + fixup (key, _) = do + -- fixup from an interrupted add; the symlink + -- is present but not yet added to git showStart "add" file - next $ perform p + liftIO $ removeFile file + next $ next $ cleanup file key =<< inAnnex key perform :: BackendFile -> CommandPerform perform (backend, file) = Backend.genKey file backend >>= go diff --git a/debian/changelog b/debian/changelog index a4d3a6492..588ea6d5b 100644 --- a/debian/changelog +++ b/debian/changelog @@ -3,6 +3,8 @@ git-annex (3.20111204) UNRELEASED; urgency=low * map: Fix a failure to detect a loop when both repositories are local and refer to each other with relative paths. * Prevent key names from containing newlines. + * add: If interrupted, add can leave files converted to symlinks but not + yet added to git. Running the add again will now clean up this situtation. -- Joey Hess Sun, 04 Dec 2011 12:22:37 -0400 -- cgit v1.2.3 From 2515bb65209cd42c0291be949eaced4365babf8f Mon Sep 17 00:00:00 2001 From: "http://joey.kitenet.net/" Date: Wed, 7 Dec 2011 20:54:51 +0000 Subject: Added a comment --- .../comment_4_c560eae40867512b0af2cbef161fc8ac._comment | 8 ++++++++ 1 file changed, 8 insertions(+) create mode 100644 doc/forum/git_annex_add_crash_and_subsequent_recovery/comment_4_c560eae40867512b0af2cbef161fc8ac._comment diff --git a/doc/forum/git_annex_add_crash_and_subsequent_recovery/comment_4_c560eae40867512b0af2cbef161fc8ac._comment b/doc/forum/git_annex_add_crash_and_subsequent_recovery/comment_4_c560eae40867512b0af2cbef161fc8ac._comment new file mode 100644 index 000000000..8fca16cad --- /dev/null +++ b/doc/forum/git_annex_add_crash_and_subsequent_recovery/comment_4_c560eae40867512b0af2cbef161fc8ac._comment @@ -0,0 +1,8 @@ +[[!comment format=mdwn + username="http://joey.kitenet.net/" + nickname="joey" + subject="comment 4" + date="2011-12-07T20:54:51Z" + content=""" +Good idea! I've made `git annex add` recover when ran a second time. +"""]] -- cgit v1.2.3 From 2568beee07928d1bd462d631372881f146b190ee Mon Sep 17 00:00:00 2001 From: "http://gebi.myopenid.com/" Date: Thu, 8 Dec 2011 14:24:08 +0000 Subject: --- ...x_losing_rsync_remotes_with_encryption_enabled.mdwn | 18 +++++++++++++++++- 1 file changed, 17 insertions(+), 1 deletion(-) diff --git a/doc/bugs/git-annex_losing_rsync_remotes_with_encryption_enabled.mdwn b/doc/bugs/git-annex_losing_rsync_remotes_with_encryption_enabled.mdwn index 93588b49c..af1fed2ee 100644 --- a/doc/bugs/git-annex_losing_rsync_remotes_with_encryption_enabled.mdwn +++ b/doc/bugs/git-annex_losing_rsync_remotes_with_encryption_enabled.mdwn @@ -69,10 +69,26 @@ The commands above are executed within a few seconds and completely repeatable. > rsync command it's running to check if the content is present. > Then you can try repeatedly running the command by hand and see why it > sometimes succeeds and sometimes fail. -> + +The commands fail and succeed consistently, not either or. +git annex copy succeeds consistently with not copying the content to remote because it checks and it's already there. + +git annex drop fails consistently with error because content is missing on the exact same remote git annex copy checks +and thinks the content is there. --[[gebi]] + > The command will be something like this: > `rsync --quiet hostname:/dir/file 2>/dev/null` > > The exit status is what's used to see if content is present -- and > currently any failure even a failure to connect is taken to mean it's not > present. --[[Joey]] + +hm... thats interesting, git annex drop and git annex copy check for different hashes on the same file at the same remote... + +git annex drop -d tools/md5_sha1_utility.exe +> Running: sh ["-c","rsync --quiet 'REMOVED_HOST:annex/work/JF/z7/'\"'\"'GPGHMACSHA1--7ffb3840f0e37aee964352e98808403655e8473a/GPGHMACSHA1--7ffb3840f0e37aee964352e98808403655e8473a'\"'\"'' 2>/dev/null"] + +git annex copy --to remoteserver -d tools/md5_sha1_utility.exe +> Running: sh ["-c","rsync --quiet 'REMOVED_HOST:annex/work/1F/PQ/'\"'\"'GPGHMACSHA1--ff075e57f649300c5698e346be74fb6e22d70e35/GPGHMACSHA1--ff075e57f649300c5698e346be74fb6e22d70e35'\"'\"'' 2>/dev/null"] + +And yes, only the hash *annex copy* is checking for exists on the remote side. --[[gebi]] -- cgit v1.2.3 From e3f1568e0ff7dc872f3782115c74b9e7d8c291b2 Mon Sep 17 00:00:00 2001 From: Joey Hess Date: Thu, 8 Dec 2011 16:01:46 -0400 Subject: Fix caching of decrypted ciphers, which failed when drop had to check multiple different encrypted special remotes. --- Annex.hs | 5 +++-- Remote/Helper/Encryptable.hs | 23 ++++++++++++---------- Types/Crypto.hs | 2 ++ debian/changelog | 2 ++ ...sing_rsync_remotes_with_encryption_enabled.mdwn | 7 +++++++ 5 files changed, 27 insertions(+), 12 deletions(-) diff --git a/Annex.hs b/Annex.hs index 6d245a92d..d8d9c63b4 100644 --- a/Annex.hs +++ b/Annex.hs @@ -36,6 +36,7 @@ import Types.BranchState import Types.TrustLevel import Types.UUID import qualified Utility.Matcher +import qualified Data.Map as M -- git-annex's monad newtype Annex a = Annex { runAnnex :: StateT AnnexState IO a } @@ -70,7 +71,7 @@ data AnnexState = AnnexState , limit :: Either [Utility.Matcher.Token (FilePath -> Annex Bool)] (Utility.Matcher.Matcher (FilePath -> Annex Bool)) , forcetrust :: [(UUID, TrustLevel)] , trustmap :: Maybe TrustMap - , cipher :: Maybe Cipher + , ciphers :: M.Map EncryptedCipher Cipher } newState :: Git.Repo -> AnnexState @@ -93,7 +94,7 @@ newState gitrepo = AnnexState , limit = Left [] , forcetrust = [] , trustmap = Nothing - , cipher = Nothing + , ciphers = M.empty } {- Create and returns an Annex state object for the specified git repo. -} diff --git a/Remote/Helper/Encryptable.hs b/Remote/Helper/Encryptable.hs index 85d269a21..99f48fe7b 100644 --- a/Remote/Helper/Encryptable.hs +++ b/Remote/Helper/Encryptable.hs @@ -61,19 +61,22 @@ encryptableRemote c storeKeyEncrypted retrieveKeyFileEncrypted r = withkey a k = cip k >>= maybe (a k) (a . snd) cip = cipherKey c -{- Gets encryption Cipher. The decrypted Cipher is cached in the Annex +{- Gets encryption Cipher. The decrypted Ciphers are cached in the Annex - state. -} remoteCipher :: RemoteConfig -> Annex (Maybe Cipher) -remoteCipher c = maybe expensive cached =<< Annex.getState Annex.cipher +remoteCipher c = go $ extractCipher c where - cached cipher = return $ Just cipher - expensive = case extractCipher c of - Nothing -> return Nothing - Just encipher -> do - showNote "gpg" - cipher <- liftIO $ decryptCipher c encipher - Annex.changeState (\s -> s { Annex.cipher = Just cipher }) - return $ Just cipher + go Nothing = return Nothing + go (Just encipher) = do + cache <- Annex.getState Annex.ciphers + case M.lookup encipher cache of + Just cipher -> return $ Just cipher + Nothing -> decrypt encipher cache + decrypt encipher cache = do + showNote "gpg" + cipher <- liftIO $ decryptCipher c encipher + Annex.changeState (\s -> s { Annex.ciphers = M.insert encipher cipher cache }) + return $ Just cipher {- Gets encryption Cipher, and encrypted version of Key. -} cipherKey :: Maybe RemoteConfig -> Key -> Annex (Maybe (Cipher, Key)) diff --git a/Types/Crypto.hs b/Types/Crypto.hs index a9d3dddc5..29a4cd099 100644 --- a/Types/Crypto.hs +++ b/Types/Crypto.hs @@ -11,5 +11,7 @@ module Types.Crypto where newtype Cipher = Cipher String data EncryptedCipher = EncryptedCipher String KeyIds + deriving (Ord, Eq) newtype KeyIds = KeyIds [String] + deriving (Ord, Eq) diff --git a/debian/changelog b/debian/changelog index 588ea6d5b..04e911672 100644 --- a/debian/changelog +++ b/debian/changelog @@ -5,6 +5,8 @@ git-annex (3.20111204) UNRELEASED; urgency=low * Prevent key names from containing newlines. * add: If interrupted, add can leave files converted to symlinks but not yet added to git. Running the add again will now clean up this situtation. + * Fix caching of decrypted ciphers, which failed when drop had to check + multiple different encrypted special remotes. -- Joey Hess Sun, 04 Dec 2011 12:22:37 -0400 diff --git a/doc/bugs/git-annex_losing_rsync_remotes_with_encryption_enabled.mdwn b/doc/bugs/git-annex_losing_rsync_remotes_with_encryption_enabled.mdwn index af1fed2ee..0dad8856e 100644 --- a/doc/bugs/git-annex_losing_rsync_remotes_with_encryption_enabled.mdwn +++ b/doc/bugs/git-annex_losing_rsync_remotes_with_encryption_enabled.mdwn @@ -92,3 +92,10 @@ git annex copy --to remoteserver -d tools/md5_sha1_utility.exe > Running: sh ["-c","rsync --quiet 'REMOVED_HOST:annex/work/1F/PQ/'\"'\"'GPGHMACSHA1--ff075e57f649300c5698e346be74fb6e22d70e35/GPGHMACSHA1--ff075e57f649300c5698e346be74fb6e22d70e35'\"'\"'' 2>/dev/null"] And yes, only the hash *annex copy* is checking for exists on the remote side. --[[gebi]] + +> Ok, this is due to too aggressive caching of the decrypted cipher +> for a remote. When dopping, it decrypts localserver's cipher, +> caches it, and then when checking remoteserver it says hey, +> here's an already decrypted cipher -- it must be the right one! +> +> Problem reproduced here, and fixed. [[done]] --[[Joey]] -- cgit v1.2.3 From e0e40964ab2d0a305fe2503215f306e1228aac24 Mon Sep 17 00:00:00 2001 From: "https://www.google.com/accounts/o8/id?id=AItOawnXybLxkPMYpP3yw4b_I6IdC3cKTD-xEdU" Date: Thu, 8 Dec 2011 22:42:29 +0000 Subject: --- ..._with_no_fixed_hostname_and_optimising_ssh.mdwn | 72 ++++++++++++++++++++++ 1 file changed, 72 insertions(+) create mode 100644 doc/tips/using_git_annex_with_no_fixed_hostname_and_optimising_ssh.mdwn diff --git a/doc/tips/using_git_annex_with_no_fixed_hostname_and_optimising_ssh.mdwn b/doc/tips/using_git_annex_with_no_fixed_hostname_and_optimising_ssh.mdwn new file mode 100644 index 000000000..db108f48b --- /dev/null +++ b/doc/tips/using_git_annex_with_no_fixed_hostname_and_optimising_ssh.mdwn @@ -0,0 +1,72 @@ +## Intro + +This tip is based on my experience of using `git annex` with my out-and-about netbook which hits many different wifi networks and has no fixed home or address. + +I'm not using a bare repository that allows pushing (an alternate way to sort the issue) nor did I fancy allowing `git push` to run against my desktop checked out repository. + +None of this stuff is really `git annex` specific but is useful to know... + +## Dealing with no fixed hostname + +Essentially set up two repos as per the [[walkthrough]]. + +Desktop as follows: + + cd ~/annex + git init + git annex init "desktop" + +And the laptop like this + + git clone ssh://desktop/annex + git init + git annex init "laptop" + +Now we want to add the the repos as remotes of each other. + +For the laptop it is easy: + + git remote add desktop ssh://desktop/~/annex + +However for the desktop to add an ever changing laptops hostname it's a little tricky. We make use of remote SSH tunnels to do this. Essentially we have the laptop which always knows it's own name and address and that of the desktop create a tunnel starting on an arbitrary port on the desktop and heading back to itself on it's own SSH server port (22). + +To do this make part of your laptops SSH config look like this: + + Host desktop + User matt + HostName desktop.example.org + RemoteForward 2222 localhost:22 + +Now on the desktop to connect over the tunnel to the laptops SSH port you need this: + + Host laptop + User matt + HostName localhost + port 2222 + +So to add the desktops remote: + +a) From the laptop ensure the tunnel is up + + ssh desktop + +b) From the desktop add the remote + + git remote add laptop ssh://laptop/~/annex + +So now you can work on the train, pop on the wifi at work upon arrival and sync up with a `git pull && git annex get`. + +An alternate solution maybe to use direct tunnels over Openvpn. + +## Optimising SSH + +Running a `git annex get .`, at least in the version I have, creates a new SSH connection for every file transfer (maybe this is a feature request?) + +Lot's of new small files in an _annex_ cause lot's of connections to be made quickly: this is an relatively expensive overhead and is enough for connection limiting to start in my case. The process can be made much faster by using SSH's connection sharing capabilities. An SSH config like this should do it: + + # Global Settings + ControlMaster auto + ControlPersist 30 + ControlPath ~/.ssh/master-%r@%h:%p + +This will create a master connection for sharing if one isn't present, maintain it for 30 seconds after closing down the connection (just-in-cases') and automatically use the master connection for subsequent connections. Wins all round! -- cgit v1.2.3 From b3ac4af6b0722db35b8295d1131b41c827cdd708 Mon Sep 17 00:00:00 2001 From: "https://www.google.com/accounts/o8/id?id=AItOawnXybLxkPMYpP3yw4b_I6IdC3cKTD-xEdU" Date: Thu, 8 Dec 2011 22:50:45 +0000 Subject: --- ...nnex_with_no_fixed_hostname_and_optimising_ssh.mdwn | 18 +++++++++--------- 1 file changed, 9 insertions(+), 9 deletions(-) diff --git a/doc/tips/using_git_annex_with_no_fixed_hostname_and_optimising_ssh.mdwn b/doc/tips/using_git_annex_with_no_fixed_hostname_and_optimising_ssh.mdwn index db108f48b..3f02ea47d 100644 --- a/doc/tips/using_git_annex_with_no_fixed_hostname_and_optimising_ssh.mdwn +++ b/doc/tips/using_git_annex_with_no_fixed_hostname_and_optimising_ssh.mdwn @@ -2,9 +2,9 @@ This tip is based on my experience of using `git annex` with my out-and-about netbook which hits many different wifi networks and has no fixed home or address. -I'm not using a bare repository that allows pushing (an alternate way to sort the issue) nor did I fancy allowing `git push` to run against my desktop checked out repository. +I'm not using a bare repository that allows pushing (an alternative solution) nor do I fancy allowing `git push` to run against my desktop checked out repository (perhaps I worry over nothing?) -None of this stuff is really `git annex` specific but is useful to know... +None of this is really `git annex` specific but I think it is useful to know... ## Dealing with no fixed hostname @@ -28,23 +28,23 @@ For the laptop it is easy: git remote add desktop ssh://desktop/~/annex -However for the desktop to add an ever changing laptops hostname it's a little tricky. We make use of remote SSH tunnels to do this. Essentially we have the laptop which always knows it's own name and address and that of the desktop create a tunnel starting on an arbitrary port on the desktop and heading back to itself on it's own SSH server port (22). +However for the desktop to add an ever changing laptops hostname it's a little tricky. We make use of remote SSH tunnels to do this. Essentially we have the laptop (which always knows it's own name and address and knows the address of the desktop) create a tunnel starting on an arbitrary port at the desktop and heads back to the laptop on it's own SSH server port (22). -To do this make part of your laptops SSH config look like this: +To do this make part of your laptop's SSH config look like this: Host desktop User matt HostName desktop.example.org RemoteForward 2222 localhost:22 -Now on the desktop to connect over the tunnel to the laptops SSH port you need this: +Now on the desktop to connect over the tunnel to the laptop's SSH port you need this: Host laptop User matt HostName localhost port 2222 -So to add the desktops remote: +So to add the desktop's remote: a) From the laptop ensure the tunnel is up @@ -54,13 +54,13 @@ b) From the desktop add the remote git remote add laptop ssh://laptop/~/annex -So now you can work on the train, pop on the wifi at work upon arrival and sync up with a `git pull && git annex get`. +So now you can work on the train, pop on the wifi at work upon arrival, and sync up with a `git pull && git annex get`. -An alternate solution maybe to use direct tunnels over Openvpn. +An alternative solution may be to use direct tunnels over Openvpn. ## Optimising SSH -Running a `git annex get .`, at least in the version I have, creates a new SSH connection for every file transfer (maybe this is a feature request?) +Running a `git annex get .`, at least in the version I have, creates a new SSH connection for every file transfer (maybe this should be a feature request?) Lot's of new small files in an _annex_ cause lot's of connections to be made quickly: this is an relatively expensive overhead and is enough for connection limiting to start in my case. The process can be made much faster by using SSH's connection sharing capabilities. An SSH config like this should do it: -- cgit v1.2.3 From d64132a43ae176e8a1353d5463c5387a93da9ad7 Mon Sep 17 00:00:00 2001 From: Joey Hess Date: Fri, 9 Dec 2011 01:57:13 -0400 Subject: hslint --- Annex/Branch.hs | 2 +- Annex/Content.hs | 2 +- Annex/Ssh.hs | 2 +- CmdLine.hs | 2 +- Command.hs | 2 +- Command/Map.hs | 2 +- Command/Status.hs | 5 ++--- Command/Unused.hs | 5 ++--- Git/UnionMerge.hs | 2 +- Logs/UUID.hs | 6 +++--- Remote/Git.hs | 2 +- Upgrade/V2.hs | 2 +- Utility/DataUnits.hs | 2 +- Utility/Directory.hs | 13 ++++++------- configure.hs | 2 +- test.hs | 3 +-- 16 files changed, 25 insertions(+), 29 deletions(-) diff --git a/Annex/Branch.hs b/Annex/Branch.hs index a89066881..8f0a09fd6 100644 --- a/Annex/Branch.hs +++ b/Annex/Branch.hs @@ -141,7 +141,7 @@ update = onceonly $ do let merge_desc = if null branches then "update" else "merging " ++ - (unwords $ map (show . Git.refDescribe) branches) ++ + unwords (map (show . Git.refDescribe) branches) ++ " into " ++ show name unless (null branches) $ do showSideAction merge_desc diff --git a/Annex/Content.hs b/Annex/Content.hs index 90bde2975..3f1db37b5 100644 --- a/Annex/Content.hs +++ b/Annex/Content.hs @@ -43,7 +43,7 @@ import Annex.Exception {- Checks if a given key's content is currently present. -} inAnnex :: Key -> Annex Bool -inAnnex = inAnnex' $ doesFileExist +inAnnex = inAnnex' doesFileExist inAnnex' :: (FilePath -> IO a) -> Key -> Annex a inAnnex' a key = do whenM (fromRepo Git.repoIsUrl) $ diff --git a/Annex/Ssh.hs b/Annex/Ssh.hs index f8cd5d9bc..6893f94ef 100644 --- a/Annex/Ssh.hs +++ b/Annex/Ssh.hs @@ -43,7 +43,7 @@ git_annex_shell r command params shellcmd = "git-annex-shell" shellopts = Param command : File dir : params sshcmd uuid = unwords $ - shellcmd : (map shellEscape $ toCommand shellopts) ++ + shellcmd : map shellEscape (toCommand shellopts) ++ uuidcheck uuid uuidcheck NoUUID = [] uuidcheck (UUID u) = ["--uuid", u] diff --git a/CmdLine.hs b/CmdLine.hs index 78f46a2e3..672969c30 100644 --- a/CmdLine.hs +++ b/CmdLine.hs @@ -32,7 +32,7 @@ dispatch args cmds options header getgitrepo = do setupConsole r <- E.try getgitrepo :: IO (Either E.SomeException Git.Repo) case r of - Left e -> maybe (throw e) id (cmdnorepo cmd) + Left e -> fromMaybe (throw e) (cmdnorepo cmd) Right g -> do state <- Annex.new g (actions, state') <- Annex.run state $ do diff --git a/Command.hs b/Command.hs index 0cd0bf491..86a83e30c 100644 --- a/Command.hs +++ b/Command.hs @@ -73,7 +73,7 @@ doCommand = start whenAnnexed :: (FilePath -> (Key, Backend Annex) -> Annex (Maybe a)) -> FilePath -> Annex (Maybe a) whenAnnexed a file = ifAnnexed file (a file) (return Nothing) -ifAnnexed :: FilePath -> ((Key, Backend Annex) -> Annex a) -> (Annex a) -> Annex a +ifAnnexed :: FilePath -> ((Key, Backend Annex) -> Annex a) -> Annex a -> Annex a ifAnnexed file yes no = maybe no yes =<< Backend.lookupFile file notBareRepo :: Annex a -> Annex a diff --git a/Command/Map.hs b/Command/Map.hs index 6b1e8d5bb..57b48d503 100644 --- a/Command/Map.hs +++ b/Command/Map.hs @@ -203,7 +203,7 @@ tryScan r "git config --list" dir = Git.workTree r cddir - | take 2 dir == "/~" = + | "/~" `isPrefixOf` dir = let (userhome, reldir) = span (/= '/') (drop 1 dir) in "cd " ++ userhome ++ " && cd " ++ shellEscape (drop 1 reldir) | otherwise = "cd " ++ shellEscape dir diff --git a/Command/Status.hs b/Command/Status.hs index 0fefda1f6..09da41987 100644 --- a/Command/Status.hs +++ b/Command/Status.hs @@ -191,9 +191,8 @@ staleSize label dirspec = do keys <- lift (Command.Unused.staleKeys dirspec) if null keys then nostat - else do - stat label $ json (++ aside "clean up with git-annex unused") $ - return $ keySizeSum $ S.fromList keys + else stat label $ json (++ aside "clean up with git-annex unused") $ + return $ keySizeSum $ S.fromList keys aside :: String -> String aside s = " (" ++ s ++ ")" diff --git a/Command/Unused.hs b/Command/Unused.hs index 7f9edfef2..be0107752 100644 --- a/Command/Unused.hs +++ b/Command/Unused.hs @@ -152,13 +152,12 @@ excludeReferenced l = do (S.fromList l) where -- Skip the git-annex branches, and get all other unique refs. - refs = map Git.Ref . - map last . + refs = map (Git.Ref . last) . nubBy cmpheads . filter ourbranches . map words . lines . L.unpack cmpheads a b = head a == head b - ourbranchend = '/' : show (Annex.Branch.name) + ourbranchend = '/' : show Annex.Branch.name ourbranches ws = not $ ourbranchend `isSuffixOf` last ws removewith [] s = return $ S.toList s removewith (a:as) s diff --git a/Git/UnionMerge.hs b/Git/UnionMerge.hs index edc8cb20b..ddbff6a82 100644 --- a/Git/UnionMerge.hs +++ b/Git/UnionMerge.hs @@ -48,7 +48,7 @@ merge_index h repo bs = - earlier ones, so the list can be generated from any combination of - ls_tree, merge_trees, and merge_tree_index. -} update_index :: Repo -> [String] -> IO () -update_index repo ls = stream_update_index repo [\s -> mapM_ s ls] +update_index repo ls = stream_update_index repo [(`mapM_` ls)] {- Streams content into update-index. -} stream_update_index :: Repo -> [Streamer] -> IO () diff --git a/Logs/UUID.hs b/Logs/UUID.hs index 20f43d15c..b325c78b6 100644 --- a/Logs/UUID.hs +++ b/Logs/UUID.hs @@ -55,15 +55,15 @@ fixBadUUID = M.fromList . map fixup . M.toList | otherwise = (k, v) where kuuid = fromUUID k - isbad = (not $ isuuid kuuid) && isuuid lastword + isbad = not (isuuid kuuid) && isuuid lastword ws = words $ value v lastword = last ws fixeduuid = toUUID lastword - fixedvalue = unwords $ kuuid:(take (length ws - 1) ws) + fixedvalue = unwords $ kuuid: init ws -- For the fixed line to take precidence, it should be -- slightly newer, but only slightly. newertime (LogEntry (Date d) _) = d + minimumPOSIXTimeSlice - newertime (LogEntry (Unknown) _) = minimumPOSIXTimeSlice + newertime (LogEntry Unknown _) = minimumPOSIXTimeSlice minimumPOSIXTimeSlice = 0.000001 isuuid s = length s == 36 && length (split "-" s) == 5 diff --git a/Remote/Git.hs b/Remote/Git.hs index 99ca9fe8e..d172ec7c0 100644 --- a/Remote/Git.hs +++ b/Remote/Git.hs @@ -165,7 +165,7 @@ onLocal :: Git.Repo -> Annex a -> IO a onLocal r a = do -- Avoid re-reading the repository's configuration if it was -- already read. - state <- if (M.null $ Git.configMap r) + state <- if M.null $ Git.configMap r then Annex.new r else return $ Annex.newState r Annex.eval state $ do diff --git a/Upgrade/V2.hs b/Upgrade/V2.hs index e76d99b3e..08bf83e83 100644 --- a/Upgrade/V2.hs +++ b/Upgrade/V2.hs @@ -53,7 +53,7 @@ upgrade = do when e $ do inRepo $ Git.run "rm" [Param "-r", Param "-f", Param "-q", File old] - unless bare $ inRepo $ gitAttributesUnWrite + unless bare $ inRepo gitAttributesUnWrite showProgress unless bare push diff --git a/Utility/DataUnits.hs b/Utility/DataUnits.hs index e7552f52f..5d80a04b9 100644 --- a/Utility/DataUnits.hs +++ b/Utility/DataUnits.hs @@ -99,7 +99,7 @@ bandwidthUnits = error "stop trying to rip people off" {- Do you yearn for the days when men were men and megabytes were megabytes? -} oldSchoolUnits :: [Unit] -oldSchoolUnits = map mingle $ zip storageUnits memoryUnits +oldSchoolUnits = zipWith (curry mingle) storageUnits memoryUnits where mingle (Unit _ a n, Unit s' _ _) = Unit s' a n diff --git a/Utility/Directory.hs b/Utility/Directory.hs index 7f8822fca..249ed6935 100644 --- a/Utility/Directory.hs +++ b/Utility/Directory.hs @@ -11,6 +11,7 @@ import System.IO.Error import System.Posix.Files import System.Directory import Control.Exception (throw) +import Control.Monad import Utility.SafeCommand import Utility.Conditional @@ -37,13 +38,11 @@ moveFile src dest = try (rename src dest) >>= onrename mv tmp _ = do ok <- boolSystem "mv" [Param "-f", Param src, Param tmp] - if ok - then return () - else do - -- delete any partial - _ <- try $ - removeFile tmp - rethrow + unless ok $ do + -- delete any partial + _ <- try $ + removeFile tmp + rethrow isdir f = do r <- try (getFileStatus f) case r of diff --git a/configure.hs b/configure.hs index cb73af2a9..0d96b3955 100644 --- a/configure.hs +++ b/configure.hs @@ -71,7 +71,7 @@ checkGitVersion = do dotted = sum . mult 1 . reverse . extend 10 . map readi . split "." extend n l = l ++ replicate (n - length l) 0 mult _ [] = [] - mult n (x:xs) = (n*x) : (mult (n*100) xs) + mult n (x:xs) = (n*x) : mult (n*100) xs readi :: String -> Integer readi s = case reads s of ((x,_):_) -> x diff --git a/test.hs b/test.hs index e625fbd75..91c11873d 100644 --- a/test.hs +++ b/test.hs @@ -11,11 +11,10 @@ import Test.QuickCheck import System.Posix.Directory (changeWorkingDirectory) import System.Posix.Files -import Control.Exception (bracket_, bracket) +import Control.Exception (bracket_, bracket, throw) import System.IO.Error import System.Posix.Env import qualified Control.Exception.Extensible as E -import Control.Exception (throw) import qualified Data.Map as M import System.IO.HVFS (SystemFS(..)) -- cgit v1.2.3 From e664af5d8cf310ac3dcb7e27a5a6846360917fbf Mon Sep 17 00:00:00 2001 From: "https://www.google.com/accounts/o8/id?id=AItOawnXybLxkPMYpP3yw4b_I6IdC3cKTD-xEdU" Date: Fri, 9 Dec 2011 10:03:20 +0000 Subject: --- doc/tips/using_git_annex_with_no_fixed_hostname_and_optimising_ssh.mdwn | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/doc/tips/using_git_annex_with_no_fixed_hostname_and_optimising_ssh.mdwn b/doc/tips/using_git_annex_with_no_fixed_hostname_and_optimising_ssh.mdwn index 3f02ea47d..8fb2bf9db 100644 --- a/doc/tips/using_git_annex_with_no_fixed_hostname_and_optimising_ssh.mdwn +++ b/doc/tips/using_git_annex_with_no_fixed_hostname_and_optimising_ssh.mdwn @@ -1,6 +1,6 @@ ## Intro -This tip is based on my experience of using `git annex` with my out-and-about netbook which hits many different wifi networks and has no fixed home or address. +This tip is based on my (Matt Ford) experience of using `git annex` with my out-and-about netbook which hits many different wifi networks and has no fixed home or address. I'm not using a bare repository that allows pushing (an alternative solution) nor do I fancy allowing `git push` to run against my desktop checked out repository (perhaps I worry over nothing?) -- cgit v1.2.3 From 3f5f28b48754bc91620a6354ca70afe4c61c9894 Mon Sep 17 00:00:00 2001 From: Joey Hess Date: Fri, 9 Dec 2011 12:23:45 -0400 Subject: factor out a stopUnless code melt for lunch --- Command.hs | 7 +++++++ Command/AddUrl.hs | 21 +++++++++------------ Command/Drop.hs | 28 +++++++++------------------- Command/DropKey.hs | 17 +++++------------ Command/Fix.hs | 9 +++------ Command/Get.hs | 32 ++++++++++++-------------------- Command/Migrate.hs | 24 ++++++++++-------------- Command/Move.hs | 14 ++++---------- Command/Unannex.hs | 24 ++++++++++-------------- 9 files changed, 69 insertions(+), 107 deletions(-) diff --git a/Command.hs b/Command.hs index 86a83e30c..813a239cb 100644 --- a/Command.hs +++ b/Command.hs @@ -10,6 +10,7 @@ module Command ( noRepo, next, stop, + stopUnless, prepCommand, doCommand, whenAnnexed, @@ -49,6 +50,12 @@ next a = return $ Just a stop :: Annex (Maybe a) stop = return Nothing +{- Stops unless a condition is met. -} +stopUnless :: Annex Bool -> Annex (Maybe a) -> Annex (Maybe a) +stopUnless c a = do + ok <- c + if ok then a else stop + {- Prepares to run a command via the check and seek stages, returning a - list of actions to perform to run the command. -} prepCommand :: Command -> [String] -> Annex [CommandCleanup] diff --git a/Command/AddUrl.hs b/Command/AddUrl.hs index 945848e9f..75ca74031 100644 --- a/Command/AddUrl.hs +++ b/Command/AddUrl.hs @@ -45,18 +45,15 @@ download url file = do let dummykey = Backend.URL.fromUrl url tmp <- fromRepo $ gitAnnexTmpLocation dummykey liftIO $ createDirectoryIfMissing True (parentDir tmp) - ok <- liftIO $ Url.download url tmp - if ok - then do - [(backend, _)] <- Backend.chooseBackends [file] - k <- Backend.genKey tmp backend - case k of - Nothing -> stop - Just (key, _) -> do - moveAnnex key tmp - setUrlPresent key url - next $ Command.Add.cleanup file key True - else stop + stopUnless (liftIO $ Url.download url tmp) $ do + [(backend, _)] <- Backend.chooseBackends [file] + k <- Backend.genKey tmp backend + case k of + Nothing -> stop + Just (key, _) -> do + moveAnnex key tmp + setUrlPresent key url + next $ Command.Add.cleanup file key True nodownload :: String -> FilePath -> CommandPerform nodownload url file = do diff --git a/Command/Drop.hs b/Command/Drop.hs index ee3583869..0a4c9dfd6 100644 --- a/Command/Drop.hs +++ b/Command/Drop.hs @@ -37,13 +37,9 @@ start numcopies file (key, _) = autoCopies key (>) numcopies $ do else startRemote file numcopies key remote startLocal :: FilePath -> Maybe Int -> Key -> CommandStart -startLocal file numcopies key = do - present <- inAnnex key - if present - then do - showStart "drop" file - next $ performLocal key numcopies - else stop +startLocal file numcopies key = stopUnless (inAnnex key) $ do + showStart "drop" file + next $ performLocal key numcopies startRemote :: FilePath -> Maybe Int -> Key -> Remote.Remote Annex -> CommandStart startRemote file numcopies key remote = do @@ -55,12 +51,9 @@ performLocal key numcopies = lockContent key $ do (remotes, trusteduuids) <- Remote.keyPossibilitiesTrusted key untrusteduuids <- trustGet UnTrusted let tocheck = Remote.remotesWithoutUUID remotes (trusteduuids++untrusteduuids) - success <- canDropKey key numcopies trusteduuids tocheck [] - if success - then do - whenM (inAnnex key) $ removeAnnex key - next $ cleanupLocal key - else stop + stopUnless (canDropKey key numcopies trusteduuids tocheck []) $ do + whenM (inAnnex key) $ removeAnnex key + next $ cleanupLocal key performRemote :: Key -> Maybe Int -> Remote.Remote Annex -> CommandPerform performRemote key numcopies remote = lockContent key $ do @@ -75,12 +68,9 @@ performRemote key numcopies remote = lockContent key $ do untrusteduuids <- trustGet UnTrusted let tocheck = filter (/= remote) $ Remote.remotesWithoutUUID remotes (have++untrusteduuids) - success <- canDropKey key numcopies have tocheck [uuid] - if success - then do - ok <- Remote.removeKey remote key - next $ cleanupRemote key remote ok - else stop + stopUnless (canDropKey key numcopies have tocheck [uuid]) $ do + ok <- Remote.removeKey remote key + next $ cleanupRemote key remote ok where uuid = Remote.uuid remote diff --git a/Command/DropKey.hs b/Command/DropKey.hs index b63d481bf..ae2ad8b6a 100644 --- a/Command/DropKey.hs +++ b/Command/DropKey.hs @@ -21,18 +21,11 @@ seek :: [CommandSeek] seek = [withKeys start] start :: Key -> CommandStart -start key = do - present <- inAnnex key - if not present - then stop - else do - checkforced - showStart "dropkey" (show key) - next $ perform key - where - checkforced = - unlessM (Annex.getState Annex.force) $ - error "dropkey can cause data loss; use --force if you're sure you want to do this" +start key = stopUnless (not <$> inAnnex key) $ do + unlessM (Annex.getState Annex.force) $ + error "dropkey can cause data loss; use --force if you're sure you want to do this" + showStart "dropkey" (show key) + next $ perform key perform :: Key -> CommandPerform perform key = lockContent key $ do diff --git a/Command/Fix.hs b/Command/Fix.hs index 27c4b167e..f264106c3 100644 --- a/Command/Fix.hs +++ b/Command/Fix.hs @@ -23,12 +23,9 @@ seek = [withFilesInGit $ whenAnnexed start] start :: FilePath -> (Key, Backend Annex) -> CommandStart start file (key, _) = do link <- calcGitLink file key - l <- liftIO $ readSymbolicLink file - if link == l - then stop - else do - showStart "fix" file - next $ perform file link + stopUnless ((/=) link <$> liftIO (readSymbolicLink file)) $ do + showStart "fix" file + next $ perform file link perform :: FilePath -> FilePath -> CommandPerform perform file link = do diff --git a/Command/Get.hs b/Command/Get.hs index 093cd2cc5..b7023e2de 100644 --- a/Command/Get.hs +++ b/Command/Get.hs @@ -22,32 +22,24 @@ seek :: [CommandSeek] seek = [withNumCopies $ \n -> whenAnnexed $ start n] start :: Maybe Int -> FilePath -> (Key, Backend Annex) -> CommandStart -start numcopies file (key, _) = do - inannex <- inAnnex key - if inannex - then stop - else autoCopies key (<) numcopies $ do - from <- Annex.getState Annex.fromremote - case from of - Nothing -> go $ perform key - Just name -> do - -- get --from = copy --from - src <- Remote.byName name - ok <- Command.Move.fromOk src key - if ok - then go $ Command.Move.fromPerform src False key - else stop +start numcopies file (key, _) = stopUnless (not <$> inAnnex key) $ + autoCopies key (<) numcopies $ do + from <- Annex.getState Annex.fromremote + case from of + Nothing -> go $ perform key + Just name -> do + -- get --from = copy --from + src <- Remote.byName name + stopUnless (Command.Move.fromOk src key) $ + go $ Command.Move.fromPerform src False key where go a = do showStart "get" file next a perform :: Key -> CommandPerform -perform key = do - ok <- getViaTmp key (getKeyFile key) - if ok - then next $ return True -- no cleanup needed - else stop +perform key = stopUnless (getViaTmp key $ getKeyFile key) $ do + next $ return True -- no cleanup needed {- Try to find a copy of the file in one of the remotes, - and copy it to here. -} diff --git a/Command/Migrate.hs b/Command/Migrate.hs index c85d7c2ac..30288fc16 100644 --- a/Command/Migrate.hs +++ b/Command/Migrate.hs @@ -58,22 +58,18 @@ perform file oldkey newbackend = do cleantmp tmpfile case k of Nothing -> stop - Just (newkey, _) -> do - ok <- link src newkey - if ok - then do - -- Update symlink to use the new key. - liftIO $ removeFile file + Just (newkey, _) -> stopUnless (link src newkey) $ do + -- Update symlink to use the new key. + liftIO $ removeFile file - -- If the old key had some - -- associated urls, record them for - -- the new key as well. - urls <- getUrls oldkey - unless (null urls) $ - mapM_ (setUrlPresent newkey) urls + -- If the old key had some + -- associated urls, record them for + -- the new key as well. + urls <- getUrls oldkey + unless (null urls) $ + mapM_ (setUrlPresent newkey) urls - next $ Command.Add.cleanup file newkey True - else stop + next $ Command.Add.cleanup file newkey True where cleantmp t = liftIO $ whenM (doesFileExist t) $ removeFile t link src newkey = getViaTmpUnchecked newkey $ \t -> do diff --git a/Command/Move.hs b/Command/Move.hs index fd1ed9019..cc26eecda 100644 --- a/Command/Move.hs +++ b/Command/Move.hs @@ -108,17 +108,11 @@ toPerform dest move key = moveLock move key $ do fromStart :: Remote.Remote Annex -> Bool -> FilePath -> Key -> CommandStart fromStart src move file key | move = go - | otherwise = do - ishere <- inAnnex key - if ishere then stop else go + | otherwise = stopUnless (inAnnex key) go where - go = do - ok <- fromOk src key - if ok - then do - showMoveAction move file - next $ fromPerform src move key - else stop + go = stopUnless (fromOk src key) $ do + showMoveAction move file + next $ fromPerform src move key fromOk :: Remote.Remote Annex -> Key -> Annex Bool fromOk src key = do u <- getUUID diff --git a/Command/Unannex.hs b/Command/Unannex.hs index e97b6d05d..263ff88b4 100644 --- a/Command/Unannex.hs +++ b/Command/Unannex.hs @@ -25,21 +25,17 @@ seek = [withFilesInGit $ whenAnnexed start] {- The unannex subcommand undoes an add. -} start :: FilePath -> (Key, Backend Annex) -> CommandStart -start file (key, _) = do - ishere <- inAnnex key - if ishere - then do - force <- Annex.getState Annex.force - unless force $ do - top <- fromRepo Git.workTree - staged <- inRepo $ LsFiles.staged [top] - unless (null staged) $ - error "This command cannot be run when there are already files staged for commit." - Annex.changeState $ \s -> s { Annex.force = True } +start file (key, _) = stopUnless (inAnnex key) $ do + force <- Annex.getState Annex.force + unless force $ do + top <- fromRepo Git.workTree + staged <- inRepo $ LsFiles.staged [top] + unless (null staged) $ + error "This command cannot be run when there are already files staged for commit." + Annex.changeState $ \s -> s { Annex.force = True } - showStart "unannex" file - next $ perform file key - else stop + showStart "unannex" file + next $ perform file key perform :: FilePath -> Key -> CommandPerform perform file key = next $ cleanup file key -- cgit v1.2.3 From 14e9b87d44831fe0b6510422ffb4266f253496d7 Mon Sep 17 00:00:00 2001 From: Joey Hess Date: Fri, 9 Dec 2011 13:07:31 -0400 Subject: unannex improvements Added files don't have to be committed before they can be unannexed. unannex no longer commits existing staged changes unannex of the last file in a directory now works, before it failed because git rm deleted the directory out from under it, --- Command/Unannex.hs | 30 ++++++++++++------------------ debian/changelog | 2 ++ 2 files changed, 14 insertions(+), 18 deletions(-) diff --git a/Command/Unannex.hs b/Command/Unannex.hs index 263ff88b4..aafc189d0 100644 --- a/Command/Unannex.hs +++ b/Command/Unannex.hs @@ -23,17 +23,8 @@ def = [command "unannex" paramPaths seek "undo accidential add command"] seek :: [CommandSeek] seek = [withFilesInGit $ whenAnnexed start] -{- The unannex subcommand undoes an add. -} start :: FilePath -> (Key, Backend Annex) -> CommandStart start file (key, _) = stopUnless (inAnnex key) $ do - force <- Annex.getState Annex.force - unless force $ do - top <- fromRepo Git.workTree - staged <- inRepo $ LsFiles.staged [top] - unless (null staged) $ - error "This command cannot be run when there are already files staged for commit." - Annex.changeState $ \s -> s { Annex.force = True } - showStart "unannex" file next $ perform file key @@ -43,9 +34,17 @@ perform file key = next $ cleanup file key cleanup :: FilePath -> Key -> CommandCleanup cleanup file key = do liftIO $ removeFile file - inRepo $ Git.run "rm" [Params "--quiet --", File file] - -- git rm deletes empty directories; put them back - liftIO $ createDirectoryIfMissing True (parentDir file) + -- git rm deletes empty directory without --cached + inRepo $ Git.run "rm" [Params "--cached --quiet --", File file] + + -- If the file was already committed, it is now staged for removal. + -- Commit that removal now, to avoid later confusing the + -- pre-commit hook if this file is later added back to + -- git as a normal, non-annexed file. + whenM (not . null <$> inRepo (LsFiles.staged [file])) $ do + inRepo $ Git.run "commit" [ + Param "-m", Param "content removed from git annex", + Param "--", File file] fast <- Annex.getState Annex.fast if fast @@ -58,10 +57,5 @@ cleanup file key = do else do fromAnnex key file logStatus key InfoMissing - - -- Commit staged changes at end to avoid confusing the - -- pre-commit hook if this file is later added back to - -- git as a normal, non-annexed file. - Annex.Queue.add "commit" [Param "-m", Param "content removed from git annex"] [] - + return True diff --git a/debian/changelog b/debian/changelog index 04e911672..1c27ad566 100644 --- a/debian/changelog +++ b/debian/changelog @@ -7,6 +7,8 @@ git-annex (3.20111204) UNRELEASED; urgency=low yet added to git. Running the add again will now clean up this situtation. * Fix caching of decrypted ciphers, which failed when drop had to check multiple different encrypted special remotes. + * unannex: Can be run on files that have been added to the annex, but not + yet committed. -- Joey Hess Sun, 04 Dec 2011 12:22:37 -0400 -- cgit v1.2.3 From 252b2e92b0ddd5c97565d37c84771b471b6b5082 Mon Sep 17 00:00:00 2001 From: Joey Hess Date: Fri, 9 Dec 2011 13:31:51 -0400 Subject: cleanup --- Command/Unannex.hs | 1 - 1 file changed, 1 deletion(-) diff --git a/Command/Unannex.hs b/Command/Unannex.hs index aafc189d0..bed857b06 100644 --- a/Command/Unannex.hs +++ b/Command/Unannex.hs @@ -10,7 +10,6 @@ module Command.Unannex where import Common.Annex import Command import qualified Annex -import qualified Annex.Queue import Utility.FileMode import Logs.Location import Annex.Content -- cgit v1.2.3 From 95e748cbd4bb858a3b87621e60f5b43d53b50480 Mon Sep 17 00:00:00 2001 From: Joey Hess Date: Fri, 9 Dec 2011 13:32:09 -0400 Subject: inverted logic --- Command/DropKey.hs | 2 +- Command/Move.hs | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/Command/DropKey.hs b/Command/DropKey.hs index ae2ad8b6a..aaaa22466 100644 --- a/Command/DropKey.hs +++ b/Command/DropKey.hs @@ -21,7 +21,7 @@ seek :: [CommandSeek] seek = [withKeys start] start :: Key -> CommandStart -start key = stopUnless (not <$> inAnnex key) $ do +start key = stopUnless (inAnnex key) $ do unlessM (Annex.getState Annex.force) $ error "dropkey can cause data loss; use --force if you're sure you want to do this" showStart "dropkey" (show key) diff --git a/Command/Move.hs b/Command/Move.hs index cc26eecda..85fdff739 100644 --- a/Command/Move.hs +++ b/Command/Move.hs @@ -108,7 +108,7 @@ toPerform dest move key = moveLock move key $ do fromStart :: Remote.Remote Annex -> Bool -> FilePath -> Key -> CommandStart fromStart src move file key | move = go - | otherwise = stopUnless (inAnnex key) go + | otherwise = stopUnless (not <$> inAnnex key) go where go = stopUnless (fromOk src key) $ do showMoveAction move file -- cgit v1.2.3 From d69cf79e202b7d60465ce2068f736243aab14b3b Mon Sep 17 00:00:00 2001 From: "http://www.joachim-breitner.de/" Date: Fri, 9 Dec 2011 20:47:00 +0000 Subject: --- doc/forum/pure_git-annex_only_workflow.mdwn | 46 +++++++++++++++++++++++++++++ 1 file changed, 46 insertions(+) create mode 100644 doc/forum/pure_git-annex_only_workflow.mdwn diff --git a/doc/forum/pure_git-annex_only_workflow.mdwn b/doc/forum/pure_git-annex_only_workflow.mdwn new file mode 100644 index 000000000..e0cbcd538 --- /dev/null +++ b/doc/forum/pure_git-annex_only_workflow.mdwn @@ -0,0 +1,46 @@ +I’m using git annex to manage my movie collection on various devices – my laptop, a NSLU tucked away somewhere with lots of space, some external hard drives. For this use case, I do not need the full power of git as a version control system, so having to run "git commit" and coming up with commit messages is annoying. Also, this makes sense for a version control system, but not for my media collection: + + $ git annex add Hot\ Fuzz\ -\ English.mkv + add Hot Fuzz - English.mkv (checksum...) ok + (Recording state in git...) + $ git commit -m 'another movie added' + [master 851dc8a] another movie added + 1 files changed, 1 insertions(+), 0 deletions(-) + create mode 120000 00 Noch nicht gesehen/Hot Fuzz - English.mkv + $ git push jeff + Counting objects: 38, done. + Delta compression using up to 2 threads. + Compressing objects: 100% (20/20), done. + Writing objects: 100% (26/26), 2.00 KiB, done. + Total 26 (delta 11), reused 0 (delta 0) + remote: error: refusing to update checked out branch: refs/heads/master + remote: error: By default, updating the current branch in a non-bare repository + remote: error: is denied, because it will make the index and work tree inconsistent + remote: error: with what you pushed, and will require 'git reset --hard' to match + remote: error: the work tree to HEAD. + remote: error: + remote: error: You can set 'receive.denyCurrentBranch' configuration variable to + remote: error: 'ignore' or 'warn' in the remote repository to allow pushing into + remote: error: its current branch; however, this is not recommended unless you + remote: error: arranged to update its work tree to match what you pushed in some + remote: error: other way. + remote: error: + remote: error: To squelch this message and still keep the default behaviour, set + remote: error: 'receive.denyCurrentBranch' configuration variable to 'refuse'. + To jeff:/mnt/media/Movies + ! [rejected] git-annex -> git-annex (non-fast-forward) + ! [remote rejected] master -> master (branch is currently checked out) + error: failed to push some refs to 'jeff:/mnt/media/Movies' + To prevent you from losing history, non-fast-forward updates were rejected + Merge the remote changes (e.g. 'git pull') before pushing again. See the + 'Note about fast-forwards' section of 'git push --help' for details. + +It seems that to successfully make the new files known to the other side, I have to log into jeff and pull _from_ my currenct machine. + +What I would like to have is that + +* git annex add does not require a commit afterwards. +* Changes to the files are automatically picked up with the next git-annex call (similar to how etckeeper works). +* Commands "git annex push" and "git annex pull" that will syn the metadata (i.e. the list of files) in both directions without further manual intervention, at least not until the two repositories have diverged in a way that is not possible to merge sensible. + +Summay: git-annex is great. git is not always. Please make it possible to use git annex without having to use git. -- cgit v1.2.3 From 07c32dd49ce389cd0ba1888b42429ae5fe54a972 Mon Sep 17 00:00:00 2001 From: "http://www.joachim-breitner.de/" Date: Fri, 9 Dec 2011 20:57:00 +0000 Subject: typo --- doc/forum/pure_git-annex_only_workflow.mdwn | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/doc/forum/pure_git-annex_only_workflow.mdwn b/doc/forum/pure_git-annex_only_workflow.mdwn index e0cbcd538..48226671c 100644 --- a/doc/forum/pure_git-annex_only_workflow.mdwn +++ b/doc/forum/pure_git-annex_only_workflow.mdwn @@ -35,7 +35,7 @@ I’m using git annex to manage my movie collection on various devices – my la Merge the remote changes (e.g. 'git pull') before pushing again. See the 'Note about fast-forwards' section of 'git push --help' for details. -It seems that to successfully make the new files known to the other side, I have to log into jeff and pull _from_ my currenct machine. +It seems that to successfully make the new files known to the other side, I have to log into jeff and pull _from_ my current machine. What I would like to have is that -- cgit v1.2.3 From 2cd22c1a13d6b4e1c6d2407d249f2f8d92643fbd Mon Sep 17 00:00:00 2001 From: "http://www.joachim-breitner.de/" Date: Fri, 9 Dec 2011 21:25:37 +0000 Subject: brr, so many typos today. --- doc/forum/pure_git-annex_only_workflow.mdwn | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/doc/forum/pure_git-annex_only_workflow.mdwn b/doc/forum/pure_git-annex_only_workflow.mdwn index 48226671c..36648a905 100644 --- a/doc/forum/pure_git-annex_only_workflow.mdwn +++ b/doc/forum/pure_git-annex_only_workflow.mdwn @@ -41,6 +41,6 @@ What I would like to have is that * git annex add does not require a commit afterwards. * Changes to the files are automatically picked up with the next git-annex call (similar to how etckeeper works). -* Commands "git annex push" and "git annex pull" that will syn the metadata (i.e. the list of files) in both directions without further manual intervention, at least not until the two repositories have diverged in a way that is not possible to merge sensible. +* Commands "git annex push" and "git annex pull" that will sync the metadata (i.e. the list of files) in both directions without further manual intervention, at least not until the two repositories have diverged in a way that is not possible to merge sensible. Summay: git-annex is great. git is not always. Please make it possible to use git annex without having to use git. -- cgit v1.2.3 From 28699c95a7de284f07a5c0e34fb1755868301f3c Mon Sep 17 00:00:00 2001 From: Joey Hess Date: Fri, 9 Dec 2011 18:10:41 -0400 Subject: some work on avoiding partial functions There are still hundreds of places that use partial functions head, tail, init, and last. --- Command/DropUnused.hs | 4 ++-- Git.hs | 18 ++++++------------ Utility/BadPrelude.hs | 24 ++++++++++++++++++++++++ Utility/Misc.hs | 13 +++++++++++++ 4 files changed, 45 insertions(+), 14 deletions(-) create mode 100644 Utility/BadPrelude.hs diff --git a/Command/DropUnused.hs b/Command/DropUnused.hs index 3df9ab6c2..244f378d9 100644 --- a/Command/DropUnused.hs +++ b/Command/DropUnused.hs @@ -73,6 +73,6 @@ readUnusedLog prefix = do then M.fromList . map parse . lines <$> liftIO (readFile f) else return M.empty where - parse line = (num, fromJust $ readKey $ tail rest) + parse line = (num, fromJust $ readKey rest) where - (num, rest) = break (== ' ') line + (num, rest) = separate (== ' ') line diff --git a/Git.hs b/Git.hs index 5bdd4afd4..84153be5d 100644 --- a/Git.hs +++ b/Git.hs @@ -507,11 +507,7 @@ configStore s repo = do configParse :: String -> M.Map String String configParse s = M.fromList $ map pair $ lines s where - pair l = (key l, val l) - key l = head $ keyval l - val l = join sep $ drop 1 $ keyval l - keyval l = split sep l :: [String] - sep = "=" + pair = separate (== '=') {- Calculates a list of a repo's configured remotes, by parsing its config. -} configRemotes :: Repo -> IO [Repo] @@ -550,13 +546,11 @@ genRemote s repo = gen $ calcloc s scpstyle v = ":" `isInfixOf` v && not ("//" `isInfixOf` v) scptourl v = "ssh://" ++ host ++ slash dir where - bits = split ":" v - host = head bits - dir = join ":" $ drop 1 bits - slash d | d == "" = "/~/" ++ dir - | head d == '/' = dir - | head d == '~' = '/':dir - | otherwise = "/~/" ++ dir + (host, dir) = separate (== ':') v + slash d | d == "" = "/~/" ++ d + | "/" `isPrefixOf` d = d + | "~" `isPrefixOf` d = '/':d + | otherwise = "/~/" ++ d {- Checks if a string from git config is a true value. -} configTrue :: String -> Bool diff --git a/Utility/BadPrelude.hs b/Utility/BadPrelude.hs new file mode 100644 index 000000000..1bb70adfb --- /dev/null +++ b/Utility/BadPrelude.hs @@ -0,0 +1,24 @@ +{- Some stuff from Prelude should not be used, as it tends to be a source + - of bugs. + - + - This exports functions that conflict with the prelude, which avoids + - them being accidentially used. + -} + +module Utility.BadPrelude where + +{- head is a partial function; head [] is an error -} +head :: [a] -> a +head = Prelude.head + +{- tail is also partial -} +tail :: [a] -> a +tail = Prelude.tail + +{- init too -} +init :: [a] -> a +init = Prelude.init + +{- last too -} +last :: [a] -> a +last = Prelude.last diff --git a/Utility/Misc.hs b/Utility/Misc.hs index 728598723..541e150b7 100644 --- a/Utility/Misc.hs +++ b/Utility/Misc.hs @@ -27,6 +27,19 @@ readMaybe s = case reads s of ((x,_):_) -> Just x _ -> Nothing +{- Like break, but the character matching the condition is not included + - in the second result list. + - + - separate (== ':') "foo:bar" = ("foo", "bar") + - separate (== ':') "foobar" = ("foo, "") + -} +separate :: (a -> Bool) -> [a] -> ([a], [a]) +separate c l = unbreak $ break c l + where + unbreak r@(a, b) + | null b = r + | otherwise = (a, tail b) + {- Catches IO errors and returns a Bool -} catchBoolIO :: IO Bool -> IO Bool catchBoolIO a = catchDefaultIO a False -- cgit v1.2.3 From be67294a1ebf1a49eddaf9698c662888d80d6a7a Mon Sep 17 00:00:00 2001 From: "http://joey.kitenet.net/" Date: Fri, 9 Dec 2011 22:56:11 +0000 Subject: Added a comment --- ...ent_1_a32f7efd18d174845099a4ed59e6feae._comment | 32 ++++++++++++++++++++++ 1 file changed, 32 insertions(+) create mode 100644 doc/forum/pure_git-annex_only_workflow/comment_1_a32f7efd18d174845099a4ed59e6feae._comment diff --git a/doc/forum/pure_git-annex_only_workflow/comment_1_a32f7efd18d174845099a4ed59e6feae._comment b/doc/forum/pure_git-annex_only_workflow/comment_1_a32f7efd18d174845099a4ed59e6feae._comment new file mode 100644 index 000000000..def1794a3 --- /dev/null +++ b/doc/forum/pure_git-annex_only_workflow/comment_1_a32f7efd18d174845099a4ed59e6feae._comment @@ -0,0 +1,32 @@ +[[!comment format=mdwn + username="http://joey.kitenet.net/" + nickname="joey" + subject="comment 1" + date="2011-12-09T22:56:11Z" + content=""" +First, you need a bare git repository that you can push to, and pull from. This simplifies most git workflow. + +Secondly, I use [mr](http://kitenet.net/~joey/code/mr/), with this in `.mrconfig`: + +
+[DEFAULT]
+lib =
+        annexupdate() {
+                git commit -a -m update || true
+                git pull \"$@\"
+                git annex merge
+                git push || true
+        }
+
+[lib/sound]
+update = annexupdate
+[lib/big]
+update = annexupdate
+
+ +Which makes \"mr update\" in repositories where I rarely care about git details take care of syncing my changes. + +I also make \"mr update\" do a \"git annex get\" of some files in some repositories that I want to always populate. git-annex and mr go well together. :) + +Perhaps my annexupdate above should be available as \"git annex sync\"? +"""]] -- cgit v1.2.3 From 4b3c4c0c2b88aa17a9b7822470e3e5dd2a3e7c2b Mon Sep 17 00:00:00 2001 From: Joey Hess Date: Fri, 9 Dec 2011 18:57:09 -0400 Subject: avoid some read --- Config.hs | 7 ++++--- Git.hs | 2 +- Utility/BadPrelude.hs | 4 ++++ 3 files changed, 9 insertions(+), 4 deletions(-) diff --git a/Config.hs b/Config.hs index cc0c92953..c6107fc8e 100644 --- a/Config.hs +++ b/Config.hs @@ -79,9 +79,10 @@ repoNotIgnored r = not . Git.configTrue <$> getConfig r "ignore" "false" {- If a value is specified, it is used; otherwise the default is looked up - in git config. forcenumcopies overrides everything. -} getNumCopies :: Maybe Int -> Annex Int -getNumCopies v = - Annex.getState Annex.forcenumcopies >>= maybe (use v) (return . id) +getNumCopies v = perhaps (use v) =<< Annex.getState Annex.forcenumcopies where use (Just n) = return n - use Nothing = read <$> fromRepo (Git.configGet config "1") + use Nothing = perhaps (return 1) =<< + readMaybe <$> fromRepo (Git.configGet config "1") + perhaps fallback = maybe fallback (return . id) config = "annex.numcopies" diff --git a/Git.hs b/Git.hs index 84153be5d..8bc32b7cc 100644 --- a/Git.hs +++ b/Git.hs @@ -345,7 +345,7 @@ urlPort :: Repo -> Maybe Integer urlPort r = case urlAuthPart uriPort r of ":" -> Nothing - (':':p) -> Just (read p) + (':':p) -> readMaybe p _ -> Nothing {- Hostname of an URL repo, including any username (ie, "user@host") -} diff --git a/Utility/BadPrelude.hs b/Utility/BadPrelude.hs index 1bb70adfb..7921a7e9b 100644 --- a/Utility/BadPrelude.hs +++ b/Utility/BadPrelude.hs @@ -22,3 +22,7 @@ init = Prelude.init {- last too -} last :: [a] -> a last = Prelude.last + +{- read should be avoided, as it throws an error -} +read :: Read a => String -> a +read = Prelude.read -- cgit v1.2.3 From fb8231f3a1eb67e61a7df0381dc82be2f0ca0417 Mon Sep 17 00:00:00 2001 From: Joey Hess Date: Fri, 9 Dec 2011 20:27:22 -0400 Subject: sync: New command that synchronises the local repository and default remote, by running git commit, pull, and push for you. --- Command/Sync.hs | 64 ++++++++++++++++++++++++++++++++++++++++++++++++++++++ GitAnnex.hs | 4 +++- debian/changelog | 2 ++ doc/git-annex.mdwn | 11 ++++++++++ 4 files changed, 80 insertions(+), 1 deletion(-) create mode 100644 Command/Sync.hs diff --git a/Command/Sync.hs b/Command/Sync.hs new file mode 100644 index 000000000..c0d7f37ad --- /dev/null +++ b/Command/Sync.hs @@ -0,0 +1,64 @@ +{- git-annex command + - + - Copyright 2011 Joey Hess + - + - Licensed under the GNU GPL version 3 or higher. + -} + +module Command.Sync where + +import Common.Annex +import Command +import qualified Annex.Branch +import qualified Git + +import qualified Data.ByteString.Lazy.Char8 as L + +def :: [Command] +def = [command "sync" paramPaths seek "synchronize local repository with remote"] + +seek :: [CommandSeek] +seek = [withNothing start] + +start :: CommandStart +start = do + showStart "sync" "." + showOutput + next perform + +perform :: CommandPerform +perform = do + remote <- defaultRemote =<< currentBranch + checkRemote remote + commit + inRepo $ Git.run "pull" [Param remote] + Annex.Branch.update + inRepo $ Git.run "push" [Param remote, matchingbranches] + next $ return True + where + -- git push may be configured to not push matching + -- branches; this should ensure it always does. + matchingbranches = Param ":" + +commit :: Annex () +commit = do + -- Commit will fail when the tree is clean (or when in a confliced + -- merge, etc). Ignore failure. + _ <- inRepo $ Git.runBool "commit" [Param "-a", Param "-m", Param "sync"] + return () + +-- the remote defaults to origin when not configured +defaultRemote :: String -> Annex String +defaultRemote branch = + fromRepo $ Git.configGet ("branch." ++ branch ++ ".remote") "origin" + +currentBranch :: Annex String +currentBranch = last . split "/" . L.unpack . head . L.lines <$> + inRepo (Git.pipeRead [Param "symbolic-ref", Param "HEAD"]) + +checkRemote :: String -> Annex () +checkRemote remote = do + remoteurl <- fromRepo $ + Git.configGet ("remote." ++ remote ++ ".url") "" + when (null remoteurl) $ do + error $ "No url is configured for the remote: " ++ remote diff --git a/GitAnnex.hs b/GitAnnex.hs index d768499dd..7871638e4 100644 --- a/GitAnnex.hs +++ b/GitAnnex.hs @@ -47,6 +47,7 @@ import qualified Command.Trust import qualified Command.Untrust import qualified Command.Semitrust import qualified Command.Dead +import qualified Command.Sync import qualified Command.AddUrl import qualified Command.Map import qualified Command.Upgrade @@ -61,6 +62,8 @@ cmds = concat , Command.Copy.def , Command.Unlock.def , Command.Lock.def + , Command.Sync.def + , Command.AddUrl.def , Command.Init.def , Command.Describe.def , Command.InitRemote.def @@ -72,7 +75,6 @@ cmds = concat , Command.Untrust.def , Command.Semitrust.def , Command.Dead.def - , Command.AddUrl.def , Command.FromKey.def , Command.DropKey.def , Command.Fix.def diff --git a/debian/changelog b/debian/changelog index 1c27ad566..b481d9999 100644 --- a/debian/changelog +++ b/debian/changelog @@ -9,6 +9,8 @@ git-annex (3.20111204) UNRELEASED; urgency=low multiple different encrypted special remotes. * unannex: Can be run on files that have been added to the annex, but not yet committed. + * sync: New command that synchronises the local repository and default + remote, by running git commit, pull, and push for you. -- Joey Hess Sun, 04 Dec 2011 12:22:37 -0400 diff --git a/doc/git-annex.mdwn b/doc/git-annex.mdwn index 08def0d62..d7a51663f 100644 --- a/doc/git-annex.mdwn +++ b/doc/git-annex.mdwn @@ -120,6 +120,17 @@ subdirectories). Use this to undo an unlock command if you don't want to modify the files, or have made modifications you want to discard. +* sync + + Use this command when you want to synchronize the local repository + with its default remote (typically "origin"). The sync process involves + first committing all local changes, then pulling and merging any changes + from the remote, and finally pushing the repository's state to the remote. + You can use standard git commands to do each of those steps by hand, + or if you don't want to worry about the details, you can use sync. + + Note that sync does not transfer any file contents from or to the remote. + * addurl [url ...] Downloads each url to a file, which is added to the annex. -- cgit v1.2.3 From 4200b8038a79aa788ce0b62579bd5820b7d75e3b Mon Sep 17 00:00:00 2001 From: Joey Hess Date: Sat, 10 Dec 2011 12:21:22 -0400 Subject: separate operations --- Command/Sync.hs | 54 ++++++++++++++++++++++++++++++------------------------ 1 file changed, 30 insertions(+), 24 deletions(-) diff --git a/Command/Sync.hs b/Command/Sync.hs index c0d7f37ad..7dc5f4d24 100644 --- a/Command/Sync.hs +++ b/Command/Sync.hs @@ -17,39 +17,45 @@ import qualified Data.ByteString.Lazy.Char8 as L def :: [Command] def = [command "sync" paramPaths seek "synchronize local repository with remote"] +-- syncing involves several operations, any of which can independantly fail seek :: [CommandSeek] -seek = [withNothing start] +seek = map withNothing [commit, pull, push] -start :: CommandStart -start = do - showStart "sync" "." - showOutput - next perform +commit :: CommandStart +commit = do + showStart "commit" "" + next $ next $ do + showOutput + -- Commit will fail when the tree is clean, so ignore failure. + _ <- inRepo $ Git.runBool "commit" [Param "-a", Param "-m", Param "sync"] + return True + +pull :: CommandStart +pull = do + remote <- defaultRemote + showStart "pull" remote + next $ next $ do + showOutput + checkRemote remote + inRepo $ Git.runBool "pull" [Param remote] -perform :: CommandPerform -perform = do - remote <- defaultRemote =<< currentBranch - checkRemote remote - commit - inRepo $ Git.run "pull" [Param remote] - Annex.Branch.update - inRepo $ Git.run "push" [Param remote, matchingbranches] - next $ return True +push :: CommandStart +push = do + remote <- defaultRemote + showStart "push" remote + next $ next $ do + Annex.Branch.update + showOutput + inRepo $ Git.runBool "push" [Param remote, matchingbranches] where -- git push may be configured to not push matching -- branches; this should ensure it always does. matchingbranches = Param ":" -commit :: Annex () -commit = do - -- Commit will fail when the tree is clean (or when in a confliced - -- merge, etc). Ignore failure. - _ <- inRepo $ Git.runBool "commit" [Param "-a", Param "-m", Param "sync"] - return () - -- the remote defaults to origin when not configured -defaultRemote :: String -> Annex String -defaultRemote branch = +defaultRemote :: Annex String +defaultRemote = do + branch <- currentBranch fromRepo $ Git.configGet ("branch." ++ branch ++ ".remote") "origin" currentBranch :: Annex String -- cgit v1.2.3 From e0e9d1cabd0aca52aa410be37de7251533b5e237 Mon Sep 17 00:00:00 2001 From: "http://www.joachim-breitner.de/" Date: Sat, 10 Dec 2011 16:28:29 +0000 Subject: Added a comment --- .../comment_2_66dc9b65523a9912411db03c039ba848._comment | 15 +++++++++++++++ 1 file changed, 15 insertions(+) create mode 100644 doc/forum/pure_git-annex_only_workflow/comment_2_66dc9b65523a9912411db03c039ba848._comment diff --git a/doc/forum/pure_git-annex_only_workflow/comment_2_66dc9b65523a9912411db03c039ba848._comment b/doc/forum/pure_git-annex_only_workflow/comment_2_66dc9b65523a9912411db03c039ba848._comment new file mode 100644 index 000000000..473a0287d --- /dev/null +++ b/doc/forum/pure_git-annex_only_workflow/comment_2_66dc9b65523a9912411db03c039ba848._comment @@ -0,0 +1,15 @@ +[[!comment format=mdwn + username="http://www.joachim-breitner.de/" + nickname="nomeata" + subject="comment 2" + date="2011-12-10T16:28:29Z" + content=""" +Thanks for the tips so far. I guess a bare-only repo helps, but as well is something that I don’t _need_ (for my use case), any only have to do because git works like this. + +Also, if I have a mobile device that I want to push to, then I’d have to have two repositories on the device, as I might not be able to reach my main bare repository when traveling, but I cannot push to the „real“ repo on the mobile device from my computer. I guess I am spoiled by darcs, which will happily push to a checked out +remote repository, updating the checkout if possible without conflict. + +If I introduce a central bare repository to push to and from; I’d still have to have the other non-bare repos as remotes, so that git-annex will know about them and their files, right? + +I’d appreciate a \"git annex sync\" that does what you described (commit all, pull, merge, push). Especially if it comes in a \"git annex sync --all\" variant that syncs all reachable repositories. +"""]] -- cgit v1.2.3 From eecaf424856e42290d653e0b4b4043665da18b00 Mon Sep 17 00:00:00 2001 From: Joey Hess Date: Sat, 10 Dec 2011 12:30:31 -0400 Subject: no need to show, it's a string --- Annex/Branch.hs | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/Annex/Branch.hs b/Annex/Branch.hs index 8f0a09fd6..37237ec66 100644 --- a/Annex/Branch.hs +++ b/Annex/Branch.hs @@ -141,7 +141,7 @@ update = onceonly $ do let merge_desc = if null branches then "update" else "merging " ++ - unwords (map (show . Git.refDescribe) branches) ++ + unwords (map Git.refDescribe branches) ++ " into " ++ show name unless (null branches) $ do showSideAction merge_desc -- cgit v1.2.3 From c5267802f38eb020edee8143d5a097296232bb35 Mon Sep 17 00:00:00 2001 From: Joey Hess Date: Sat, 10 Dec 2011 12:49:44 -0400 Subject: version dependency on old monad-control This should let cabal build it with the right version. --- debian/changelog | 1 + git-annex.cabal | 2 +- 2 files changed, 2 insertions(+), 1 deletion(-) diff --git a/debian/changelog b/debian/changelog index b481d9999..8dbcb07dc 100644 --- a/debian/changelog +++ b/debian/changelog @@ -11,6 +11,7 @@ git-annex (3.20111204) UNRELEASED; urgency=low yet committed. * sync: New command that synchronises the local repository and default remote, by running git commit, pull, and push for you. + * Version monad-control dependency in cabal file. -- Joey Hess Sun, 04 Dec 2011 12:22:37 -0400 diff --git a/git-annex.cabal b/git-annex.cabal index 7be78053f..8f5d6ebb8 100644 --- a/git-annex.cabal +++ b/git-annex.cabal @@ -31,7 +31,7 @@ Executable git-annex Build-Depends: MissingH, hslogger, directory, filepath, unix, containers, utf8-string, network, mtl, bytestring, old-locale, time, pcre-light, extensible-exceptions, dataenc, SHA, process, hS3, HTTP, - base < 5, monad-control, json + base < 5, monad-control < 0.3, json Executable git-annex-shell Main-Is: git-annex-shell.hs -- cgit v1.2.3 From 6cf28585b66264e174eee45496e01d23667e0d20 Mon Sep 17 00:00:00 2001 From: "http://joey.kitenet.net/" Date: Sat, 10 Dec 2011 19:43:04 +0000 Subject: Added a comment --- .../comment_3_9b7d89da52f7ebb7801f9ec8545c3aba._comment | 12 ++++++++++++ 1 file changed, 12 insertions(+) create mode 100644 doc/forum/pure_git-annex_only_workflow/comment_3_9b7d89da52f7ebb7801f9ec8545c3aba._comment diff --git a/doc/forum/pure_git-annex_only_workflow/comment_3_9b7d89da52f7ebb7801f9ec8545c3aba._comment b/doc/forum/pure_git-annex_only_workflow/comment_3_9b7d89da52f7ebb7801f9ec8545c3aba._comment new file mode 100644 index 000000000..9b6e6d7c4 --- /dev/null +++ b/doc/forum/pure_git-annex_only_workflow/comment_3_9b7d89da52f7ebb7801f9ec8545c3aba._comment @@ -0,0 +1,12 @@ +[[!comment format=mdwn + username="http://joey.kitenet.net/" + nickname="joey" + subject="comment 3" + date="2011-12-10T19:43:04Z" + content=""" +Git can actually push into a non-bare repository, so long as the branch you change there is not a checked out one. Pushing into `remotes/$foo/master` and `remotes/$foo/git-annex` would work, however determining the value that the repository expects for `$foo` is something git cannot do on its own. And of course you'd still have to `git merge remotes/$foo/master` to get the changes. + +Yes, you still keep the non-bare repos as remotes when adding a bare repository, so git-annex knows how to get to them. + +I've made `git annex sync` run the simple script above. Perhaps it can later be improved to sync all repositories. +"""]] -- cgit v1.2.3 From bfdc9f28fc6ef19f7e3ec2ea724a540af104a386 Mon Sep 17 00:00:00 2001 From: "http://schnouki.net/" Date: Sat, 10 Dec 2011 21:40:01 +0000 Subject: --- ...t___34__git-annex_get__34___with_3.20111203.mdwn | 21 +++++++++++++++++++++ 1 file changed, 21 insertions(+) create mode 100644 doc/bugs/Can__39__t___34__git-annex_get__34___with_3.20111203.mdwn diff --git a/doc/bugs/Can__39__t___34__git-annex_get__34___with_3.20111203.mdwn b/doc/bugs/Can__39__t___34__git-annex_get__34___with_3.20111203.mdwn new file mode 100644 index 000000000..2e7771386 --- /dev/null +++ b/doc/bugs/Can__39__t___34__git-annex_get__34___with_3.20111203.mdwn @@ -0,0 +1,21 @@ +Hi there, + +After updating to 3.20111203 (on Arch Linux) I noticed I was not able to use `git annex get` from a SSH remote (server running Arch Linux, same version of git-annex): "requested key is not present". Same behavior with current master (commit 6cf28585). I had no issue with the previous version (3.20111122). + +On this server, I was able to track down the issue using `git-annex-shell inannex` and `strace`: + + $ strace -f -o log git-annex-shell inannex ~/photos-annex.git WORM-s369360-m1321602916--2011-11-17.jpg + $ echo $? + 1 + $ tail -n20 log + [...] + 25623 chdir("/home/schnouki/git-annex") = 0 + 25623 stat("/home/schnouki/photos-annex.git/annex/objects/082/676/WORM-s369360-m1321602916--2011-11-17.jpg/WORM-s369360-m1321602916--2011-11-17.jpg", {st_mode=S_IFREG|0400, st_size=369360, ...}) = 0 + 25623 open("annex/objects/082/676/WORM-s369360-m1321602916--2011-11-17.jpg/WORM-s369360-m1321602916--2011-11-17.jpg", O_RDONLY) = -1 ENOENT (No such file or directory) + [...] + +Note there is a call to `stat()` with the full path to the requested file, and *then* a call to `open()` with a relative path -- which calls this call to fail, and git-annex-shell to return 1. With 3.20111122, there was no call to `stat()`, just a successful call to `open()` with a full absolute path. + +Using `git bisect` I was able to determine that this bug appeared in commit 64672c62 ("refactor"). Reverting it makes `git-annex-shell` work as expected, but I'm sure there are better ways to fix this. However I don't know enough Haskell to do it myself. + +Could you please try to fix this in a future version? -- cgit v1.2.3 From 10e8028a42fe65e8586b7424557b792cf36e215c Mon Sep 17 00:00:00 2001 From: Joey Hess Date: Sat, 10 Dec 2011 18:45:55 -0400 Subject: Fix bug in last version in getting contents from bare repositories. --- Locations.hs | 9 ++++----- debian/changelog | 1 + .../Can__39__t___34__git-annex_get__34___with_3.20111203.mdwn | 5 +++++ 3 files changed, 10 insertions(+), 5 deletions(-) diff --git a/Locations.hs b/Locations.hs index 3843495f9..080766841 100644 --- a/Locations.hs +++ b/Locations.hs @@ -80,16 +80,15 @@ gitAnnexLocation key r | Git.repoIsLocalBare r = {- Bare repositories default to hashDirLower for new - content, as it's more portable. -} - go (Git.workTree r) (annexLocations key) + check (map inrepo $ annexLocations key) | otherwise = {- Non-bare repositories only use hashDirMixed, so - don't need to do any work to check if the file is - present. -} - return $ Git.workTree r ".git" - annexLocation key hashDirMixed + return $ inrepo ".git" annexLocation key hashDirMixed where - go dir locs = fromMaybe (dir head locs) <$> check dir locs - check dir = firstM $ \f -> doesFileExist $ dir f + inrepo = () (Git.workTree r) + check locs = fromMaybe (head locs) <$> firstM doesFileExist locs {- The annex directory of a repository. -} gitAnnexDir :: Git.Repo -> FilePath diff --git a/debian/changelog b/debian/changelog index 8dbcb07dc..5c68fde2d 100644 --- a/debian/changelog +++ b/debian/changelog @@ -12,6 +12,7 @@ git-annex (3.20111204) UNRELEASED; urgency=low * sync: New command that synchronises the local repository and default remote, by running git commit, pull, and push for you. * Version monad-control dependency in cabal file. + * Fix bug in last version in getting contents from bare repositories. -- Joey Hess Sun, 04 Dec 2011 12:22:37 -0400 diff --git a/doc/bugs/Can__39__t___34__git-annex_get__34___with_3.20111203.mdwn b/doc/bugs/Can__39__t___34__git-annex_get__34___with_3.20111203.mdwn index 2e7771386..00ab9f3f0 100644 --- a/doc/bugs/Can__39__t___34__git-annex_get__34___with_3.20111203.mdwn +++ b/doc/bugs/Can__39__t___34__git-annex_get__34___with_3.20111203.mdwn @@ -19,3 +19,8 @@ Note there is a call to `stat()` with the full path to the requested file, and * Using `git bisect` I was able to determine that this bug appeared in commit 64672c62 ("refactor"). Reverting it makes `git-annex-shell` work as expected, but I'm sure there are better ways to fix this. However I don't know enough Haskell to do it myself. Could you please try to fix this in a future version? + +> Thanks for a very good bug report. +> +> I've fixed this stupid mistake introduced in the code refactoring. +> --[[Joey]] -- cgit v1.2.3 From 9ba99a544b79f9304839a5f4e756d8025e9f18f0 Mon Sep 17 00:00:00 2001 From: Joey Hess Date: Sat, 10 Dec 2011 18:51:01 -0400 Subject: update --- Remote/Bup.hs | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/Remote/Bup.hs b/Remote/Bup.hs index 589dea91d..e705bbb34 100644 --- a/Remote/Bup.hs +++ b/Remote/Bup.hs @@ -182,7 +182,7 @@ onBupRemote r a command params = do - local bup repositories to see if they are available, and getting their - uuid (which may be different from the stored uuid for the bup remote). - - - If a bup repository is not available, returns a dummy uuid of "". + - If a bup repository is not available, returns NoUUID. - This will cause checkPresent to indicate nothing from the bup remote - is known to be present. - -- cgit v1.2.3 From f44f715f51aad6cf164501b0d53b7163f3a5e758 Mon Sep 17 00:00:00 2001 From: Joey Hess Date: Sat, 10 Dec 2011 19:38:27 -0400 Subject: ensure local remote is initialized when copying to it Needed due to this scenario: Bare repo origin is made, foo is cloned from it; foo is initalized; a file is added to foo's annex; git annex move --to origin Since the git-annex branch has not yet been pushed to origin, it doesn't auto-initialize. When the content is sent to it, it's stored, but the remote has NoUUID, and so nothing is logged in the location log. Then the content is removed from the local repo, and git-annex has lost track of it. git annex fsck in origin will find the lost content, but let's not let this happen. Content should only be sent to initalized remotes. This cannot happen for non-local remotes, since git-annex-shell always checks that the repo is initialized. --- Remote/Git.hs | 1 + 1 file changed, 1 insertion(+) diff --git a/Remote/Git.hs b/Remote/Git.hs index d172ec7c0..05743a28d 100644 --- a/Remote/Git.hs +++ b/Remote/Git.hs @@ -210,6 +210,7 @@ copyToRemote r key params <- rsyncParams r -- run copy from perspective of remote liftIO $ onLocal r $ do + ensureInitialized ok <- Annex.Content.getViaTmp key $ rsyncOrCopyFile params keysrc Annex.Content.saveState -- cgit v1.2.3 From 583ba809923948eb3ce9b3f91892c7492e5db14f Mon Sep 17 00:00:00 2001 From: Joey Hess Date: Sat, 10 Dec 2011 20:53:42 -0400 Subject: better syntax --- Locations.hs | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/Locations.hs b/Locations.hs index 080766841..f7784a56d 100644 --- a/Locations.hs +++ b/Locations.hs @@ -87,7 +87,7 @@ gitAnnexLocation key r - present. -} return $ inrepo ".git" annexLocation key hashDirMixed where - inrepo = () (Git.workTree r) + inrepo d = Git.workTree r d check locs = fromMaybe (head locs) <$> firstM doesFileExist locs {- The annex directory of a repository. -} -- cgit v1.2.3 From bf6a3b757adebe92b330c4216fb32fbf1a6d64d2 Mon Sep 17 00:00:00 2001 From: Joey Hess Date: Sun, 11 Dec 2011 01:15:27 -0400 Subject: close --- doc/bugs/Can__39__t___34__git-annex_get__34___with_3.20111203.mdwn | 1 + 1 file changed, 1 insertion(+) diff --git a/doc/bugs/Can__39__t___34__git-annex_get__34___with_3.20111203.mdwn b/doc/bugs/Can__39__t___34__git-annex_get__34___with_3.20111203.mdwn index 00ab9f3f0..ea56c3732 100644 --- a/doc/bugs/Can__39__t___34__git-annex_get__34___with_3.20111203.mdwn +++ b/doc/bugs/Can__39__t___34__git-annex_get__34___with_3.20111203.mdwn @@ -23,4 +23,5 @@ Could you please try to fix this in a future version? > Thanks for a very good bug report. > > I've fixed this stupid mistake introduced in the code refactoring. +> [[done]] > --[[Joey]] -- cgit v1.2.3 From 0ba4b1de18acf9e2318da6c8e7d80ce4bb216850 Mon Sep 17 00:00:00 2001 From: Joey Hess Date: Sun, 11 Dec 2011 14:14:28 -0400 Subject: move a file location to Locations.hs --- Annex/Branch.hs | 6 +----- Locations.hs | 5 +++++ 2 files changed, 6 insertions(+), 5 deletions(-) diff --git a/Annex/Branch.hs b/Annex/Branch.hs index 37237ec66..44ee69e62 100644 --- a/Annex/Branch.hs +++ b/Annex/Branch.hs @@ -43,10 +43,6 @@ fullname = Git.Ref $ "refs/heads/" ++ show name originname :: Git.Ref originname = Git.Ref $ "origin/" ++ show name -{- A separate index file for the branch. -} -index :: Git.Repo -> FilePath -index g = gitAnnexDir g "index" - {- Populates the branch's index file with the current branch contents. - - Usually, this is only done when the index doesn't yet exist, and @@ -62,7 +58,7 @@ withIndex :: Annex a -> Annex a withIndex = withIndex' False withIndex' :: Bool -> Annex a -> Annex a withIndex' bootstrapping a = do - f <- fromRepo index + f <- fromRepo gitAnnexIndex bracketIO (Git.useIndex f) id $ do unlessM (liftIO $ doesFileExist f) $ do unless bootstrapping create diff --git a/Locations.hs b/Locations.hs index f7784a56d..fdb823284 100644 --- a/Locations.hs +++ b/Locations.hs @@ -20,6 +20,7 @@ module Locations ( gitAnnexUnusedLog, gitAnnexJournalDir, gitAnnexJournalLock, + gitAnnexIndex, isLinkToAnnex, annexHashes, hashDirMixed, @@ -131,6 +132,10 @@ gitAnnexJournalDir r = addTrailingPathSeparator $ gitAnnexDir r "journal" gitAnnexJournalLock :: Git.Repo -> FilePath gitAnnexJournalLock r = gitAnnexDir r "journal.lck" +{- .git/annex/index is used to stage changes to the git-annex branch -} +gitAnnexIndex :: Git.Repo -> FilePath +gitAnnexIndex r = gitAnnexDir r "index" + {- Checks a symlink target to see if it appears to point to annexed content. -} isLinkToAnnex :: FilePath -> Bool isLinkToAnnex s = ("/.git/" ++ objectDir) `isInfixOf` s -- cgit v1.2.3 From 59971c923029a6f10c47a526e7878130637c139e Mon Sep 17 00:00:00 2001 From: Joey Hess Date: Sun, 11 Dec 2011 14:47:44 -0400 Subject: new bug --- doc/bugs/git-annex_branch_corruption.mdwn | 89 +++++++++++++++++++++++++++++++ 1 file changed, 89 insertions(+) create mode 100644 doc/bugs/git-annex_branch_corruption.mdwn diff --git a/doc/bugs/git-annex_branch_corruption.mdwn b/doc/bugs/git-annex_branch_corruption.mdwn new file mode 100644 index 000000000..63633fb08 --- /dev/null +++ b/doc/bugs/git-annex_branch_corruption.mdwn @@ -0,0 +1,89 @@ +Below is a test case which shows a way that the git-annex branch +can become corrupted and lose data, including location log records and +uuid.log lines. + +At the end, a commit on the git-annex branch removes one of the 2 lines +from the uuid.log; which should never happen. + +The actual problem occurs earlier, at the "push point". Here a repo is +cloned from the main one, initialized (adding the last uuid.log line), +and then pushed back to the main one. That push is a fast-forward, so is +allowed to directly update the git-annex branch in the main repo: + + b884fe5..c497739 git-annex -> git-annex + +Now the git-annex branch has a change that is not reflected in +`.git/annex/index`, so the next time a change is made, it's committed +using the out of date index, which causes a reversion of the changes +that were pushed to the branch. + +--- + +## Thoughts + +This is essentially the same reason why git blocks pushes to the checked-out +branch of a non-bare repository. + +This problem only affects workflows that involve pushing. Pulling workflows +do not directly update the local git-annex branch, so avoid the problem. + +And while bare repos are pushed to, they rarely have changes made directly +to their git-annex branches, so while I think the same problem could +happen with pushing to a bare repo, it's unlikely. + +None of which is to say this is not a bad bug that needs to be comprehensively +fixed. + +Probably git-annex needs to record which ref of the git-annex branch +corresponds to its index, and if the branch is at a different ref, +merge it into the index. I am still considering how to do that atomically; +what if a push comes in while git-annex is updating its index? + +--- + +## Workaround + +Users who want to prevent this bug from occuring when pushing to their +non-bare repositories can install this script as `.git/hooks/update` + +
+#!/bin/sh
+if [ "$1" = refs/heads/git-annex ]; then
+	exit 1
+fi
+
+ +--[[Joey]] + +--- + +## Test Case +
+#!/bin/sh
+mkdir annextest
+cd annextest
+
+git init dir1
+cd dir1
+git annex init
+touch foo 
+echo hi > bar
+git annex add
+git commit -m add
+
+cd ..
+git clone dir1 dir2
+cd dir2
+git annex init otherdir
+git annex get
+# push point
+git push
+
+cd ..
+cd dir1
+echo "before"
+git show git-annex:uuid.log
+git annex drop foo --force
+echo "after"
+git show git-annex:uuid.log
+
-- cgit v1.2.3 From 8680c415dedcc1fdbbcc0b9cdd7e37e860b22130 Mon Sep 17 00:00:00 2001 From: Joey Hess Date: Sun, 11 Dec 2011 14:51:20 -0400 Subject: slow, stupid, and safe index updating Always merge the git-annex branch into .git/annex/index before making a commit from the index. This ensures that, when the branch has been changed in any way (by a push being received, or changes pulled directly into it, or even by the user checking it out, and committing a change), the index reflects those changes. This is much too slow; it needs to be optimised to only update the index when the branch has really changed, not every time. Also, there is an unhandled race, when a change is made to the branch right after the index gets updated. I left it in for now because it's unlikely and I didn't want to complicate things with additional locking yet. --- Annex/Branch.hs | 37 +++++++++++++++++++------------ debian/changelog | 5 +++++ doc/bugs/git-annex_branch_corruption.mdwn | 3 +++ doc/internals.mdwn | 14 +++--------- 4 files changed, 34 insertions(+), 25 deletions(-) diff --git a/Annex/Branch.hs b/Annex/Branch.hs index 44ee69e62..be19a9be0 100644 --- a/Annex/Branch.hs +++ b/Annex/Branch.hs @@ -45,14 +45,26 @@ originname = Git.Ref $ "origin/" ++ show name {- Populates the branch's index file with the current branch contents. - - - Usually, this is only done when the index doesn't yet exist, and - - the index is used to build up changes to be commited to the branch, - - and merge in changes from other branches. + - This is only done when the index doesn't yet exist, and the index + - is used to build up changes to be commited to the branch, and merge + - in changes from other branches. -} genIndex :: Git.Repo -> IO () genIndex g = Git.UnionMerge.stream_update_index g [Git.UnionMerge.ls_tree fullname g] +{- Merges the specified branches into the index. + - Any changes staged in the index will be preserved. -} +mergeIndex :: [Git.Ref] -> Annex () +mergeIndex branches = do + h <- catFileHandle + inRepo $ \g -> Git.UnionMerge.merge_index h g branches + +{- Updates the branch's index to reflect the current contents of the branch. + - Any changes staged in the index will be preserved. -} +updateIndex :: Annex () +updateIndex = withIndex $ mergeIndex [fullname] + {- Runs an action using the branch's index file. -} withIndex :: Annex a -> Annex a withIndex = withIndex' False @@ -66,6 +78,8 @@ withIndex' bootstrapping a = do unless bootstrapping $ inRepo genIndex a +{- Runs an action using the branch's index file, first making sure that + - the branch and index are up-to-date. -} withIndexUpdate :: Annex a -> Annex a withIndexUpdate a = update >> withIndex a @@ -106,11 +120,12 @@ create = unlessM hasBranch $ do {- Stages the journal, and commits staged changes to the branch. -} commit :: String -> Annex () commit message = whenM journalDirty $ lockJournal $ do + updateIndex stageJournalFiles withIndex $ inRepo $ Git.commit message fullname [fullname] -{- Ensures that the branch is up-to-date; should be called before data is - - read from it. Runs only once per git-annex run. +{- Ensures that the branch and index are is up-to-date; should be + - called before data is read from it. Runs only once per git-annex run. - - Before refs are merged into the index, it's important to first stage the - journal into the index. Otherwise, any changes in the journal would @@ -126,8 +141,9 @@ commit message = whenM journalDirty $ lockJournal $ do -} update :: Annex () update = onceonly $ do - -- ensure branch exists + -- ensure branch exists, and index is up-to-date create + updateIndex -- check what needs updating before taking the lock dirty <- journalDirty c <- filterM (changedBranch name . snd) =<< siblingBranches @@ -141,14 +157,7 @@ update = onceonly $ do " into " ++ show name unless (null branches) $ do showSideAction merge_desc - {- Note: This merges the branches into the index. - - Any unstaged changes in the git-annex branch - - (if it's checked out) will be removed. So, - - documentation advises users not to directly - - modify the branch. - -} - h <- catFileHandle - inRepo $ \g -> Git.UnionMerge.merge_index h g branches + mergeIndex branches ff <- if dirty then return False else tryFastForwardTo refs unless ff $ inRepo $ Git.commit merge_desc fullname (nub $ fullname:refs) diff --git a/debian/changelog b/debian/changelog index 5c68fde2d..405e98b74 100644 --- a/debian/changelog +++ b/debian/changelog @@ -13,6 +13,11 @@ git-annex (3.20111204) UNRELEASED; urgency=low remote, by running git commit, pull, and push for you. * Version monad-control dependency in cabal file. * Fix bug in last version in getting contents from bare repositories. + * Ensure that git-annex branch changes are merged into git-annex's index, + which fixes a bug that could cause changes that were pushed to the + git-annex branch to get reverted. As a side effect, it's now safe + for users to check out and commit changes directly to the git-annex + branch. -- Joey Hess Sun, 04 Dec 2011 12:22:37 -0400 diff --git a/doc/bugs/git-annex_branch_corruption.mdwn b/doc/bugs/git-annex_branch_corruption.mdwn index 63633fb08..5249e63d8 100644 --- a/doc/bugs/git-annex_branch_corruption.mdwn +++ b/doc/bugs/git-annex_branch_corruption.mdwn @@ -39,6 +39,9 @@ corresponds to its index, and if the branch is at a different ref, merge it into the index. I am still considering how to do that atomically; what if a push comes in while git-annex is updating its index? +> Now git-annex always updates the index with the git-annex branch, which +> is a slow, but safe way to avoid this problem. [[done]] --[[Joey]] + --- ## Workaround diff --git a/doc/internals.mdwn b/doc/internals.mdwn index d84b3c489..68cc7c3cd 100644 --- a/doc/internals.mdwn +++ b/doc/internals.mdwn @@ -22,17 +22,9 @@ deleting or changing the file contents. This branch is managed by git-annex, with the contents listed below. The file `.git/annex/index` is a separate git index file it uses -to accumulate changes for the git-annex. Also, `.git/annex/journal/` is used -to record changes before they are added to git. - -Note that for speed reasons, git-annex assumes only it will modify this -branch. If you go in and make changes directly, it will probably revert -your changes in its next commit to the branch. - -The best way to make changes to the git-annex branch is instead -to create a branch of it, with a name like "my/git-annex", and then -use "git annex merge" to automerge your branch into the main git-annex -branch. +to accumulate changes for the git-annex branch. +Also, `.git/annex/journal/` is used to record changes before they +are added to git. ### `uuid.log` -- cgit v1.2.3 From cfbbda99f4dd3e510e52dbb499d132300ad203e4 Mon Sep 17 00:00:00 2001 From: Joey Hess Date: Sun, 11 Dec 2011 16:11:13 -0400 Subject: optimize index updating The last branch ref that the index was updated to is stored in .git/annex/index.lck, and the index only updated when the current branch ref differs. (The .lck file should later be used for locking too.) Some more optimization is still needed, since there is some redundancy in calls to git show-ref. --- Annex/Branch.hs | 54 +++++++++++++++++++++++++++++++++++++++++++----------- Git.hs | 5 +++-- Locations.hs | 5 +++++ git-union-merge.hs | 2 +- 4 files changed, 52 insertions(+), 14 deletions(-) diff --git a/Annex/Branch.hs b/Annex/Branch.hs index be19a9be0..52e82e25c 100644 --- a/Annex/Branch.hs +++ b/Annex/Branch.hs @@ -61,9 +61,26 @@ mergeIndex branches = do inRepo $ \g -> Git.UnionMerge.merge_index h g branches {- Updates the branch's index to reflect the current contents of the branch. - - Any changes staged in the index will be preserved. -} + - Any changes staged in the index will be preserved. + - + - Compares the ref stored in the lock file with the current + - ref of the branch to see if an update is needed. + -} updateIndex :: Annex () -updateIndex = withIndex $ mergeIndex [fullname] +updateIndex = do + lock <- fromRepo gitAnnexIndexLock + lockref <- firstRef <$> liftIO (catchDefaultIO (readFileStrict lock) "") + branchref <- getRef fullname + when (lockref /= branchref) $ do + withIndex $ mergeIndex [fullname] + setIndexRef branchref + +{- Record that the branch's index has been updated to correspond to a + - given ref of the branch. -} +setIndexRef :: Git.Ref -> Annex () +setIndexRef ref = do + lock <- fromRepo gitAnnexIndexLock + liftIO $ writeFile lock $ show ref ++ "\n" {- Runs an action using the branch's index file. -} withIndex :: Annex a -> Annex a @@ -109,12 +126,13 @@ getCache file = getState >>= go {- Creates the branch, if it does not already exist. -} create :: Annex () -create = unlessM hasBranch $ do - e <- hasOrigin - if e - then inRepo $ Git.run "branch" - [Param $ show name, Param $ show originname] - else withIndex' True $ +create = unlessM hasBranch $ hasOrigin >>= go >>= setIndexRef + where + go True = do + inRepo $ Git.run "branch" + [Param $ show name, Param $ show originname] + getRef fullname + go False = withIndex' True $ inRepo $ Git.commit "branch created" fullname [] {- Stages the journal, and commits staged changes to the branch. -} @@ -122,7 +140,8 @@ commit :: String -> Annex () commit message = whenM journalDirty $ lockJournal $ do updateIndex stageJournalFiles - withIndex $ inRepo $ Git.commit message fullname [fullname] + withIndex $ + setIndexRef =<< inRepo (Git.commit message fullname [fullname]) {- Ensures that the branch and index are is up-to-date; should be - called before data is read from it. Runs only once per git-annex run. @@ -159,8 +178,9 @@ update = onceonly $ do showSideAction merge_desc mergeIndex branches ff <- if dirty then return False else tryFastForwardTo refs - unless ff $ inRepo $ - Git.commit merge_desc fullname (nub $ fullname:refs) + unless ff $ + setIndexRef =<< + inRepo (Git.commit merge_desc fullname (nub $ fullname:refs)) invalidateCache where onceonly a = unlessM (branchUpdated <$> getState) $ do @@ -253,6 +273,18 @@ siblingBranches = do gen l = (Git.Ref $ head l, Git.Ref $ last l) uref (a, _) (b, _) = a == b +{- Get the ref of a branch. -} +getRef :: Git.Ref -> Annex Git.Ref +getRef branch = firstRef . L.unpack <$> showref + where + showref = inRepo $ Git.pipeRead [Param "show-ref", + Param "--hash", -- get the hash + Param "--verify", -- only exact match + Param $ show branch] + +firstRef :: String-> Git.Ref +firstRef = Git.Ref . takeWhile (/= '\n') + {- Applies a function to modifiy the content of a file. - - Note that this does not cause the branch to be merged, it only diff --git a/Git.hs b/Git.hs index 8bc32b7cc..1da5997c1 100644 --- a/Git.hs +++ b/Git.hs @@ -463,8 +463,8 @@ shaSize :: Int shaSize = 40 {- Commits the index into the specified branch (or other ref), - - with the specified parent refs. -} -commit :: String -> Ref -> [Ref] -> Repo -> IO () + - with the specified parent refs, and returns the new ref -} +commit :: String -> Ref -> [Ref] -> Repo -> IO Ref commit message newref parentrefs repo = do tree <- getSha "write-tree" $ asString $ pipeRead [Param "write-tree"] repo @@ -473,6 +473,7 @@ commit message newref parentrefs repo = do (map Param $ ["commit-tree", show tree] ++ ps) (L.pack message) repo run "update-ref" [Param $ show newref, Param $ show sha] repo + return sha where ignorehandle a = snd <$> a asString a = L.unpack <$> a diff --git a/Locations.hs b/Locations.hs index fdb823284..85fcb9888 100644 --- a/Locations.hs +++ b/Locations.hs @@ -21,6 +21,7 @@ module Locations ( gitAnnexJournalDir, gitAnnexJournalLock, gitAnnexIndex, + gitAnnexIndexLock, isLinkToAnnex, annexHashes, hashDirMixed, @@ -136,6 +137,10 @@ gitAnnexJournalLock r = gitAnnexDir r "journal.lck" gitAnnexIndex :: Git.Repo -> FilePath gitAnnexIndex r = gitAnnexDir r "index" +{- Lock file for .git/annex/index. -} +gitAnnexIndexLock :: Git.Repo -> FilePath +gitAnnexIndexLock r = gitAnnexDir r "index.lck" + {- Checks a symlink target to see if it appears to point to annexed content. -} isLinkToAnnex :: FilePath -> Bool isLinkToAnnex s = ("/.git/" ++ objectDir) `isInfixOf` s diff --git a/git-union-merge.hs b/git-union-merge.hs index 1cec4a0f8..edd9330c8 100644 --- a/git-union-merge.hs +++ b/git-union-merge.hs @@ -42,5 +42,5 @@ main = do _ <- Git.useIndex (tmpIndex g) setup g Git.UnionMerge.merge aref bref g - Git.commit "union merge" newref [aref, bref] g + _ <- Git.commit "union merge" newref [aref, bref] g cleanup g -- cgit v1.2.3 From 0236bb020b7b055676b06d1cdf582746864f3d78 Mon Sep 17 00:00:00 2001 From: Joey Hess Date: Sun, 11 Dec 2011 16:19:54 -0400 Subject: update --- doc/bugs/git-annex_branch_corruption.mdwn | 13 ++++++++----- 1 file changed, 8 insertions(+), 5 deletions(-) diff --git a/doc/bugs/git-annex_branch_corruption.mdwn b/doc/bugs/git-annex_branch_corruption.mdwn index 5249e63d8..9c864d85f 100644 --- a/doc/bugs/git-annex_branch_corruption.mdwn +++ b/doc/bugs/git-annex_branch_corruption.mdwn @@ -36,11 +36,14 @@ fixed. Probably git-annex needs to record which ref of the git-annex branch corresponds to its index, and if the branch is at a different ref, -merge it into the index. I am still considering how to do that atomically; -what if a push comes in while git-annex is updating its index? - -> Now git-annex always updates the index with the git-annex branch, which -> is a slow, but safe way to avoid this problem. [[done]] --[[Joey]] +merge it into the index. + +> And now that's [[done]]. I managed to do it with very little slowdown. +> +> A side benefit is that users can now safely check out the git-annex +> branch and commit changes to it, and git-annex will notice them. +> Before, it was documented to ignore such changes. +> --[[Joey]] --- -- cgit v1.2.3 From 81f311103d99ec5bfd31ae5a76d6add05ff40121 Mon Sep 17 00:00:00 2001 From: Joey Hess Date: Sun, 11 Dec 2011 16:41:56 -0400 Subject: a new bug report to track a race --- doc/bugs/git-annex_branch_push_race.mdwn | 43 ++++++++++++++++++++++++++++++++ 1 file changed, 43 insertions(+) create mode 100644 doc/bugs/git-annex_branch_push_race.mdwn diff --git a/doc/bugs/git-annex_branch_push_race.mdwn b/doc/bugs/git-annex_branch_push_race.mdwn new file mode 100644 index 000000000..257c477bf --- /dev/null +++ b/doc/bugs/git-annex_branch_push_race.mdwn @@ -0,0 +1,43 @@ +The fix for the [[git-annex_branch_corruption]] bug is subject to a race. +With that fix, git-annex does this when committing a change to the branch: + +1. lock the journal file (this avoids git-annex racing itself, FWIW) +2. check what the head of the branch points to, to see if a newer branch + has appeared +3. if so, updates the index file from the branch +4. stages changes in the index +5. commits to the branch using the index file + +If a push to the branch comes in during 2-5, then +[[git-annex_branch_corruption]] could still occur. + +--- + +## approach 1, using locking + +Add an update hook and a post-update hook. The update hook +will use locking to ensure that no git-annex is currently running +a commit, and block any git-annex's from starting one. It +will background itself, and remain running during the push. +The post-update hook will signal it to exit. + +I don't like this approach much, since it involves a daemon, two hooks, +and lots of things to go wrong. And it blocks using git-annex during a +push. This approach should be a last resort. + +## approach 2, lockless method + +After a commit is made to the branch, check to see if the parent of +the commit is the same ref that the index file was last updated to. If it's +not, then the race occurred. + +How to recover from the race? Well, just union merging the parent of the +commit into the index file and re-committing should work, I think. When +the race occurs, the commit reverts its parent's changes, and this will +redo them. + +(Of course, this re-commit will also be subject to the race, and +will need the same check for the race as the other commits. It won't loop +forever, I hope.) + +--[[Joey]] -- cgit v1.2.3