author Joey Hess <joey@kitenet.net> 2012-07-05 16:27:09 -0600
committer Joey Hess <joey@kitenet.net> 2012-07-05 16:27:09 -0600
commit 880b55f27705ef86c1b5525a8777382af27befd3
tree 315deaffe075d9cd4250ec2256ef30c3d030bc35
parent 9eaba58dd9706fde7e0fb84364a16576db63a7e0
parent 5a753a7b8a46c7326b6431dcf5a6eb755534e80d
Merge branch 'master' into assistant
-rw-r--r-- Backend/SHA.hs 17
-rw-r--r-- debian/changelog 2
-rw-r--r-- doc/design/assistant/blog/day_25__transfer_queueing.mdwn 41
-rw-r--r-- doc/design/assistant/syncing.mdwn 34
-rw-r--r-- doc/forum/Problems_using_submodules_with_git-annex__63__/comment_1_c7a927736d419d3c31c912001ff16ee4._comment 7
5 files changed, 63 insertions, 38 deletions
diff --git a/Backend/SHA.hs b/Backend/SHA.hs
index 7abbf8035..95ce4a770 100644
--- a/Backend/SHA.hs
+++ b/Backend/SHA.hs
@@ -97,16 +97,17 @@ keyValueE :: SHASize -> KeySource -> Annex (Maybe Key)
keyValueE size source = keyValue size source >>= maybe (return Nothing) addE
where
addE k = return $ Just $ k
- { keyName = keyName k ++ extension
+ { keyName = keyName k ++ selectExtension (keyFilename source)
, keyBackendName = shaNameE size
}
- naiveextension = takeExtension $ keyFilename source
- extension
- -- long or newline containing extensions are
- -- probably not really an extension
- | length naiveextension > 6 ||
- '\n' `elem` naiveextension = ""
- | otherwise = naiveextension
+
+selectExtension :: FilePath -> String
+selectExtension = join "." . reverse . take 2 . takeWhile shortenough .
+ reverse . split "." . takeExtensions
+ where
+ shortenough e
+ | '\n' `elem` e = False -- newline in extension?!
+ | otherwise = length e <= 4 -- long enough for "jpeg"
{- A key's checksum is checked during fsck. -}
checkKeyChecksum :: SHASize -> Key -> FilePath -> Annex Bool
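Editorial sketch, not part of the commit: the extension-selection idea in the hunk above can be tried outside git-annex with only `base` and `filepath`. The names `selectExtensionSketch` and `splitOn` are hypothetical; `splitOn` stands in for the `split` helper the patch uses. Dropping empty components and prepending a dot to each kept component is a choice made in this sketch, not a claim about what the patched function returns.

```haskell
import System.FilePath (takeExtensions)

-- Hypothetical helper: split a string on one delimiter character.
-- It stands in for the `split` function used in the patch.
splitOn :: Char -> String -> [String]
splitOn d s = case break (== d) s of
    (chunk, [])       -> [chunk]
    (chunk, _ : rest) -> chunk : splitOn d rest

-- Sketch of the idea: keep at most the last two extension components,
-- stopping at the first (rightmost) component that is too long or
-- contains a newline.
selectExtensionSketch :: FilePath -> String
selectExtensionSketch f
    | null kept = ""
    | otherwise = concatMap ('.' :) kept  -- assumption: always emit a leading dot
  where
    -- takeExtensions "foo.tar.gz" is ".tar.gz"; drop the empty leading piece.
    parts = filter (not . null) (splitOn '.' (takeExtensions f))
    kept = reverse . take 2 . takeWhile shortEnough . reverse $ parts
    -- Same limits as the patch: no newline, at most 4 characters ("jpeg").
    shortEnough e = '\n' `notElem` e && length e <= 4
```

In GHCi, `selectExtensionSketch "foo.tar.gz"` evaluates to `".tar.gz"` and `selectExtensionSketch "photo.jpeg"` to `".jpeg"`, while `"notes.backup"` and `"README"` both give `""`.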
diff --git a/debian/changelog b/debian/changelog
index 1c44f5952..5eaf9d52e 100644
--- a/debian/changelog
+++ b/debian/changelog
@@ -9,6 +9,8 @@ git-annex (3.20120630) UNRELEASED; urgency=low
but avoids portability problems.
* Use SHA library for files less than 50 kb in size, at which point it's
faster than forking the more optimised external program.
+ * SHAnE backends are now smarter about composite extensions, such as
+ .tar.gz Closes: #680450
-- Joey Hess <joeyh@debian.org> Sun, 01 Jul 2012 15:04:37 -0400
diff --git a/doc/design/assistant/blog/day_25__transfer_queueing.mdwn b/doc/design/assistant/blog/day_25__transfer_queueing.mdwn
new file mode 100644
index 000000000..35922c0d1
--- /dev/null
+++ b/doc/design/assistant/blog/day_25__transfer_queueing.mdwn
@@ -0,0 +1,41 @@
+So as not to bury the lead, I've been hard at work on my first day in
+Nicaragua, and ** the git-annex assistant fully syncs files (including
+their contents) between remotes now !! **
+
+Details follow..
+
+Made the committer thread queue Upload Transfers when new files
+are added to the annex. Currently it tries to transfer the new content
+to *every* remote; this inefficiency needs to be addressed later.
+
+Made the watcher thread queue Download Transfers when new symlinks
+appear that point to content we don't have. Typically, that will happen
+after an automatic merge from a remote. This needs to be improved as it
+currently adds Transfers from every remote, not just those that have the
+content.
+
+This was the second place that needed an ordered list of remotes
+to talk to. So I cached such a list in the DaemonStatus state info.
+This will also be handy later on, when the webapp is used to add new
+remotes, so the assistant can know about them immediately.
+
+Added YAT (Yet Another Thread), number 15 or so, the transferrer thread
+that waits for transfers to be queued and runs them. Currently a naive
+implementation, it runs one transfer at a time, and does not do anything
+to recover when a transfer fails.
+
+Actually transferring content requires YAT, so that the transfer
+action can run in a copy of the Annex monad, without blocking
+all the assistant's other threads from entering that monad while a transfer
+is running. This is also necessary to allow multiple concurrent transfers
+to run in the future.
+
+This is a very tricky piece of code, because that thread will modify the
+git-annex branch, and its parent thread has to invalidate its cache in
+order to see any changes the child thread made. Hopefully that's the extent
+of the complication of doing this. The only reason this was possible at all
+is that git-annex already supports multiple concurrent processes running
+and all making independent changes to the git-annex branch, etc.
+
+After all my groundwork this week, file content transferring is now
+fully working!
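Editorial sketch, not part of the commit: the queueing described in the blog post above (a producer enqueues one transfer per remote, and a single transferrer thread runs them one at a time) can be illustrated with `Chan` and `MVar` from `base`. Every name here (`Transfer`, `queueForAllRemotes`, `transferrer`, `runDemo`) is hypothetical and does not reflect the assistant's real types. The sketch runs in plain `IO`, so it does not model the copy of the Annex monad or the git-annex branch cache invalidation that the post mentions.

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.Chan (Chan, newChan, readChan, writeChan)
import Control.Concurrent.MVar
    (modifyMVar_, newEmptyMVar, newMVar, putMVar, readMVar, takeMVar)

-- Hypothetical stand-ins for the assistant's transfer types.
data Direction = Upload | Download deriving (Eq, Show)

data Transfer = Transfer
    { direction  :: Direction
    , remoteName :: String
    , keyFile    :: FilePath
    } deriving (Eq, Show)

-- Naive producer: queue the same file for every known remote,
-- mirroring the "tries every remote" behaviour the post describes.
queueForAllRemotes :: Chan (Maybe Transfer) -> Direction -> [String] -> FilePath -> IO ()
queueForAllRemotes q d remotes f =
    mapM_ (\r -> writeChan q (Just (Transfer d r f))) remotes

-- Naive transferrer: block until a transfer is queued, run it, repeat.
-- One transfer at a time, no failure recovery; Nothing stops the loop.
transferrer :: Chan (Maybe Transfer) -> (Transfer -> IO ()) -> IO ()
transferrer q run = loop
  where
    loop = readChan q >>= maybe (pure ()) (\t -> run t >> loop)

-- Demo: the "transfer" just records what it was asked to do.
runDemo :: IO [Transfer]
runDemo = do
    q <- newChan
    done <- newEmptyMVar
    logVar <- newMVar []
    _ <- forkIO $ do
        transferrer q (\t -> modifyMVar_ logVar (pure . (t :)))
        putMVar done ()
    queueForAllRemotes q Upload ["origin", "backup"] "photo.jpeg"
    writeChan q Nothing   -- tell the transferrer to stop
    takeMVar done         -- wait for it to drain the queue
    reverse <$> readMVar logVar
```

Because the channel is first-in, first-out and only one thread consumes it, `runDemo` returns the two uploads in the order they were queued: first to `origin`, then to `backup`.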
diff --git a/doc/design/assistant/syncing.mdwn b/doc/design/assistant/syncing.mdwn
index 76e2e1832..343a0e4aa 100644
--- a/doc/design/assistant/syncing.mdwn
+++ b/doc/design/assistant/syncing.mdwn
@@ -21,8 +21,11 @@ all the other git clones, at both the git level and the key/value level.
Watcher. **done**
* Write basic Transfer handling thread. Multiple such threads need to be
able to be run at once. Each will need its own independant copy of the
- Annex state monad.
+ Annex state monad. **done**
* Write transfer control thread, which decides when to launch transfers.
+ **done**
+* Check that download transfer triggering code works (when a symlink appears
+ and the remote does *not* upload to us).
* At startup, and possibly periodically, look for files we have that
location tracking indicates remotes do not, and enqueue Uploads for
them. Also, enqueue Downloads for any files we're missing.
@@ -86,35 +89,6 @@ reachable remote. This is worth doing first, since it's the simplest way to
get the basic functionality of the assistant to work. And we'll need this
anyway.
-### transfer tracking
-
-Transfer threads started/stopped as necessary to move data.
-(May sometimes want multiple threads downloading, or uploading, or even both.)
-
- startTransfer :: TransferQueue -> Transfer -> Annex ()
- startTransfer q transfer = error "TODO"
-
- stopTransfer :: TransferQueue -> TransferID -> Annex ()
- stopTransfer q transfer = error "TODO"
-
-The assistant needs to find out when `git-annex-shell` is receiving or
-sending (triggered by another remote), so it can add data for those too.
-This is important to avoid uploading content to a remote that is already
-downloading it from us, or vice versa, as well as to in future let the web
-app manage transfers as user desires.
-
-For files being received, it can see the temp file, but other than lsof
-there's no good way to find the pid (and I'd rather not kill blindly).
-
-For files being sent, there's no filesystem indication. So git-annex-shell
-(and other git-annex transfer processes) should write a status file to disk.
-
-Can use file locking on these status files to claim upload/download rights,
-which will avoid races.
-
-This status file can also be updated periodically to show amount of transfer
-complete (necessary for tracking uploads).
-
## other considerations
This assumes the network is connected. It's often not, so the
diff --git a/doc/forum/Problems_using_submodules_with_git-annex__63__/comment_1_c7a927736d419d3c31c912001ff16ee4._comment b/doc/forum/Problems_using_submodules_with_git-annex__63__/comment_1_c7a927736d419d3c31c912001ff16ee4._comment
new file mode 100644
index 000000000..3c2f5addb
--- /dev/null
+++ b/doc/forum/Problems_using_submodules_with_git-annex__63__/comment_1_c7a927736d419d3c31c912001ff16ee4._comment
@@ -0,0 +1,7 @@
+[[!comment format=mdwn
+ username="http://joeyh.name/"
+ subject="comment 1"
+ date="2012-07-05T17:04:34Z"
+ content="""
+I haven't tried it either, but I think it should work ok, as long as you bear in mind that to git-annex, each submodule will be treated as a separate git repository.
+"""]]