| Commit message (Collapse) | Author | Age |
| |
|
|
|
|
| |
This reverts commit c0caa37187e9c062825dd6d5cb6be2dfa63bc7dd.
|
|
|
|
|
|
| |
Thought was that this would be faster than a map, since a vector can be
updated more efficiently. It turns out to not seem to matter; runtime and
memory usage are basically identical.
|
|
|
|
| |
Actually fixed 2 leaks, the tuple leak may have been older.
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The control socket path passed to ssh needs to be 17 characters shorter
than the maximum unix domain socket length, because ssh appends stuff to it
to make a temporary filename. Closes: #725512
Also, take the shorter of the relative and the absolute paths to the
socket. Typically the relative path will be a lot shorter (unless
deep inside a subdirectory of the repository), and so using it will
avoid flirting with the maximum safe socket lenghts in more situations,
and so lead to less breakage if all my attempts at fixing this are
still buggy.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Extends the index.lock handling to other git lock files. I surveyed
all lock files used by git, and found more than I expected. All are
handled the same in git; it leaves them open while doing the operation,
possibly writing the new file content to the lock file, and then closes
them when done.
The gc.pid file is excluded because it won't affect the normal operation
of the assistant, and waiting for a gc to finish on startup wouldn't be
good.
All threads except the webapp thread wait on the new startup sanity checker
thread to complete, so they won't try to do things with git that fail
due to stale lock files. The webapp thread mostly avoids doing that kind of
thing itself. A few configurators might fail on lock files, but only if the
user is explicitly trying to run them. The webapp needs to start
immediately when the user has opened it, even if there are stale lock
files.
Arranging for the threads to wait on the startup sanity checker was a bit
of a bear. Have to get all the NotificationHandles set up before the
startup sanity checker runs, or they won't see its signal. Perhaps
the NotificationBroadcaster is not the best interface to have used for
this. Oh well, it works.
This commit was sponsored by Michael Jakl
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
FAT has a lot of characters it does not allow in filenames, like ? and *
It's probably the worst offender, but other filesystems also have
limitiations.
In 2011, I made keyFile escape : to handle FAT, but missed the other
characters. It also turns out that when I did that, I was also living
dangerously; any existing keys that contained a : had their object
location change. Oops.
So, adding new characters to escape to keyFile is out. Well, it would be
possible to make keyFile behave differently on a per-filesystem basis, but
this would be a real nightmare to get right. Consider that a rsync special
remote uses keyFile to determine the filenames to use, and we don't know
the underlying filesystem on the rsync server..
Instead, I have gone for a solution that is backwards compatable and
simple. Its only downside is that already generated URL and WORM keys
might not be able to be stored on FAT or some other filesystem that
dislikes a character used in the key. (In this case, the user can just
migrate the problem keys to a checksumming backend. If this became a big
problem, fsck could be made to detect these and suggest a migration.)
Going forward, new keys that are created will escape all characters that
are likely to cause problems. And if some filesystem comes along that's
even worse than FAT (seems unlikely, but here it is 2013, and people are
still using FAT!), additional characters can be added to the set that are
escaped without difficulty.
(Also, made WORM limit the part of the filename that is embedded in the key,
to deal with filesystem filename length limits. This could have already
been a problem, but is more likely now, since the escaping of the filename
can make it longer.)
This commit was sponsored by Ian Downes
|
|
|
|
| |
Use sanitizeFilePath rather than rolling our own sanitizer.
|
| |
|
| |
|
|
|
|
| |
it so it does not interfere with the automatic commits of changed files.
|
|
|
|
| |
.git/annex/index.lock files, which would prevent git from committing to the git-annex branch, eg after a crash.
|
| |
|
| |
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
SHA3 is still waiting for final standardization.
Although this is looking less likely given
https://www.cdt.org/blogs/joseph-lorenzo-hall/2409-nist-sha-3
In the meantime, cryptohash implements skein, and it's used by some of the
haskell ecosystem (for yesod sessions, IIRC), so this implementation is
likely to continue working. Also, I've talked with the cryprohash author
and he's a reasonable guy.
It makes sense to have an alternate high security hash, in case some
horrible attack is found against SHA2 tomorrow, or in case SHA3 comes out
and worst fears are realized.
I'd also like to support using skein for HMAC. But no hurry there and
a new version of cryptohash has much nicer HMAC code, so I will probably
wait until I can use that version.
|
| |
|
|
|
|
|
|
| |
gcrypt needs to be able to fast-forward the master branch. If a git
repository is set up with git init --shared --bare, it gets that set, and
pushing to it will then fail, even when it's up-to-date.
|
|
|
|
| |
cannot be read.
|
|
|
|
| |
Similar to how a similar problem with indirect was earlier fixed.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
sync to or from it
This happened because the transferrer process did not know about the new
remote. remoteFromUUID crashed, which crashed the transferrer. When it was
restarted, the new one knew about the new remote so all further files would
transfer, but the one file would temporarily not be, until transfers retried.
Fixed by making remoteFromUUID not crash, and try reloading the remote list
if it does not know about a remote.
Note that this means that remoteFromUUID does not only return Nothing anymore
when the UUID is the UUID of the local repository. So had to change some code
that dependend on that assumption.
|
|
|
|
|
|
|
|
|
| |
Overridable with --user-agent option.
Not yet done for S3 or WebDAV due to limitations of libraries used --
nether allows a user-agent header to be specified.
This commit sponsored by Michael Zehrer.
|
|
|
|
|
|
| |
Does not yet support re-enabling such a repository though.
This commit was sponsored by Jan Pieper.
|
|
|
|
|
|
|
| |
adding content that gets deduplicated.
Note that this turned out to remove a syscall, not add any expense.
Otherwise, I would not have done it.
|
|
|
|
| |
user running the conversion.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
scan. This prevents repeated retries to download files that are not available, or are not referenced by the current git tree.
This is motivated by a user report that the assistant was repeatedly
retrying transfers of files that had been deleted (in direct mode, so
removing the only copy).
Note that the glacier code retries failed transfers after a while to retry
downloads that have aged long enough to be available. This is ok; if we're
doing a full transfer scan we'll retry on every file that is still in the
git tree.
Also note that this makes the assistant less likely to get every file
referenced by old revs of the git tree. Not something the assistant tries
to ensure anyway, so I feel this is acceptable.
|
|
|
|
|
|
| |
* Note that the layout of gcrypt repositories has changed, and
if you created one you must manually upgrade it.
See http://git-annex.branchable.com/upgrades/gcrypt/
|
| |
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is a massive win on OSX, which doesn't have a sha256sum normally.
Only use external hash commands when the file is > 1 mb,
since cryptohash is quite close to them in speed.
SHA is still used to calculate HMACs. I don't quite understand
cryptohash's API for those.
Used the following benchmark to arrive at the 1 mb number.
1 mb file:
benchmarking sha256/internal
mean: 13.86696 ms, lb 13.83010 ms, ub 13.93453 ms, ci 0.950
std dev: 249.3235 us, lb 162.0448 us, ub 458.1744 us, ci 0.950
found 5 outliers among 100 samples (5.0%)
4 (4.0%) high mild
1 (1.0%) high severe
variance introduced by outliers: 10.415%
variance is moderately inflated by outliers
benchmarking sha256/external
mean: 14.20670 ms, lb 14.17237 ms, ub 14.27004 ms, ci 0.950
std dev: 230.5448 us, lb 150.7310 us, ub 427.6068 us, ci 0.950
found 3 outliers among 100 samples (3.0%)
2 (2.0%) high mild
1 (1.0%) high severe
2 mb file:
benchmarking sha256/internal
mean: 26.44270 ms, lb 26.23701 ms, ub 26.63414 ms, ci 0.950
std dev: 1.012303 ms, lb 925.8921 us, ub 1.122267 ms, ci 0.950
variance introduced by outliers: 35.540%
variance is moderately inflated by outliers
benchmarking sha256/external
mean: 26.84521 ms, lb 26.77644 ms, ub 26.91433 ms, ci 0.950
std dev: 347.7867 us, lb 210.6283 us, ub 571.3351 us, ci 0.950
found 6 outliers among 100 samples (6.0%)
import Crypto.Hash
import Data.ByteString.Lazy as L
import Criterion.Main
import Common
testfile :: FilePath
testfile = "/run/shm/data" -- on ram disk
main = defaultMain
[ bgroup "sha256"
[ bench "internal" $ whnfIO internal
, bench "external" $ whnfIO external
]
]
sha256 :: L.ByteString -> Digest SHA256
sha256 = hashlazy
internal :: IO String
internal = show . sha256 <$> L.readFile testfile
external :: IO String
external = do
s <- readProcess "sha256sum" [testfile]
return $ fst $ separate (== ' ') s
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Done using a mode witness, which ensures it's fixed everywhere.
Fixing catFileKey was a bear, because git cat-file does not provide a
nice way to query for the mode of a file and there is no other efficient
way to do it. Oh, for libgit2..
Note that I am looking at tree objects from HEAD, rather than the index.
Because I cat-file cannot show a tree object for the index.
So this fix is technically incomplete. The only cases where it matters
are:
1. A new large file has been directly staged in git, but not committed.
2. A file that was committed to HEAD as a symlink has been staged
directly in the index.
This could be fixed a lot better using libgit2.
|
|
|
|
| |
from git, which can be so large it runs out of memory.
|
| |
|
|
|
|
|
|
|
|
| |
Now can tell if a repo uses gcrypt or not, and whether it's decryptable
with the current gpg keys.
This closes the hole that undecryptable gcrypt repos could have before been
combined into the repo in encrypted mode.
|
| |
|
|
|
|
|
|
| |
Otherwise gcrypt will fail to pull, since it requires this to be the case.
This needs a patched gcrypt, which is in my forked version.
|
|
|
|
|
| |
No support yet for generating new gpg keys.
No support yet for adding existing encrypted repos from removable drives.
|
|
|
|
| |
Looking up the location log for every key is not the fastest operation..
|
|
|
|
| |
numcopies levels.
|
| |
|
| |
|
|
|
|
| |
merging files into the tree on Windows.
|
|
|
|
| |
represented as standin symlinks on crippled filesystems.
|
|
|
|
|
|
|
|
|
|
|
| |
To support this, a core.gcrypt-id is stored by git-annex inside the git
config of a local gcrypt repository, when setting it up.
That is compared with the remote's cached gcrypt-id. When different, a
drive has been changed. git-annex then looks up the remote config for
the uuid mapped from the core.gcrypt-id, and tweaks the configuration
appropriately. When there is no known config for the uuid, it will refuse to
use the remote.
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
files. (Thanks, anarcat for display code and mastensg for inspiration.)
Note that it would be possible to extend the display to show all
repositories. But there can be a lot of repositories that are not set up as
remotes, and it would significantly clutter the display to show them all.
Since we're not showing all repositories, it's not worth trying to show
numcopies count either.
I decided to embrace these limitations and call the command remotes.
|
| |
|
| |
|