Diffstat (limited to 'doc')
16 files changed, 276 insertions, 0 deletions
diff --git a/doc/backends/comment_7_4aa8cfaec1090f79fed530720e4ddad4._comment b/doc/backends/comment_7_4aa8cfaec1090f79fed530720e4ddad4._comment new file mode 100644 index 000000000..3028547fe --- /dev/null +++ b/doc/backends/comment_7_4aa8cfaec1090f79fed530720e4ddad4._comment @@ -0,0 +1,18 @@ +[[!comment format=mdwn + username="https://www.google.com/accounts/o8/id?id=AItOawmraN_ldJplGunVGmnjjLN6jL9s9TrVMGE" + nickname="Ævar Arnfjörð" + subject="Can annex use existing backends when amending existing files?" + date="2014-08-05T21:35:34Z" + content=""" +Related to the question posed in http://git-annex.branchable.com/forum/switching_backends/ can git annex be told to use the existing backend for a given file? + +The use case for this is that you have an existing repo that started out e.g. with SHA256, but new files are being added with SHA256E since that's the default now. + +But I was doing: + + git annex edit . + rsync /some/old/copy/ . + git annex add . + +And was expecting it to show no changes for existing files, but it did, it would be nice if that was not the case. +"""]] diff --git a/doc/bugs/S3_upload_not_using_multipart/comment_6_906abafc53070d8e4f33df486d2241ea._comment b/doc/bugs/S3_upload_not_using_multipart/comment_6_906abafc53070d8e4f33df486d2241ea._comment new file mode 100644 index 000000000..ad9d4b601 --- /dev/null +++ b/doc/bugs/S3_upload_not_using_multipart/comment_6_906abafc53070d8e4f33df486d2241ea._comment @@ -0,0 +1,12 @@ +[[!comment format=mdwn + username="http://svario.it/gioele" + nickname="gioele" + subject="Multipart S3 support files > 5 GB" + date="2014-08-04T06:00:45Z" + content=""" +The [multipart guide](http://docs.aws.amazon.com/AmazonS3/latest/dev/UploadingObjects.html) says that the limit is 5 TB per file. + +> **Upload objects in parts—Using the Multipart upload API you can upload large objects, up to 5 TB.** + +> The Multipart Upload API is designed to improve the upload experience for larger objects. 
You can upload objects in parts. These object parts can be uploaded independently, in any order, and in parallel. You can use a Multipart Upload for objects from 5 MB to 5 TB in size. For more information, see Uploading Objects Using Multipart Upload. For more information, see Uploading Objects Using Multipart Upload API. +"""]] diff --git a/doc/bugs/file_in_manual_mode_repository_is_dropped_when_it_is_copied_to_another_manual_mode_repository.mdwn b/doc/bugs/file_in_manual_mode_repository_is_dropped_when_it_is_copied_to_another_manual_mode_repository.mdwn new file mode 100644 index 000000000..454c9f038 --- /dev/null +++ b/doc/bugs/file_in_manual_mode_repository_is_dropped_when_it_is_copied_to_another_manual_mode_repository.mdwn @@ -0,0 +1,35 @@ +Hi, + +### Please describe the problem. +I have a distant repository in backup mode acting as origin and 2 computers at home containing that repository in manual mode. Everything is in indirect mode and watched by git annex assistant. +If I get a file on one local computer from origin, everything act as expected, I have a copy on that file on origin and on the local computer. +But, if I then want to copy the file over to the other local computer by issuing a git annex get on the second computer with a --from option specifying the first local computer, the file is transferred but when the transfer is completed the file is dropped on the first local computer. + +If I try and retrieve the file from origin, I don't have that problem and the file is kept on every repository I issued the git annex get command. + +A bigger problem is that if, still on the second local computer, I start transferring a file from the first local computer and I interrupt the rsync process, the file is still dropped by the first local computer although it is not yet on the second. +It is still on the origin so no data is lost but I don't think this behavior is really intended. 
+ +The version of git-annex I use is as follows: + +On the two local computers (from the ArchLinux aur): +git-annex version: 5.20140716-g8c14ba8 +build flags: Assistant Webapp Webapp-secure Pairing Testsuite S3 WebDAV Inotify DBus DesktopNotify XMPP DNS Feeds Quvi TDFA CryptoHash +key/value backends: SHA256E SHA1E SHA512E SHA224E SHA384E SKEIN256E SKEIN512E SHA256 SHA1 SHA512 SHA224 SHA384 SKEIN256 SKEIN512 WORM URL +remote types: git gcrypt S3 bup directory rsync web webdav tahoe glacier ddar hook external +local repository version: 5 +supported repository version: 5 +upgrade supported from repository versions: 0 1 2 4 + +On the server (debian stable, standalone git-annex archive): +git-annex version: 5.20140716-g8c14ba8 +build flags: Assistant Webapp Webapp-secure Pairing Testsuite S3 WebDAV Inotify DBus DesktopNotify XMPP DNS Feeds Quvi TDFA CryptoHash +key/value backends: SHA256E SHA1E SHA512E SHA224E SHA384E SKEIN256E SKEIN512E SHA256 SHA1 SHA512 SHA224 SHA384 SKEIN256 SKEIN512 WORM URL +remote types: git gcrypt S3 bup directory rsync web webdav tahoe glacier ddar hook external +local repository version: 5 +supported repository version: 5 +upgrade supported from repository versions: 0 1 2 4 + +It should be the same version everywhere. + +Please ask me if you need more information. diff --git a/doc/bugs/git_annex_repair_fails_-___47__tmp__47__tmprepo.1__47__.git__47__gc.pid:_removeLink:_does_not_exist___40__No_such_file_or_directory__41__.mdwn b/doc/bugs/git_annex_repair_fails_-___47__tmp__47__tmprepo.1__47__.git__47__gc.pid:_removeLink:_does_not_exist___40__No_such_file_or_directory__41__.mdwn new file mode 100644 index 000000000..5407db36a --- /dev/null +++ b/doc/bugs/git_annex_repair_fails_-___47__tmp__47__tmprepo.1__47__.git__47__gc.pid:_removeLink:_does_not_exist___40__No_such_file_or_directory__41__.mdwn @@ -0,0 +1,38 @@ +### Please describe the problem. 
+git annex repair fails to repair the repo and reports this error: + + git-annex: /tmp/tmprepo.2/.git/gc.pid: removeLink: does not exist (No such file or directory) + failed + git-annex: repair: 1 failed + + +### What steps will reproduce the problem? + + +### What version of git-annex are you using? On what operating system? +git-annex-5.20140717 on gentoo with use flags: assistant cryptohash dbus desktop-notify dns feed inotify pairing production quvi s3 tahoe tdfa testsuite webapp webapp-secure webdav xmpp + +### Please provide any additional information below. + +[[!format sh """ +~/annex $ git annex repair +Running git fsck ... +Unpacking all pack files. +Unpacking objects: 100% (630/630), done. +Unpacking objects: 100% (630/630), done. +Unpacking objects: 100% (638/638), done. +Initialized empty Git repository in /tmp/tmprepo.2/.git/ +Trying to recover missing objects from remote 192.168.1.246_annex. +Unpacking all pack files. +Unpacking objects: 100% (210608/210608), done. +Trying to recover missing objects from remote 192.168.1.246_annex. +Auto packing the repository in background for optimum performance. +See "git help gc" for manual housekeeping. + +git-annex: /tmp/tmprepo.2/.git/gc.pid: removeLink: does not exist (No such file or directory) +failed +git-annex: repair: 1 failed + + +# End of transcript or log. +"""]] diff --git a/doc/devblog/day_210__conversion_and_digression.mdwn b/doc/devblog/day_210__conversion_and_digression.mdwn new file mode 100644 index 000000000..2629eb917 --- /dev/null +++ b/doc/devblog/day_210__conversion_and_digression.mdwn @@ -0,0 +1,14 @@ +Just finished converting both rsync and gcrypt to the new API, +and testing them. Still need to fix 2 test suite failures for gcrypt. +Otherwise, only WebDAV remains unconverted. + +Earlier today, I investigated switching from hS3 to +<http://hackage.haskell.org/package/aws>. Learned its API, which seemed a +lot easier to comprehend than the other two times I looked at it. 
Wrote +some test programs, which are in the `s3-aws` branch. I was able to stream +in large files to S3, without ever buffering them in memory (which hS3's +API precludes). And for chunking, it can reuse an http connection. +This seems very promising. (Also, it might eventually get Glacier support..) + +I have uploaded haskell-aws to Debian, and once it gets into testing and +backports, I plan to switch git-annex over to it. diff --git a/doc/devblog/day_211__conversion_complete.mdwn b/doc/devblog/day_211__conversion_complete.mdwn new file mode 100644 index 000000000..5e172fae1 --- /dev/null +++ b/doc/devblog/day_211__conversion_complete.mdwn @@ -0,0 +1,12 @@ +Converted the webdav special remote to the new API. +All done with converting everything now! + +I also updated the new API to support doing things like +reusing the same http connection when removing and checking +the presence of chunks. + +I've been working on improving the haskell DAV library, in a +number of ways that will let me improve the webdav special remote. +Including making changes that will let me do connection caching, and +improving its API to support streaming content without buffering a whole +file in memory. diff --git a/doc/devblog/day_212__webdav_rewrite.mdwn b/doc/devblog/day_212__webdav_rewrite.mdwn new file mode 100644 index 000000000..714800c9a --- /dev/null +++ b/doc/devblog/day_212__webdav_rewrite.mdwn @@ -0,0 +1,18 @@ +Today was spent reworking so much of the webdav special remote that it was +essentially rewritten from scratch. + +The main improvement is that it now keeps a http connection open and uses +it to perform multiple actions. Before, one connection was made per action. +This is even done for operations on chunks. So, now storing a chunked file +in webdav makes only 2 http connections total. Before, it would take around +10 connections *per chunk*. 
So a big win for performance, although there is +still room for improvement: It would be possible to reduce that down to +just 1 connection, and indeed keep a persistent connection reused when +acting on multiple files. + +Finished up by making uploading a large (non-chunked) file to webdav not +buffer the whole file in memory. + +I still need to make downloading a file from webdav not buffer it, and +test, and then I'll be done with webdav and can move on to making +similar changes to S3. diff --git a/doc/devblog/day_213__newchunks_merged.mdwn b/doc/devblog/day_213__newchunks_merged.mdwn new file mode 100644 index 000000000..42ea7a655 --- /dev/null +++ b/doc/devblog/day_213__newchunks_merged.mdwn @@ -0,0 +1,15 @@ +Finished up webdav, and after running `testremote` for a long time, I'm +satisfied it's good. The newchunks branch has now been merged into master +completely. + +Spent the rest of the day beginning to rework the S3 special remote to use +the aws library. This was pretty fiddly; I want to keep all the +configuration exactly the same, so had to do a lot of mapping from hS3 +configuration to aws configuration. Also there is some hairy stuff +involving escaping from the ResourceT monad with responses and http +connection managers intact. + +Stopped once `initremote` worked. The rest should be pretty easy, although +Internet Archive support is blocked by +<https://github.com/aristidb/aws/issues/119>. This is in the `s3-aws` +branch until it gets usable. 
diff --git a/doc/forum/Duplicate_entries_in_location_tracking_logs/comment_1_3afb76397519b8ca8b55958a344f1871._comment b/doc/forum/Duplicate_entries_in_location_tracking_logs/comment_1_3afb76397519b8ca8b55958a344f1871._comment new file mode 100644 index 000000000..cc159bd30 --- /dev/null +++ b/doc/forum/Duplicate_entries_in_location_tracking_logs/comment_1_3afb76397519b8ca8b55958a344f1871._comment @@ -0,0 +1,10 @@ +[[!comment format=mdwn + username="http://joeyh.name/" + ip="209.250.56.112" + subject="comment 1" + date="2014-08-03T18:59:58Z" + content=""" +This is perfectly normal. The next time that file in the git-annex branch is updated for any reason, git-annex will automatically compress the two entries down to a single one. In the meantime, it has no difficulty working out which entry is more recent. This is basically why it's called a log file. ;) + +It would be possible to make the union merge code compress as it merges, but this would slow down union merging some, and make it a more conceptually complicated operation. Also, whether the old entry is present in the file or not, git will be storing a copy of that old entry, so it doesn't actually tend to make the git repository any larger. For more on this, see <https://joeyh.name/blog/entry/databranches/> +"""]] diff --git a/doc/forum/Duplicate_entries_in_location_tracking_logs/comment_2_6f327444772ee1e660a12e7442162df5._comment b/doc/forum/Duplicate_entries_in_location_tracking_logs/comment_2_6f327444772ee1e660a12e7442162df5._comment new file mode 100644 index 000000000..6cce29757 --- /dev/null +++ b/doc/forum/Duplicate_entries_in_location_tracking_logs/comment_2_6f327444772ee1e660a12e7442162df5._comment @@ -0,0 +1,10 @@ +[[!comment format=mdwn + username="zardoz" + ip="134.147.14.84" + subject="comment 2" + date="2014-08-04T08:05:37Z" + content=""" +Thanks for the info, Joey! As long as the git tracks the history anyway, this should not increase space consumption that much. 
+ +Perhaps it would be useful to have something like «git annex gc» that can clean up these things manually in some situations, e. g. to compact everything before doing a «git annex forget». +"""]] diff --git a/doc/forum/How_to_get_detailed_information_on_special_remotes__63__.mdwn b/doc/forum/How_to_get_detailed_information_on_special_remotes__63__.mdwn new file mode 100644 index 000000000..415b7570e --- /dev/null +++ b/doc/forum/How_to_get_detailed_information_on_special_remotes__63__.mdwn @@ -0,0 +1 @@ +How can I get information on already-configured special remotes in a repository? I'd like to find out what type a given special remote is, and which URL/bucket, encryption mode, etc. was used to set it up. diff --git a/doc/forum/drop__47__whereis_not_showing_gcrypted_special_ssh_remote.mdwn b/doc/forum/drop__47__whereis_not_showing_gcrypted_special_ssh_remote.mdwn new file mode 100644 index 000000000..9445539b4 --- /dev/null +++ b/doc/forum/drop__47__whereis_not_showing_gcrypted_special_ssh_remote.mdwn @@ -0,0 +1,11 @@ +I have my laptop, my server and my usb drive. My server is a gcrypted remote via ssh. My laptop is just a repo that's referenced as a remote with a filepath. + +I do git annex copy --to server, and let it copy stuff. I repeat the same thing for the usb drive. + +I run git annex sync and it does the whole sync dance successfully. it pushes stuff to both the usb drive and the server. + +Afterwards, I do git annex whereis, and I only get 2 copies showing - my laptop and my usb drive. Likewise, since I set numcopies to 2, it won't let me drop anything at all, because it doesn't know there's a copy on my server. + +Anything I can do about this? (What further info do you need?) + +I should probably add that my server is a ubuntu machine, and so it runs version 5.20140412ubuntu1. My laptop runs a more recent 5.20140717-g5a7d4ff. 
diff --git a/doc/forum/drop__47__whereis_not_showing_gcrypted_special_ssh_remote/comment_1_968bc2be595008790e9b93d82342714c._comment b/doc/forum/drop__47__whereis_not_showing_gcrypted_special_ssh_remote/comment_1_968bc2be595008790e9b93d82342714c._comment new file mode 100644 index 000000000..45699f443 --- /dev/null +++ b/doc/forum/drop__47__whereis_not_showing_gcrypted_special_ssh_remote/comment_1_968bc2be595008790e9b93d82342714c._comment @@ -0,0 +1,10 @@ +[[!comment format=mdwn + username="annexuser123" + ip="88.65.52.206" + subject="comment 1" + date="2014-08-04T18:00:00Z" + content=""" +To add further to this, the gcrypted remote does not show up in either the usb drive's git annex info, nor the laptop's git annex info. + +Despite that, I can push to it and copy to it. +"""]] diff --git a/doc/tips/Bup_repositories_in_git-annex.mdwn b/doc/tips/Bup_repositories_in_git-annex.mdwn new file mode 100644 index 000000000..542dc2f58 --- /dev/null +++ b/doc/tips/Bup_repositories_in_git-annex.mdwn @@ -0,0 +1,51 @@ +I'd like to share my setup for keeping [bup](https://github.com/bup/bup/) repositories in git-annex.¹ +I'm not sure if this is a *good* tip, so comments are welcome. + +The purpose of this setup is to (kind of) bring encryption to bup, +and make it easy to keep bup backups in untrusted storage by making use of the encryption modes and backends provided by git-annex. +This approach can be used to make encrypted *backups of bup repositories*; +it can not replace encrypted filesystems such as EncFS or S3QL +which wouldn't necessarily require local bup repositories but also can't be combined with storage like Amazon Glacier. + +To add a bup repository to git-annex, initialize a regular indirect git-annex repository, +and make the bup repository a subdirectory of it.² +Then `git annex add $BUP_REPO/objects/packs`, i.e. the location of the large data files (.pack & .par2). 
+The rest of the bup repository should be tracked by Git (`git add $BUP_REPO`).³ +This way the repository stays fully functional. + +After a bup-save the following steps will synchronize all remotes:⁴ + + git annex add $BUP_REPO/objects/pack + git add $BUP_REPO + git commit -m "Backup on $(date)" + git annex sync --content + +In my current setup, the git-annex repositories are located on a local file server. +Various clients use bup to create backups on the server. +This server also makes backups of other servers. +Afterwards, it uploads the annexed data to Glacier +(via an [encrypted S3 special remote](/special_remotes/S3/)), +and pushes the small Git repositories to an S3QL filesystem and another off-site server. +Using these repositories (and my GPG key) the bup repositories could be recovered. + +It may be important to note that in order to be able to *access* a bup repository, +*all* files have to be available locally. +Bup will not function if any pack files are missing (maybe this can be improved?). + +----- + +¹) Not to be confused with git-annex's [bup special remote](/special_remotes/bup/). + +²) You can't initialize git-annex repositories directly inside bup repositories +because git-annex will (rightfully) identify them as bare git repositories and set itself up accordingly. + +³) I've come up with these .gitignore rules to exclude potentially large files not needed for recovery: + + /bup_repo/bupindex* + /bup_repo/objects/pack/bup.bloom + /bup_repo/objects/pack/midx*midx + /bup_repo/objects/tmp*.pack + /bup_repo/index-cache/ + +⁴) `git annex sync` might not be the safest command to use because it would merge changes from the remotes. +However, assuming normal bup usage, external changes to the bup repository are not to be expected. 
diff --git a/doc/tips/flickrannex/comment_15_2dd75800e4db58761fcbbd1954a36f1f._comment b/doc/tips/flickrannex/comment_15_2dd75800e4db58761fcbbd1954a36f1f._comment new file mode 100644 index 000000000..67f7d3b74 --- /dev/null +++ b/doc/tips/flickrannex/comment_15_2dd75800e4db58761fcbbd1954a36f1f._comment @@ -0,0 +1,11 @@ +[[!comment format=mdwn + username="https://www.google.com/accounts/o8/id?id=AItOawmvzzyDA8uXFz8yokeCrepbh8PwWe_WrjE" + nickname="Michael" + subject="current status?" + date="2014-08-08T16:00:18Z" + content=""" +Hi, just wondering what the current status of this plugin is. +The repo at TobiasTheViking looks a little odd - it has a few large commits from January with an unknown author, and the last one appears to completely remove the main script. + +What's going on? does it need a fork? +"""]] diff --git a/doc/tips/flickrannex/comment_16_2f65093ec9f6d67d2cfe5b5fae201123._comment b/doc/tips/flickrannex/comment_16_2f65093ec9f6d67d2cfe5b5fae201123._comment new file mode 100644 index 000000000..f45aad147 --- /dev/null +++ b/doc/tips/flickrannex/comment_16_2f65093ec9f6d67d2cfe5b5fae201123._comment @@ -0,0 +1,10 @@ +[[!comment format=mdwn + username="https://www.google.com/accounts/o8/id?id=AItOawmvzzyDA8uXFz8yokeCrepbh8PwWe_WrjE" + nickname="Michael" + subject="a more recent fork" + date="2014-08-08T16:03:24Z" + content=""" +Just a note that I poked around on github and saw a more recent cleaned up version in this fork: https://github.com/magthe/flickrannex/tree/devo + + +"""]]