author | Joey Hess <joeyh@joeyh.name> | 2017-05-10 14:37:11 -0400 |
---|---|---|
committer | Joey Hess <joeyh@joeyh.name> | 2017-05-10 14:37:11 -0400 |
commit | 94f04289d6a5fa542b341819a6623bf8dc2853a0 (patch) | |
tree | 6406d345279a90a6af53449dd1dcd8c986d6aa68 | |
parent | a0c2203baa2b2cdf8240f1c39a2cbb3a6ae39041 (diff) | |
parent | a434f179c85a43a7fd4a6fcbb183f4e8cb974340 (diff) |
Merge branch 'master' of ssh://git-annex.branchable.com
5 files changed, 154 insertions, 0 deletions
diff --git a/doc/forum/Get___39__source__39___group_to_automatically_drop_with_assistant.mdwn b/doc/forum/Get___39__source__39___group_to_automatically_drop_with_assistant.mdwn
new file mode 100644
index 000000000..be8768222
--- /dev/null
+++ b/doc/forum/Get___39__source__39___group_to_automatically_drop_with_assistant.mdwn
@@ -0,0 +1,88 @@
+Hi,
+
+I'm trying to get my head around groups, wanted, etc. for a particular use case.
+
+**Problem:** I can't work out how to get a source(?) repository to automatically drop files when they hit a transfer repository.
+
+
+I have a machine (`Machine 1`) that is used for data acquisition but it is behind a strict firewall (both physical and virtual). I usually physically carry a USB drive over, set up a rsync ssh -> local-USB-drive from the one machine (`Machine 2`) that is able to connect over the network to `Machine 1`. As it is a pain to lug the drive over, I only do this rsync maybe weekly, so the rsync takes many hours (~24) to complete. Then (when I remember) I visit and I carry the USB drive back... Naturally, this slows down my work process.
+
+What I was hoping to do was set up git-annex with the assistant to help me. I am able to run the assistant, but not the webapp on `Machines 1 and 2`. :-(
+
+My thought was - as these have to be disconnected network transfers...
+
+- `Repository 1 -> Repository 2` (when space permits)
+- `Repository 2 -> Repository 3` (when space permits) `-> Repository 4` (USB drive(s))
+
+
+Another limitation is that `Repos/Machines 2 & 3` have limited storage space.
+
+
+As a test case I can set up (`Repo1 -> Repo2`) and (`Repo2 -> Repo3`) (on other machines, but the commands should be the same...)
+
+After reading a bit I made a changed [preferred content](/preferred_content/standard_groups/) for a transfer repo to:
+
+```
+not (inallgroup=client and copies=client:1) and ($client)
+```
+
+i.e. `copies` from `2` to `1`.
+
+---
+
+
+Finally...The question
+----------------------
+
+**BUT** I can't work out how to get `Repo1` (the source) to automatically drop the files when they hit `Repo2` (what I'm guessing should be a transfer repository).
+
+Can anyone suggest how to automagically do this with the assistant?
+
+
+---
+
+
+If it would help I can share the git-annex commands I've been using, but as I'm only doing testing up at the moment, I'm happy to start from scratch if there is a RTFM page out there. :-)
+
+
+I've put some details about my thoughts on the repositories and restrictions below.
+
+
+Thanks - Olaf
+
+
+
+
+Repository 1
+------------
+- Type: source (Data collection)
+- Human readable directory structure
+- Physically: Machine 1
+- Strict firewall only incoming network connections from Machine 2
+- Storage: 50Gb
+
+
+Repository 2
+------------
+- Type: transfer
+- Physically: Machine 2
+- Reasonably relaxed firewall, can talk to Repository 3
+- Limited storage: 10Gb
+
+
+Repository 3
+------------
+- Type: transfer
+- Pysically: Machine 3
+- Reasonably relaxed firewall, can talk to Repository 2
+- Limited storage: 10Gb
+- Connected to USB drive(s)
+
+
+Repository 4, 5, ...
+--------------------
+- Type: ? Client ?
+- Human readable directory structure
+- Physically: USB drive
+- Usually (but not always) connected to machine 3
+- Large storage (2Tb) + Additional drives
diff --git a/doc/forum/Lots_of_4k_symlinks/comment_4_be12d26936b3502445e880be997b8877._comment b/doc/forum/Lots_of_4k_symlinks/comment_4_be12d26936b3502445e880be997b8877._comment
new file mode 100644
index 000000000..8737c0c1b
--- /dev/null
+++ b/doc/forum/Lots_of_4k_symlinks/comment_4_be12d26936b3502445e880be997b8877._comment
@@ -0,0 +1,14 @@
+[[!comment format=mdwn
+ username="CandyAngel"
+ avatar="http://cdn.libravatar.org/avatar/15c0aade8bec5bf004f939dd73cf9ed8"
+ subject="comment 4"
+ date="2017-05-10T09:21:34Z"
+ content="""
+> And I doubt CandyAngel was counting only the sizes of symlinks and not git repos or at least directory inodes to hold all the symlinks.)
+
+In that repository, it is only top level directories (no sub directories) and each directory in it only has symlinks (up to 8000 of them). Directories are **mkdir $(uuidgen -r)**, hence the wildcard for du.
+
+It would be including the directory size to hold all the inodes, but it definitely *isn't counting .git* as this annex spans 3 drives with 6TB of content so far. Well, 6 drives because of \"numcopies 2\" :P
+
+I will calculate this a different way and only count symlinks, when I have access to it again.
+"""]]
diff --git a/doc/forum/Lots_of_4k_symlinks/comment_5_ab3884748f3271a77ba3320f25d74414._comment b/doc/forum/Lots_of_4k_symlinks/comment_5_ab3884748f3271a77ba3320f25d74414._comment
new file mode 100644
index 000000000..c1f428975
--- /dev/null
+++ b/doc/forum/Lots_of_4k_symlinks/comment_5_ab3884748f3271a77ba3320f25d74414._comment
@@ -0,0 +1,25 @@
+[[!comment format=mdwn
+ username="CandyAngel"
+ avatar="http://cdn.libravatar.org/avatar/15c0aade8bec5bf004f939dd73cf9ed8"
+ subject="comment 5"
+ date="2017-05-10T12:44:08Z"
+ content="""
+    $ find -name .git -prune -o -type l | wc -l
+    1034886
+
+Just over a million symlinks.. very convenient :)
+
+    $ find -name .git -prune -o -type l -printf '%s\n' | awk '{sum+=$1} END {print sum/1024**3}'
+    195.9 # 195MB actual size
+    $ find -name .git -prune -o -type l -print0 | du -ch --files0-from=- | tail -n1
+    4.0G total # 4GB disk usage
+
+And in comparison to my earlier comment 2 weeks ago:
+
+    $ du -shc *-* | tail -n3
+    33M fd79bbd4-d41e-4ea8-acc8-86437c5eed7c
+    33M ffbd042e-f6d9-4450-9a57-8ed1086f587c
+    4.1G total
+
+So directory inode sizes are dwarfed by the 4K disk usage but ~198b actual usage of the symlinks (~96% wasted space?).
+"""]]
diff --git a/doc/forum/Lots_of_4k_symlinks/comment_6_fa72056ee788ac0f2c15bb57d9876cf6._comment b/doc/forum/Lots_of_4k_symlinks/comment_6_fa72056ee788ac0f2c15bb57d9876cf6._comment
new file mode 100644
index 000000000..6e4c89631
--- /dev/null
+++ b/doc/forum/Lots_of_4k_symlinks/comment_6_fa72056ee788ac0f2c15bb57d9876cf6._comment
@@ -0,0 +1,16 @@
+[[!comment format=mdwn
+ username="CandyAngel"
+ avatar="http://cdn.libravatar.org/avatar/15c0aade8bec5bf004f939dd73cf9ed8"
+ subject="comment 6"
+ date="2017-05-10T12:45:59Z"
+ content="""
+Oops,
+
+    find -name .git -prune -o -type l -printf '%s\n' | awk '{sum+=$1} END {print sum/1024**3}'
+
+should have been
+
+    find -name .git -prune -o -type l -printf '%s\n' | awk '{sum+=$1} END {print sum/1024**2}'
+
+That'll teach me to prematurely copy it :P
+"""]]
diff --git a/doc/forum/sneakernet_with_a___34__directory__34___special_remote_on_fat/comment_3_02eae41ba9ad2f2fb15cbd20069bf1da._comment b/doc/forum/sneakernet_with_a___34__directory__34___special_remote_on_fat/comment_3_02eae41ba9ad2f2fb15cbd20069bf1da._comment
new file mode 100644
index 000000000..f9803584e
--- /dev/null
+++ b/doc/forum/sneakernet_with_a___34__directory__34___special_remote_on_fat/comment_3_02eae41ba9ad2f2fb15cbd20069bf1da._comment
@@ -0,0 +1,11 @@
+[[!comment format=mdwn
+ username="https://launchpad.net/~barthelemy"
+ nickname="barthelemy"
+ avatar="http://cdn.libravatar.org/avatar/e99cb15f6029de3225721b3ebdd0233905eb69698e9b229a8c4cc510a4135438"
+ subject="comment 3"
+ date="2017-05-09T23:38:27Z"
+ content="""
+Hi Joel,
+thank you for the precision (and for git annex, and for all the rest!)
+Cheers
+"""]]
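
The symlink measurement in comment 5 of the "Lots of 4k symlinks" thread (symlink count, summed target lengths, and on-disk usage, all skipping `.git`) can also be approximated in one pass. The following is only a rough Python sketch and not part of the commit: the function name `symlink_usage` is hypothetical, it assumes a POSIX system where `st_blocks` is reported in 512-byte units, and the allocated figure depends on the filesystem (some store short symlink targets inside the inode and report zero blocks).

```python
import os


def symlink_usage(root):
    """Return (count, apparent_bytes, allocated_bytes) for symlinks under root.

    Entries named .git are skipped, roughly like `find -name .git -prune`.
    """
    count = apparent = allocated = 0
    for dirpath, dirnames, filenames in os.walk(root):
        # Do not descend into .git directories.
        dirnames[:] = [d for d in dirnames if d != ".git"]
        # Symlinks to directories are listed in dirnames, all others in filenames.
        for name in dirnames + filenames:
            if name == ".git":
                continue
            path = os.path.join(dirpath, name)
            if os.path.islink(path):
                st = os.lstat(path)  # stat the link itself, not its target
                count += 1
                apparent += st.st_size          # length of the link target
                allocated += st.st_blocks * 512  # space the filesystem reserved
    return count, apparent, allocated
```

Calling `symlink_usage(".")` at the top of an annex would give the three figures that the `wc -l`, `awk`, and `du` pipelines report separately; dividing the apparent bytes by `1024**2` gives megabytes, as in the correction in comment 6.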