author Joey Hess <joeyh@joeyh.name> 2017-05-10 14:37:11 -0400
committer Joey Hess <joeyh@joeyh.name> 2017-05-10 14:37:11 -0400
commit 94f04289d6a5fa542b341819a6623bf8dc2853a0
tree 6406d345279a90a6af53449dd1dcd8c986d6aa68
parent a0c2203baa2b2cdf8240f1c39a2cbb3a6ae39041
parent a434f179c85a43a7fd4a6fcbb183f4e8cb974340
Merge branch 'master' of ssh://git-annex.branchable.com
-rw-r--r-- doc/forum/Get___39__source__39___group_to_automatically_drop_with_assistant.mdwn 88
-rw-r--r-- doc/forum/Lots_of_4k_symlinks/comment_4_be12d26936b3502445e880be997b8877._comment 14
-rw-r--r-- doc/forum/Lots_of_4k_symlinks/comment_5_ab3884748f3271a77ba3320f25d74414._comment 25
-rw-r--r-- doc/forum/Lots_of_4k_symlinks/comment_6_fa72056ee788ac0f2c15bb57d9876cf6._comment 16
-rw-r--r-- doc/forum/sneakernet_with_a___34__directory__34___special_remote_on_fat/comment_3_02eae41ba9ad2f2fb15cbd20069bf1da._comment 11
5 files changed, 154 insertions, 0 deletions
diff --git a/doc/forum/Get___39__source__39___group_to_automatically_drop_with_assistant.mdwn b/doc/forum/Get___39__source__39___group_to_automatically_drop_with_assistant.mdwn
new file mode 100644
index 000000000..be8768222
--- /dev/null
+++ b/doc/forum/Get___39__source__39___group_to_automatically_drop_with_assistant.mdwn
@@ -0,0 +1,88 @@
+Hi,
+
+I'm trying to get my head around groups, wanted, etc. for a particular use case.
+
+**Problem:** I can't work out how to get a source(?) repository to automatically drop files when they hit a transfer repository.
+
+
+I have a machine (`Machine 1`) that is used for data acquisition, but it is behind a strict firewall (both physical and virtual). I usually carry a USB drive over and set up an rsync (ssh -> local USB drive) on the one machine (`Machine 2`) that is able to connect over the network to `Machine 1`. As it is a pain to lug the drive over, I only do this maybe weekly, so the rsync takes many hours (~24) to complete. Then (when I remember) I visit again and carry the USB drive back... Naturally, this slows down my work process.
+
+What I was hoping to do was set up git-annex with the assistant to help me. I am able to run the assistant, but not the webapp on `Machines 1 and 2`. :-(
+
+My thought was - as these have to be disconnected network transfers...
+
+- `Repository 1 -> Repository 2` (when space permits)
+- `Repository 2 -> Repository 3` (when space permits) `-> Repository 4` (USB drive(s))
+
+
+Another limitation is that `Repos/Machines 2 & 3` have limited storage space.
+
+
+As a test case I can set up (`Repo1 -> Repo2`) and (`Repo2 -> Repo3`) (on other machines, but the commands should be the same...)
+
+After reading a bit I changed the [preferred content](/preferred_content/standard_groups/) for a transfer repo to:
+
+```
+not (inallgroup=client and copies=client:1) and ($client)
+```
+
+i.e. I changed `copies` from `2` to `1`.
+
+---
+
+
+Finally...The question
+----------------------
+
+**BUT** I can't work out how to get `Repo1` (the source) to automatically drop the files when they hit `Repo2` (what I'm guessing should be a transfer repository).
+
+Can anyone suggest how to automagically do this with the assistant?
+
+
+---
+
+
+If it would help I can share the git-annex commands I've been using, but as I'm only testing at the moment, I'm happy to start from scratch if there is an RTFM page out there. :-)
+
+
+I've put some details about my thoughts on the repositories and restrictions below.
+
+
+Thanks - Olaf
+
+
+
+
+Repository 1
+------------
+- Type: source (Data collection)
+- Human readable directory structure
+- Physically: Machine 1
+- Strict firewall: only incoming network connections from Machine 2
+- Storage: 50GB
+
+
+Repository 2
+------------
+- Type: transfer
+- Physically: Machine 2
+- Reasonably relaxed firewall, can talk to Repository 3
+- Limited storage: 10GB
+
+
+Repository 3
+------------
+- Type: transfer
+- Physically: Machine 3
+- Reasonably relaxed firewall, can talk to Repository 2
+- Limited storage: 10GB
+- Connected to USB drive(s)
+
+
+Repository 4, 5, ...
+--------------------
+- Type: ? Client ?
+- Human readable directory structure
+- Physically: USB drive
+- Usually (but not always) connected to machine 3
+- Large storage (2TB) + Additional drives
diff --git a/doc/forum/Lots_of_4k_symlinks/comment_4_be12d26936b3502445e880be997b8877._comment b/doc/forum/Lots_of_4k_symlinks/comment_4_be12d26936b3502445e880be997b8877._comment
new file mode 100644
index 000000000..8737c0c1b
--- /dev/null
+++ b/doc/forum/Lots_of_4k_symlinks/comment_4_be12d26936b3502445e880be997b8877._comment
@@ -0,0 +1,14 @@
+[[!comment format=mdwn
+ username="CandyAngel"
+ avatar="http://cdn.libravatar.org/avatar/15c0aade8bec5bf004f939dd73cf9ed8"
+ subject="comment 4"
+ date="2017-05-10T09:21:34Z"
+ content="""
+> And I doubt CandyAngel was counting only the sizes of symlinks and not git repos or at least directory inodes to hold all the symlinks.)
+
+In that repository, it is only top level directories (no sub directories) and each directory in it only has symlinks (up to 8000 of them). Directories are **mkdir $(uuidgen -r)**, hence the wildcard for du.
+
+It would be including the directory size to hold all the inodes, but it definitely *isn't counting .git* as this annex spans 3 drives with 6TB of content so far. Well, 6 drives because of \"numcopies 2\" :P
+
+I will calculate this a different way and only count symlinks, when I have access to it again.
+"""]]
diff --git a/doc/forum/Lots_of_4k_symlinks/comment_5_ab3884748f3271a77ba3320f25d74414._comment b/doc/forum/Lots_of_4k_symlinks/comment_5_ab3884748f3271a77ba3320f25d74414._comment
new file mode 100644
index 000000000..c1f428975
--- /dev/null
+++ b/doc/forum/Lots_of_4k_symlinks/comment_5_ab3884748f3271a77ba3320f25d74414._comment
@@ -0,0 +1,25 @@
+[[!comment format=mdwn
+ username="CandyAngel"
+ avatar="http://cdn.libravatar.org/avatar/15c0aade8bec5bf004f939dd73cf9ed8"
+ subject="comment 5"
+ date="2017-05-10T12:44:08Z"
+ content="""
+ $ find -name .git -prune -o -type l | wc -l
+ 1034886
+
+Just over a million symlinks.. very convenient :)
+
+ $ find -name .git -prune -o -type l -printf '%s\n' | awk '{sum+=$1} END {print sum/1024**3}'
+ 195.9 # 195MB actual size
+ $ find -name .git -prune -o -type l -print0 | du -ch --files0-from=- | tail -n1
+ 4.0G total # 4GB disk usage
+
+And in comparison to my earlier comment 2 weeks ago:
+
+ $ du -shc *-* | tail -n3
+ 33M fd79bbd4-d41e-4ea8-acc8-86437c5eed7c
+ 33M ffbd042e-f6d9-4450-9a57-8ed1086f587c
+ 4.1G total
+
+So directory inode sizes are dwarfed by the symlinks themselves, which each take 4K on disk but hold only ~198 bytes of actual data (~95% wasted space?).
+"""]]
diff --git a/doc/forum/Lots_of_4k_symlinks/comment_6_fa72056ee788ac0f2c15bb57d9876cf6._comment b/doc/forum/Lots_of_4k_symlinks/comment_6_fa72056ee788ac0f2c15bb57d9876cf6._comment
new file mode 100644
index 000000000..6e4c89631
--- /dev/null
+++ b/doc/forum/Lots_of_4k_symlinks/comment_6_fa72056ee788ac0f2c15bb57d9876cf6._comment
@@ -0,0 +1,16 @@
+[[!comment format=mdwn
+ username="CandyAngel"
+ avatar="http://cdn.libravatar.org/avatar/15c0aade8bec5bf004f939dd73cf9ed8"
+ subject="comment 6"
+ date="2017-05-10T12:45:59Z"
+ content="""
+Oops,
+
+ find -name .git -prune -o -type l -printf '%s\n' | awk '{sum+=$1} END {print sum/1024**3}'
+
+should have been
+
+ find -name .git -prune -o -type l -printf '%s\n' | awk '{sum+=$1} END {print sum/1024**2}'
+
+That'll teach me to prematurely copy it :P
+"""]]
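
The figures quoted in comment 5 can be cross-checked with a little arithmetic. The sketch below is not part of the commit. It assumes every symlink occupies exactly one 4096-byte filesystem block (an assumption, though consistent with the "4K disk usage" reported there), and the function name `symlink_overhead` is hypothetical. It takes the symlink count, the apparent size in MiB (using the divisor corrected in comment 6), and the block size.

```shell
#!/bin/sh
# symlink_overhead COUNT APPARENT_MIB BLOCK_BYTES
# Hypothetical helper: estimates how much of each allocated block a symlink
# actually uses, assuming one block per symlink.
symlink_overhead() {
    awk -v n="$1" -v mib="$2" -v blk="$3" 'BEGIN {
        avg    = mib * 1024 * 1024 / n   # average symlink target length, bytes
        alloc  = n * blk / 1024 ^ 3      # space allocated at one block each, GiB
        unused = 100 * (1 - avg / blk)   # share of each block left unused, percent
        printf "average bytes per symlink: %.0f\n", avg
        printf "allocated GiB: %.2f\n", alloc
        printf "unused share: %.1f%%\n", unused
    }'
}

# Figures from comment 5: 1034886 symlinks, 195.9 MiB apparent size.
symlink_overhead 1034886 195.9 4096
```

With those inputs the average comes out at about 198 bytes per symlink, the allocation at just under 4 GiB, and the unused share at roughly 95%, which is in line with the `du` totals reported in comment 5.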
diff --git a/doc/forum/sneakernet_with_a___34__directory__34___special_remote_on_fat/comment_3_02eae41ba9ad2f2fb15cbd20069bf1da._comment b/doc/forum/sneakernet_with_a___34__directory__34___special_remote_on_fat/comment_3_02eae41ba9ad2f2fb15cbd20069bf1da._comment
new file mode 100644
index 000000000..f9803584e
--- /dev/null
+++ b/doc/forum/sneakernet_with_a___34__directory__34___special_remote_on_fat/comment_3_02eae41ba9ad2f2fb15cbd20069bf1da._comment
@@ -0,0 +1,11 @@
+[[!comment format=mdwn
+ username="https://launchpad.net/~barthelemy"
+ nickname="barthelemy"
+ avatar="http://cdn.libravatar.org/avatar/e99cb15f6029de3225721b3ebdd0233905eb69698e9b229a8c4cc510a4135438"
+ subject="comment 3"
+ date="2017-05-09T23:38:27Z"
+ content="""
+Hi Joey,
+thank you for the clarification (and for git annex, and for all the rest!)
+Cheers
+"""]]