Diffstat (limited to 'doc/tips')
-rw-r--r-- doc/tips/How_to_retroactively_annex_a_file_already_in_a_git_repo/comment_7_603db6818d33663b70b917c04fd8485b._comment | 30
-rw-r--r-- doc/tips/How_to_retroactively_annex_a_file_already_in_a_git_repo/comment_8_834410421ccede5194bd8fbaccea8d1a._comment | 82
-rw-r--r-- doc/tips/a_gui_for_metadata_operations.mdwn | 13
-rw-r--r-- doc/tips/a_gui_for_metadata_operations/comment_1_1ce311d8328ea370a6a3494adea0f5db._comment | 8
-rw-r--r-- doc/tips/peer_to_peer_network_with_tor.mdwn | 163
-rw-r--r-- doc/tips/using_Google_Cloud_Storage/comment_8_1b4eb7e0f44865cd5ff0f8ef507d99c1._comment | 9
6 files changed, 305 insertions, 0 deletions
diff --git a/doc/tips/How_to_retroactively_annex_a_file_already_in_a_git_repo/comment_7_603db6818d33663b70b917c04fd8485b._comment b/doc/tips/How_to_retroactively_annex_a_file_already_in_a_git_repo/comment_7_603db6818d33663b70b917c04fd8485b._comment
new file mode 100644
index 000000000..5527c2b43
--- /dev/null
+++ b/doc/tips/How_to_retroactively_annex_a_file_already_in_a_git_repo/comment_7_603db6818d33663b70b917c04fd8485b._comment
@@ -0,0 +1,30 @@
+[[!comment format=mdwn
+ username="https://launchpad.net/~stephane-gourichon-lpad"
+ nickname="stephane-gourichon-lpad"
+ avatar="http://cdn.libravatar.org/avatar/02d4a0af59175f9123720b4481d55a769ba954e20f6dd9b2792217d9fa0c6089"
+ subject=""Hmm, guyz? Are you serious with these scripts?" Well, what's the matter?"
+ date="2016-11-15T10:58:32Z"
+ content="""
+## Wow, scary
+
+Dilyin's comment is scary. It suggests bad things can happen, but is not very clear.
+
+Bloated history is one thing.
+An obviously broken repo is bad, but it can be (slowly) recovered from remotes.
+A subtly crippled history that you don't notice can be a major problem (especially once you have propagated it to all your remotes to \"recover from bloat\").
+
+## More common than it seems
+
+There's a case probably more common than people actually report: mistakenly doing `git add` instead of `git annex add` and realizing it only after a number of commits. Doing `git annex add` at that time will have the file duplicated (regular git and annex).
+
+Extra wish: when doing `git annex add` of a file that is already present in git history, `git-annex` could notice and say so.
+
+## Simple solution?
+
+Can anyone elaborate on the scripts provided here? Are they safe? What can happen if they are used improperly or in corner cases?
+
+* \"files are replaced with symlinks and are in the index\" -> so what?
+* \"Make sure that you don't have annex.largefiles settings that would prevent annexing the files.\" -> What would happen? Also `.gitattributes`.
+
+Thank you.
+"""]]
diff --git a/doc/tips/How_to_retroactively_annex_a_file_already_in_a_git_repo/comment_8_834410421ccede5194bd8fbaccea8d1a._comment b/doc/tips/How_to_retroactively_annex_a_file_already_in_a_git_repo/comment_8_834410421ccede5194bd8fbaccea8d1a._comment
new file mode 100644
index 000000000..2c36962aa
--- /dev/null
+++ b/doc/tips/How_to_retroactively_annex_a_file_already_in_a_git_repo/comment_8_834410421ccede5194bd8fbaccea8d1a._comment
@@ -0,0 +1,82 @@
+[[!comment format=mdwn
+ username="StephaneGourichon"
+ avatar="http://cdn.libravatar.org/avatar/8cea01af2c7a8bf529d0a3d918ed4abf"
+ subject="Walkthrough of a prudent retroactive annex."
+ date="2016-11-24T11:27:59Z"
+ content="""
+Been using the one-liner. Despite the warning, I'm not dead yet.
+
+There's much more to do than the one-liner.
+
+This post offers instructions.
+
+# First simple try: slow
+
+The first attempt was slow (estimated >600s for 189 commits).
+
+# In tmpfs: about 6 times faster
+
+I cloned the repository into /run/user/1000/rewrite-git, which is a tmpfs mount point. (The machine has plenty of RAM.)
+
+There I also ran `git annex init`; git-annex found its state branches.
+
+On the second try I also ran:
+
+ git checkout -t remotes/origin/synced/master
+
+So that filter-branch would clean that, too.
+
+There, the `filter-branch` operation finished in 90s on the first try and 149s on the second.
+
+`.git/objects` wasn't smaller.
+
+# Practicing reduction on clone
+
+This produced no visible benefit:
+
+ time git gc --aggressive
+ time git repack -a -d
+
+Cloning and retrying on the clone did not help either. Oh, but I should have used `git clone file:///path`, as the git-filter-branch man page says in its section titled \"CHECKLIST FOR SHRINKING A REPOSITORY\".
+
+This (as seen on https://rtyley.github.io/bfg-repo-cleaner/ ) was effective:
+
+ git reflog expire --expire=now --all && git gc --prune=now --aggressive
+
+`.git/objects` shrank from 148M to 58M.
+
+All this was on a clone of the repo in tmpfs.
+
+# Propagating cleaned up branches to origin
+
+This confirmed that filter-branch did not change the final tree:
+
+ git diff remotes/origin/master..master
+ git diff remotes/origin/synced/master synced/master
+
+This, expectedly, was refused:
+
+ git push origin master
+ git push origin synced/master
+
+On origin, I checked out the hash of the current master, then on the tmpfs clone:
+
+ git push -f origin master
+ git push -f origin synced/master
+
+Looks good.
+
+I'm not doing the aggressive shrink now, because of the \"two orders of magnitude more caution than normal filter-branch\" recommended by arand.
+
+# Now what? Check that nothing precious is broken
+
+I'm planning to do the same operation on the other repos, then:
+
+* if everything seems right,
+* if `git annex sync` works between all those fellows
+* etc,
+* then I would perform the reflog expire, gc prune on some then all of them, etc.
+
+Joey, does this seem okay? Any comment?
+
+"""]]
diff --git a/doc/tips/a_gui_for_metadata_operations.mdwn b/doc/tips/a_gui_for_metadata_operations.mdwn
new file mode 100644
index 000000000..1e1180068
--- /dev/null
+++ b/doc/tips/a_gui_for_metadata_operations.mdwn
@@ -0,0 +1,13 @@
+Hey everyone.
+
+I wrote a GUI for git-annex metadata in Python: [git-annex-metadata-gui](https://github.com/alpernebbi/git-annex-metadata-gui).
+It shows the files that are in the current branch (only those in the annex) in the respective folder hierarchy.
+The keys that are in the repository, but not in the current branch are also shown in another tab.
+You can view, edit or remove fields for individual files with support for multiple values for fields.
+There is a file preview for image and text files as well.
+I uploaded some screenshots in the repository to show it in action.
+
+While making it, I decided to move the git-annex calls into their own Python package,
+which became [git-annex-adapter](https://github.com/alpernebbi/git-annex-adapter).
+
+I hope these can be useful to someone other than myself as well.
diff --git a/doc/tips/a_gui_for_metadata_operations/comment_1_1ce311d8328ea370a6a3494adea0f5db._comment b/doc/tips/a_gui_for_metadata_operations/comment_1_1ce311d8328ea370a6a3494adea0f5db._comment
new file mode 100644
index 000000000..2a55de0be
--- /dev/null
+++ b/doc/tips/a_gui_for_metadata_operations/comment_1_1ce311d8328ea370a6a3494adea0f5db._comment
@@ -0,0 +1,8 @@
+[[!comment format=mdwn
+ username="joey"
+ subject="""comment 1"""
+ date="2016-12-07T19:58:11Z"
+ content="""
+Thank you for this, I've always wanted such a GUI, and it's been a common
+user request!
+"""]]
diff --git a/doc/tips/peer_to_peer_network_with_tor.mdwn b/doc/tips/peer_to_peer_network_with_tor.mdwn
new file mode 100644
index 000000000..0fdc34625
--- /dev/null
+++ b/doc/tips/peer_to_peer_network_with_tor.mdwn
@@ -0,0 +1,163 @@
+git-annex has recently gotten support for running as a
+[Tor](https://torproject.org/) hidden service. This is a nice, secure,
+and easy-to-use way to connect repositories in different
+locations. No account on a central server is needed; it's peer-to-peer.
+
+## dependencies
+
+To use this, you need to get Tor installed and running. See
+[their website](https://torproject.org/), or try a command like:
+
+ sudo apt-get install tor
+
+You also need to install [Magic Wormhole](https://github.com/warner/magic-wormhole).
+
+ sudo apt-get install magic-wormhole
+
+## pairing two repositories
+
+You have two git-annex repositories on different computers, and want to
+connect them together over Tor so they share their contents. Or, you and a
+friend want to connect your repositories together. Pairing is an easy way
+to accomplish this.
+
+In each git-annex repository, run these commands:
+
+ git annex enable-tor
+ git annex remotedaemon
+
+The enable-tor command may prompt for the root password, since it
+configures Tor. Now git-annex is running as a Tor hidden service, but
+it will only talk to peers after pairing with them.
+
+In both repositories, run this command:
+
+ git annex p2p --pair
+
+This will print out a pairing code, like "11-incredible-tumeric",
+and prompt for you to enter the other repository's pairing code.
+
+Once the pairing codes are exchanged, the two repositories will be securely
+connected to one another via Tor. Each will have a git remote, with a name
+like "peer1", which connects to the other repository.
+
+Then, you can run commands like `git annex sync peer1 --content` to sync
+with the paired repository.
+
+Pairing connects just two repositories, but you can repeat the process to
+pair with as many other repositories as you like, in order to build up
+larger networks of repositories.
+
+## how to exchange pairing codes
+
+When pairing with a friend's repository, you have to exchange
+pairing codes. How to do this securely?
+
+The pairing codes can only be used once, so it's ok to exchange them in
+a way that someone else can access later. However, if someone can overhear
+your exchange of codes in real time, they could trick you into pairing
+with them.
+
+Here are some suggestions for how to exchange the codes,
+with the most secure ways first:
+
+* In person.
+* In an encrypted message (gpg encrypted email, Off The Record (OTR)
+ conversation, etc).
+* By a voice phone call.
+
+## starting git-annex remotedaemon on boot
+
+Notice the `git annex remotedaemon` being run in the above examples.
+That command runs the Tor hidden service so that other peers
+can connect to your repository over Tor.
+
+So, you may want to arrange for the remotedaemon to be started on boot.
+You can do that with a simple cron job:
+
+ @reboot cd ~/myannexrepo && git annex remotedaemon
+
+If you use the git-annex assistant, and have it auto-starting on boot, it
+will take care of starting the remotedaemon for you.
+
+## speed of large transfers
+
+Tor prioritizes security over speed, and the Tor network only has so much
+bandwidth to go around. So, distributing large quantities (gigabytes)
+of data over Tor may be slow, and should probably be avoided.
+
+One way to avoid sending much data over Tor is to set up an encrypted
+[[special_remote|special_remotes]] someplace. git-annex knows that Tor is
+rather expensive to use, so if a file is available on a special remote as
+well as over Tor, it will download it from the special remote.
+
+You can contribute to the Tor network by
+[running a Tor relay or bridge](https://www.torproject.org/getinvolved/relays.html.en).
+
+## onion addresses and authentication
+
+You don't need to know about this, but it might be helpful to understand
+how it works.
+
+git-annex's Tor support uses an onion address as the address of a git remote.
+You can `git pull`, push, etc. with those onion addresses:
+
+ git pull tor-annex::eeaytkuhaupbarfi.onion:4412
+ git remote add peer1 tor-annex::eeaytkuhaupbarfi.onion:4412
+
+Onion addresses are semi-public. When you add a remote, they appear in your
+`.git/config` file. For security, there's a second level of authentication
+that git-annex uses to make sure that only people you want to can access
+your repository over Tor. That takes the form of a long string of numbers
+and letters, like "7f53c5b65b8957ef626fd461ceaae8056e3dbc459ae715e4".
+
+The addresses generated by `git annex p2p --gen-addresses`
+combine the onion address with the authentication data.
+
+When you run `git annex p2p --link`, it sets up a git remote using
+the onion address, and it stashes the authentication data away in a file in
+`.git/annex/creds/`.
+
+When you pair repositories, these addresses are exchanged using
+[Magic Wormhole](https://github.com/warner/magic-wormhole).
+
+## security
+
+Tor hidden services can be quite secure. But this doesn't mean that using
+git-annex over Tor is automatically perfectly secure. Here are some things
+to consider:
+
+* Anyone who learns the address of a peer can connect to that peer,
+ download the whole history of the git repository, and any available
+ annexed files. They can also upload new files to the peer, and even
+ remove annexed files from the peer. So consider ways that the address
+ of a peer might be exposed.
+
+* While Tor can be used to anonymize who you are, git defaults to including
+ your name and email address in every git commit. So if you want an
+ anonymous git-annex repository, you'll need to configure git not to do
+ that.
+
+* Using Tor prevents listeners from decrypting your traffic. But, they'll
+ probably still know you're using Tor. Also, by traffic analysis,
+ they may be able to guess if you're using git-annex over Tor, and even
+ make guesses about the sizes and types of files that you're exchanging
+ with peers.
+
+* There have been past attacks on the Tor network that have exposed
+ who was running Tor hidden services.
+ <https://blog.torproject.org/blog/tor-security-advisory-relay-early-traffic-confirmation-attack>
+
+* An attacker who can connect to the git-annex Tor hidden service, even
+ without authenticating, can try to perform denial of service attacks.
+
+* Magic Wormhole is pretty secure, but the code phrase could be guessed
+ (unlikely) or intercepted. An attacker gets just one chance to try to enter
+ the correct code phrase, before pairing finishes. If the attacker
+ successfully guesses/intercepts both code phrases, they can MITM the
+ pairing process.
+
+ If you don't want to use Magic Wormhole, you can instead manually generate
+ addresses with `git annex p2p --gen-addresses` and send them over an
+ authenticated, encrypted channel (such as OTR) to a friend to add with
+ `git annex p2p --link`. This may be more secure, if you get it right.
diff --git a/doc/tips/using_Google_Cloud_Storage/comment_8_1b4eb7e0f44865cd5ff0f8ef507d99c1._comment b/doc/tips/using_Google_Cloud_Storage/comment_8_1b4eb7e0f44865cd5ff0f8ef507d99c1._comment
new file mode 100644
index 000000000..1a71f7726
--- /dev/null
+++ b/doc/tips/using_Google_Cloud_Storage/comment_8_1b4eb7e0f44865cd5ff0f8ef507d99c1._comment
@@ -0,0 +1,9 @@
+[[!comment format=mdwn
+ username="scottgorlin@a32946b2aad278883c1690a0753241583a9855b9"
+ nickname="scottgorlin"
+ avatar="http://cdn.libravatar.org/avatar/2dd1fc8add62bbf4ffefac081b322563"
+ subject="Coldline"
+ date="2016-11-21T00:49:23Z"
+ content="""
+Wanted to add that \"storageclass=COLDLINE\" appears to work seamlessly, both from my Mac and my ARM NAS. As far as I can tell, this appears to be a no-brainer vs Glacier: built-in git-annex client, simpler/cheaper billing, and no 4 hour delay!
+"""]]