author http://christian.amsuess.com/chrysn <chrysn@web> 2011-01-19 17:46:08 +0000
committer admin <admin@branchable.com> 2011-01-19 17:46:08 +0000
commit 27325f212bfdf915d16eadfa9fc51b416d4177c0 (patch)
tree 7bd3a254e99c4b14151996f6e1a34bcfe2244c10
parent 1881b4d7fcb50d563d0e6890fc39dde0663d87a3 (diff)
no attachments w/o admin, copy/pasting instead
-rw-r--r-- doc/forum/migration_to_git-annex_and_rsync.mdwn | 22
1 files changed, 21 insertions, 1 deletions
diff --git a/doc/forum/migration_to_git-annex_and_rsync.mdwn b/doc/forum/migration_to_git-annex_and_rsync.mdwn
index 51e03de0d..d99dab872 100644
--- a/doc/forum/migration_to_git-annex_and_rsync.mdwn
+++ b/doc/forum/migration_to_git-annex_and_rsync.mdwn
@@ -1,4 +1,4 @@
-When migrating large file repositories to git-annex that are backuped in a way that uses an rsync-style mechanism (e.g. [dirvish](http://www.dirvish.org/)) and thus keeps incremental backups small by using hardlinks, space can be saved by manually reflecting the migration on the backup. So, instead of making a last pre-git-annex backup, migrating, and duplicating all backupped data with the next backup, I used the attached [[migrate.py]], and it saved me roughly a day of backuping.
+When migrating large file repositories to git-annex that are backuped in a way that uses an rsync-style mechanism (e.g. [dirvish](http://www.dirvish.org/)) and thus keeps incremental backups small by using hardlinks, space can be saved by manually reflecting the migration on the backup. So, instead of making a last pre-git-annex backup, migrating, and duplicating all backupped data with the next backup, I used the <del>attached</del> migrate.py file below, and it saved me roughly a day of backuping.
A note on terminology: "migrating" here means migrating from not using git-annex at all to using it, not to the ``git annex migrate`` command, for which a similar but different solution may be created.
@@ -11,3 +11,23 @@ First, have an up-to-date backup; then, git annex init / add etc as described in
Then copy the resulting migrate.sh to the equivalent location inside your backups and run it there. It will move all files that are now symlinked on the master to their new positions according to the symlinks (inside .git/annex/objects), but not create the symlinks (you will do a backup later anyway).
After that, do a backup as usual. As rsync sees the moved files at their new locations, it will accept them and not duplicate the data.
+
+**migrate.py**:
+
+    #!/usr/bin/env python
+
+    import os
+    from pipes import quote
+
+    print "#!/bin/sh"
+    print "set -e"
+    print ""
+
+    for (dirpath, dirnames, filenames) in os.walk("."):
+        for f in filenames:
+            fn = os.path.join(dirpath, f)
+            if os.path.islink(fn):
+                link = os.path.normpath(os.path.join(dirpath, os.readlink(fn)))
+                assert link.startswith(".git/annex/objects/")
+                print "mkdir -p %s"%quote(os.path.dirname(link))
+                print "mv %s %s"%(quote(fn), quote(link))
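
The committed script targets Python 2 (`print` statements, `pipes.quote`). For readers on Python 3, the same idea can be sketched roughly as follows. This is not part of the commit: the function name `migration_commands`, the `root` parameter, the `ValueError` in place of the original `assert`, and the `--` separators in the emitted commands are assumptions made for illustration; `shlex.quote` stands in for `pipes.quote`.

```python
#!/usr/bin/env python3
import os
import shlex

# Assumption: annexed files are symlinks resolving under this prefix,
# mirroring the assert in the committed script (POSIX paths assumed).
ANNEX_PREFIX = ".git/annex/objects/"


def migration_commands(root="."):
    """Yield shell commands that move annexed files to their object paths.

    All paths are relative to `root`, so the generated script can be run
    from the equivalent directory inside a backup.
    """
    for dirpath, _dirnames, filenames in os.walk(root):
        rel_dir = os.path.relpath(dirpath, root)
        for name in filenames:
            full_path = os.path.join(dirpath, name)
            if not os.path.islink(full_path):
                continue  # regular files are not annexed; leave them alone
            source = os.path.join(rel_dir, name)
            # Resolve the (relative) link target against the link's directory.
            target = os.path.normpath(os.path.join(rel_dir, os.readlink(full_path)))
            if not target.startswith(ANNEX_PREFIX):
                raise ValueError("symlink outside the annex: %s" % source)
            yield "mkdir -p -- %s" % shlex.quote(os.path.dirname(target))
            yield "mv -- %s %s" % (shlex.quote(source), shlex.quote(target))


if __name__ == "__main__":
    # Same script header as the committed migrate.py.
    print("#!/bin/sh")
    print("set -e")
    print()
    for command in migration_commands("."):
        print(command)
```

Run from the top of the annexed repository, something like `python3 migrate.py > migrate.sh` would produce the script that the post says to copy into the backup; the redirect is an assumption about how the original was invoked.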