summaryrefslogtreecommitdiff
path: root/doc/tips/How_to_retroactively_annex_a_file_already_in_a_git_repo
diff options
context:
space:
mode:
Diffstat (limited to 'doc/tips/How_to_retroactively_annex_a_file_already_in_a_git_repo')
-rw-r--r--doc/tips/How_to_retroactively_annex_a_file_already_in_a_git_repo/comment_7_603db6818d33663b70b917c04fd8485b._comment30
-rw-r--r--doc/tips/How_to_retroactively_annex_a_file_already_in_a_git_repo/comment_8_834410421ccede5194bd8fbaccea8d1a._comment82
2 files changed, 112 insertions, 0 deletions
diff --git a/doc/tips/How_to_retroactively_annex_a_file_already_in_a_git_repo/comment_7_603db6818d33663b70b917c04fd8485b._comment b/doc/tips/How_to_retroactively_annex_a_file_already_in_a_git_repo/comment_7_603db6818d33663b70b917c04fd8485b._comment
new file mode 100644
index 000000000..5527c2b43
--- /dev/null
+++ b/doc/tips/How_to_retroactively_annex_a_file_already_in_a_git_repo/comment_7_603db6818d33663b70b917c04fd8485b._comment
@@ -0,0 +1,30 @@
+[[!comment format=mdwn
+ username="https://launchpad.net/~stephane-gourichon-lpad"
+ nickname="stephane-gourichon-lpad"
+ avatar="http://cdn.libravatar.org/avatar/02d4a0af59175f9123720b4481d55a769ba954e20f6dd9b2792217d9fa0c6089"
+ subject=""Hmm, guyz? Are you serious with these scripts?" Well, what's the matter?"
+ date="2016-11-15T10:58:32Z"
+ content="""
+## Wow, scary
+
+Dilyin's comment is scary. It suggests bad things can happen, but is not very clear.
+
+Bloated history is one thing.
+Obviously broken repo is bad but can be (slowly) recovered from remotes.
+Subtly crippled history that you don't notice can be a major problem (especially once you have propagated it to all your remotes to \"recover from bloat\").
+
+## More common than it seems
+
+There's a case probably more common than people actually report: mistakenly doing `git add` instead of `git annex add` and realizing it only after a number of commits. Doing `git annex add` at that time will have the file duplicated (regular git and annex).
+
+Extra wish: when doing `git annex add` of a file that is already present in git history, `git-annex` could notice and tell.
+
+## Simple solution?
+
+Can anyone elaborate on the scripts provided here, are they safe? What can happen if improperly used or in corner cases?
+
+* \"files are replaced with symlinks and are in the index\" -> so what ?
+* \"Make sure that you don't have annex.largefiles settings that would prevent annexing the files.\" -> What would happen? Also `.gitattributes`.
+
+Thank you.
+"""]]
diff --git a/doc/tips/How_to_retroactively_annex_a_file_already_in_a_git_repo/comment_8_834410421ccede5194bd8fbaccea8d1a._comment b/doc/tips/How_to_retroactively_annex_a_file_already_in_a_git_repo/comment_8_834410421ccede5194bd8fbaccea8d1a._comment
new file mode 100644
index 000000000..2c36962aa
--- /dev/null
+++ b/doc/tips/How_to_retroactively_annex_a_file_already_in_a_git_repo/comment_8_834410421ccede5194bd8fbaccea8d1a._comment
@@ -0,0 +1,82 @@
+[[!comment format=mdwn
+ username="StephaneGourichon"
+ avatar="http://cdn.libravatar.org/avatar/8cea01af2c7a8bf529d0a3d918ed4abf"
+ subject="Walkthrough of a prudent retroactive annex."
+ date="2016-11-24T11:27:59Z"
+ content="""
+Been using the one-liner. Despite the warning, I'm not dead yet.
+
+There's much more to do than the one-liner.
+
+This post offers instructions.
+
+# First simple try: slow
+
+Was slow (estimated >600s for 189 commits).
+
+# In tmpfs: about 6 times faster
+
+I have cloned repository into /run/user/1000/rewrite-git, which is a tmpfs mount point. (Machine has plenty of RAM.)
+
+There I also did `git annex init`, git-annex found its state branches.
+
+On second try I also did
+
+ git checkout -t remotes/origin/synced/master
+
+So that filter-branch would clean that, too.
+
+There, `filter-branch` operation finished in 90s first try, 149s second try.
+
+`.git/objects` wasn't smaller.
+
+# Practicing reduction on clone
+
+This produced no visible benefit:
+
+time git gc --aggressive
+time git repack -a -d
+
+Even cloning and retrying on clone. Oh, but I should have done `git clone file:///path` as said on git-filter-branch man page's section titled \"CHECKLIST FOR SHRINKING A REPOSITORY\"
+
+This (as seen on https://rtyley.github.io/bfg-repo-cleaner/ ) was efficient:
+
+ git reflog expire --expire=now --all && git gc --prune=now --aggressive
+
+`.git/objects` shrunk from 148M to 58M
+
+All this was on a clone of the repo in tmpfs.
+
+# Propagating cleaned up branches to origin
+
+This confirmed that filter-branch did not change last tree:
+
+ git diff remotes/origin/master..master
+ git diff remotes/origin/synced/master synced/master
+
+This, expectedly, was refused:
+
+ git push origin master
+ git push origin synced/master
+
+On origin, I checked out the hash of current master, then on tmpfs clone
+
+ git push -f origin master
+ git push -f origin synced/master
+
+Looks good.
+
+I'm not doing the aggressive shrink now, because of the \"two orders of magnitude more caution than normal filter-branch\" recommended by arand.
+
+# Now what? Check if precious not broken
+
+I'm planning to do the same operation on the other repos, then :
+
+* if everything seems right,
+* if `git annex sync` works between all those fellows
+* etc,
+* then I would perform the reflog expire, gc prune on some then all of them, etc.
+
+Joey, does this seem okay? Any comment?
+
+"""]]