author Joey Hess <joeyh@joeyh.name> 2017-10-25 00:02:53 -0400
committer Joey Hess <joeyh@joeyh.name> 2017-10-25 00:02:53 -0400
commit da097bffe136bc25c9a63f021c793ce7e64dc329
tree 3cfe4b9a490b81230743a1dc85a1f4959caa4d04
parent 0134d77eeb57bd9c493fc9134d94bcdc6d745acc
parent 1cace8313b8047786e736efbf9793f9f63f48118
Merge branch 'master' of ssh://git-annex.branchable.com
-rw-r--r-- doc/bugs/fix_git-annex_paths___47___objects___40__repository_not_available__41__.mdwn | 87
-rw-r--r-- doc/bugs/graft__47__graft_cleanup_commits_--_really_needed__63__.mdwn | 59
2 files changed, 146 insertions, 0 deletions
diff --git a/doc/bugs/fix_git-annex_paths___47___objects___40__repository_not_available__41__.mdwn b/doc/bugs/fix_git-annex_paths___47___objects___40__repository_not_available__41__.mdwn
new file mode 100644
index 000000000..3bd64aeed
--- /dev/null
+++ b/doc/bugs/fix_git-annex_paths___47___objects___40__repository_not_available__41__.mdwn
@@ -0,0 +1,87 @@
+### Please describe the problem.
+I cloned my git-annex repository to a bare repository on my server (and deleted the original to reinstall the OS). When I try to clone it back to a new machine, I get:
+
+ $ git-annex get .
+ get programs/2017-06-drafts/About.txt (not available)
+ Try making some of these repositories available:
+ 8ddb8c4d-06ac-4c93-bf28-15639e0ea600 -- MacBook
+ failed
+
+and so on for many of the files I committed. The files are actually on the server, as I can see from the size of the repo, and I remember them being copied there.
+
+When I strace that command, I see that it stats a missing file
+
+ stat(".git/annex/objects/7v/5x/SHA256E-s81068--de1d8de99645d74ba1ea186b6cabd1fc116cb6c1823130756f33ff81807815ed.pdf/SHA256E-s81068--de1d8de99645d74ba1ea186b6cabd1fc116cb6c1823130756f33ff81807815ed.pdf", 0x200015bf0) = -1 ENOENT (No such file or directory)
+
+`.git/annex/objects` is absent in the cloned repository on the server (I test it there - on my machine it doesn't work either).
+
+However I can find that the SHA file really exists
+
+ $ locate SHA256E-s81068--de1d8de99645d74ba1ea186b6cabd1fc116cb6c1823130756f33ff81807815ed.pdf/SHA256E-s81068--de1d8de99645d74ba1ea186b6cabd1fc116cb6c1823130756f33ff81807815ed.pdf
+ /home/yaroslav/work.git/d0d/994/SHA256E-s81068--de1d8de99645d74ba1ea186b6cabd1fc116cb6c1823130756f33ff81807815ed.pdf/SHA256E-s81068--de1d8de99645d74ba1ea186b6cabd1fc116cb6c1823130756f33ff81807815ed.pdf
+
+and that contains the necessary data.
+
+So I wrote this bash script to recover some files from the corrupt repo (put there your parameters like directory names etc)
+
+ fix_obj.sh
+[[!format bash """
+NEWDIR=new
+while IFS= read -r -d $'\0';
+do
+ trueloc=$(sed 's/\/annex\/objects\///g' <"$REPLY")
+ # if you had swap files from editors, they may appear here. Fix them in advance.
+ # echo "$REPLY:$trueloc"
+ truefil=$(locate "$trueloc/$trueloc" | grep work.git-copy)
+ mkdir -p "$NEWDIR/$(dirname "$REPLY")"
+ cp -p "$truefil" "$NEWDIR/$REPLY"
+done < <(find "$1"/* -type f -print0)
+"""]]
+
+The loop over all files is not a simple "for dir in \`find $1\`" because of possible newlines and spaces in directory names. For my directories that worked, but I'm still not sure about possible bugs in it (it actually complains several times, but seems to work). The script doesn't work in sh, but can be launched via e.g.
+
+ . fix_obj.sh programs
+
+where 'programs' is a subdirectory (without a trailing slash) in your git repo that you want to recover.
+
+I don't know how this situation occurred or how to fix it any other way. I tried `git gc`, `git-annex fsck`, `git-annex repair`, of course cloning (git and git-annex) and other things, but that didn't help.
+
+I've read [disaster recovery](https://git-annex.branchable.com/design/assistant/disaster_recovery/)
+
+> # git repository repair
+> There are several ways git repositories can get damanged.
+> The most common is empty files in .git/annex/objects and commits that refer to those objects. When the objects have not yet been pushed anywhere. I've several times recovered from this manually by removing the bad files and resetting to before the commits that referred to them. Then re-staging any divergence in the working tree. This could perhaps be automated.
+> ...
+> This is useful outside git-annex as well, so make it a git-recover-repository command.
+
+I think it would be nice to provide some means to recover from these situations. At least those who face the same problem as me can use the script above.
+
+### What steps will reproduce the problem?
+
+1. Create git-annex on your local machine.
+2. Clone that to a remote server. Unfortunately I don't remember the exact commands - I think that was done with the `rsync` special backend.
+3. \* Delete all the data except on the server (better not do that).
+4. Try to clone that from the server to a new machine.
+
+### What version of git-annex are you using? On what operating system?
+
+On the original machine I used `version 6` with `thin`. git-annex was downloaded from this site as a binary maybe several months ago. The OS was `Fedora Core 24`, the FS was `ext4`.
+
+On the server where I cloned the repo I noticed that the git-annex version was `5`. On the server the git-annex version is `6.20171003-g14ffdd779`. The OS is `CentOS Linux 7 (Core)` (a virtual private server).
+
+### Please provide any additional information below.
+
+I managed to recover a unique part of my data, however I don't know how the repo could be recovered (which would be best). I will combine the data again from available pieces.
+
+I'd also like to add that I'm not a git expert; I use only a few of its commands, and by now I may know git no better than git-annex. Only recently I realized that `special remotes` don't have a copy of the git repository (I found this only [here](http://git-annex.branchable.com/special_remotes/rsync/#comment-525d3951ab1f09fdf471f450a798b50e) and still can't see it on the [special remotes](http://git-annex.branchable.com/special_remotes/) page). It would be great if we could understand not only basic things about e.g. special remotes, but also the underlying facts about git-annex, to better understand possible problems. It's not intuitive that some repositories are cloned via `git clone` and some via `git-annex initremote` etc.; if that is the only difference, it could be stated more prominently, otherwise things get mixed up: I still don't quite understand what the difference would be between `git-annex get .` and `git-annex sync --content` (because the latter showed me that my repo above was synced, even though it was really missing the files I needed).
+
+git-annex version 6 seems very promising, but I have a feeling that the documentation for the project should be somewhat rewritten/restructured, because when I read comments from several years ago, I can't judge whether they are still applicable or not. Sorry if I should have submitted this as a separate bug on the documentation.
+
+And maybe someone should rename my question, because I can't pinpoint the problem precisely and mostly just used keywords to make it easier to find in a search.
+
+### Have you had any luck using git-annex before? (Sometimes we get tired of reading bug reports all day and a lil' positive end note does wonders)
+
+Yes, I have a working git-annex repository on a server (with `gcrypt`) and on two laptops of mine. That works fine, though [not blazingly fast](http://git-annex.branchable.com/bugs/v6__58___git_status__47__add_after_unlock_is_linear_to_the_file_size/#comment-5b8baf74551f8cc192d5d87b8d6aefa3).
+> and a lil' positive end note
+
+I think there are no other variants for me. A year ago I tried some 'out-of-the box' popular alternatives, but the only use from that was to better learn SELinux and process isolation. Now I think the best is the \*NIX style where you control everything and you can learn. I've spent many days learning git-annex and hope that will work (though the last link above).
diff --git a/doc/bugs/graft__47__graft_cleanup_commits_--_really_needed__63__.mdwn b/doc/bugs/graft__47__graft_cleanup_commits_--_really_needed__63__.mdwn
new file mode 100644
index 000000000..bd7fa8050
--- /dev/null
+++ b/doc/bugs/graft__47__graft_cleanup_commits_--_really_needed__63__.mdwn
@@ -0,0 +1,59 @@
+### Please describe the problem.
+
+Playing with the new fancy "export" feature. Thanks again!
+
+Somehow I now see commits in the git-annex branch, in quick succession, establishing and then killing a "graft".
+Are they really needed/used by annex, or could they have been squashed/avoided?
+
+Some details of the recent command history and of how the git-annex branch looks are listed below.
+
+
+### What version of git-annex are you using? On what operating system?
+
+6.20171018+gitgbb20b1ed3-1~ndall+1
+
+### Please provide any additional information below.
+
+[[!format sh """
+
+14883 tree /tmp/test-directory-export -a
+14887 git annex info
+14888 git annex export --to=directory-export
+14889 git annex export
+14890 git annex export --tracking master --to public-s3
+14891 git annex export --tracking master --to directory-export
+14892 git annex export
+14893 git annex export --to directory-export
+14894 git annex export
+14896 git annex sync --content
+14898 git log --stat git-annex
+
+
+$> git lg --stat git-annex | head -n 30
+* 4a03be8 - (git-annex) update (4 minutes ago) [Yaroslav Halchenko]|
+| export.log | 2 +-
+| 1 file changed, 1 insertion(+), 1 deletion(-)
+
+* b23ae9d - graft cleanup (4 minutes ago) [Yaroslav Halchenko]|
+| export.tree/.datalad/.gitattributes | 3 ---
+| export.tree/.datalad/config | 2 --
+| export.tree/.gitattributes | 1 -
+| export.tree/123 | 1 -
+| export.tree/sub/dir/11 | 1 -
+| 5 files changed, 8 deletions(-)
+
+* 6a0bc5f - graft (4 minutes ago) [Yaroslav Halchenko]|
+| export.tree/.datalad/.gitattributes | 3 +++
+| export.tree/.datalad/config | 2 ++
+| export.tree/.gitattributes | 1 +
+| export.tree/123 | 1 +
+| export.tree/sub/dir/11 | 1 +
+| 5 files changed, 8 insertions(+)
+
+* 919e345 - update (5 minutes ago) [Yaroslav Halchenko]|
+| export.log | 2 +-
+| 1 file changed, 1 insertion(+), 1 deletion(-)
+
+"""]]
+
+[[!meta author=yoh]]