Diffstat (limited to 'doc/bugs/Large_unannex_operations_result_in_stale_symlinks_and_data_loss')
-rw-r--r-- doc/bugs/Large_unannex_operations_result_in_stale_symlinks_and_data_loss/comment_10_52364dc5b1b43b51748453d1896e35c6._comment | 8
-rw-r--r-- doc/bugs/Large_unannex_operations_result_in_stale_symlinks_and_data_loss/comment_11_99b4db1841f8630a9c5efd08910e87a3._comment | 104
-rw-r--r-- doc/bugs/Large_unannex_operations_result_in_stale_symlinks_and_data_loss/comment_1_fbb410a54bb0bd82d0953ef58a88600e._comment | 24
-rw-r--r-- doc/bugs/Large_unannex_operations_result_in_stale_symlinks_and_data_loss/comment_2_8007c9ba42a951a4426255ec3c37d961._comment | 13
-rw-r--r-- doc/bugs/Large_unannex_operations_result_in_stale_symlinks_and_data_loss/comment_3_73ecd4cb8ee58a8dfe7cab0e893dbe5b._comment | 8
-rw-r--r-- doc/bugs/Large_unannex_operations_result_in_stale_symlinks_and_data_loss/comment_4_e8a10886a564f35414c30a04335d9d32._comment | 8
-rw-r--r-- doc/bugs/Large_unannex_operations_result_in_stale_symlinks_and_data_loss/comment_5_6a318edfe45c80343d017dc7b4837acb._comment | 8
-rw-r--r-- doc/bugs/Large_unannex_operations_result_in_stale_symlinks_and_data_loss/comment_6_f7a1d9f9d40aff531d873a95d2196edd._comment | 8
-rw-r--r-- doc/bugs/Large_unannex_operations_result_in_stale_symlinks_and_data_loss/comment_7_1724ffdf986301bf37ef7a6d16b6ea8a._comment | 10
-rw-r--r-- doc/bugs/Large_unannex_operations_result_in_stale_symlinks_and_data_loss/comment_8_5470e2f50e6506139ecb1b342371c509._comment | 10
-rw-r--r-- doc/bugs/Large_unannex_operations_result_in_stale_symlinks_and_data_loss/comment_9_e53148a9efa061a825f668a9492182f7._comment | 10
11 files changed, 0 insertions, 211 deletions
diff --git a/doc/bugs/Large_unannex_operations_result_in_stale_symlinks_and_data_loss/comment_10_52364dc5b1b43b51748453d1896e35c6._comment b/doc/bugs/Large_unannex_operations_result_in_stale_symlinks_and_data_loss/comment_10_52364dc5b1b43b51748453d1896e35c6._comment
deleted file mode 100644
index c0b286a2b..000000000
--- a/doc/bugs/Large_unannex_operations_result_in_stale_symlinks_and_data_loss/comment_10_52364dc5b1b43b51748453d1896e35c6._comment
+++ /dev/null
@@ -1,8 +0,0 @@
-[[!comment format=mdwn
- username="https://www.google.com/accounts/o8/id?id=AItOawmVV_nBwlsyCv53BXoJt8YpCX_wZPfzpyo"
- nickname="Peter"
- subject="Progress"
- date="2013-10-10T01:17:08Z"
- content="""
-Is there any type of script / tool / patch which does the --fast but with a copy instead of only a hard link? Can someone point me towards how I'm supposed to do this? I'm a technical user, however I don't really fancy having to go learn the source code of git-annex to fix this really bad flaw :-/
-"""]]
diff --git a/doc/bugs/Large_unannex_operations_result_in_stale_symlinks_and_data_loss/comment_11_99b4db1841f8630a9c5efd08910e87a3._comment b/doc/bugs/Large_unannex_operations_result_in_stale_symlinks_and_data_loss/comment_11_99b4db1841f8630a9c5efd08910e87a3._comment
deleted file mode 100644
index 4e55bd020..000000000
--- a/doc/bugs/Large_unannex_operations_result_in_stale_symlinks_and_data_loss/comment_11_99b4db1841f8630a9c5efd08910e87a3._comment
+++ /dev/null
@@ -1,104 +0,0 @@
-[[!comment format=mdwn
- username="https://www.google.com/accounts/o8/id?id=AItOawmVV_nBwlsyCv53BXoJt8YpCX_wZPfzpyo"
- nickname="Peter"
- subject="Productive Annoyance"
- date="2013-10-10T04:30:47Z"
- content="""
-Ok, so I'm annoyed by this enough (and desperate enough to want to get my data back) that I wrote up a few scripts to help with this. I make no claims regarding how well these will work, but they seem to work with some minimal testing on a Fedora 17 machine.
-
-READ THROUGH THESE SCRIPTS BEFORE RUNNING THEM TO MAKE SURE YOU ARE OK WITH WHAT THEY ARE DOING!!!
-
-First, a script to create a bad git-annex repository: one with missing files (with a few corner-case names) after a `git annex unannex`. Specify the directory in which you'd like to create the annex at the top of the file. ALL CONTENTS OF THIS DIRECTORY WILL BE REMOVED!!!
-
-    #!/bin/bash
-
-    # This is the folder you'd like to create and unannex
-    FOLDERTOUNANNEX='/tmp/badAnnex'
-
-    pushd .
-
-    if [ ! -d \"$FOLDERTOUNANNEX\" ] ; then
-        mkdir \"$FOLDERTOUNANNEX\"
-    fi
-
-    cd \"$FOLDERTOUNANNEX\" || exit 1   # abort rather than run rm -rf in the wrong directory
-
-    rm -rf *
-
-    mkdir subdir
-    echo \"hi\" > 1one.txt
-    echo \"hi\" > 2two.txt
-    echo \"hi\" > \"3thr re ee.boo\"
-    echo \"hi\" > \"4f o u r.boo\"
-    echo \"hi\" > 5
-    echo \"hi\" > \"6\"
-    echo \"hi\" > \"subdir/7\"
-    echo \"hi\" > \"subdir/8.cat\"
-    echo \"hi\" > \"subdir/9.cat\"
-
-    echo \"* annex.backend=SHA512E\" > .gitattributes
-
-    chmod g-r 5 6   # give 5 and 6 different permissions from the other files
-    chmod o-r 6
-
-    ls -la
-
-    git init
-    git annex init \"stupid\"
-    git annex add *
-    ls -la
-    git annex unannex *
-    ls -la
-
-    popd
-
-
-Then, a script to recover the files left missing by the above script. Note this might be very slow as it has to generate SHA512 hashes for all the files in your annex. Again, change the paths at the top of this file to work in your environment:
-
-    #!/bin/bash
-
-    # Set this to some place outside your annex, where we can store our hashes while we search for them
-    # It will be fastest if this is on a different physical disk than the annexed folder
-    # You can manually delete the file afterwards
-    HASHFILE='/backup3/tmp.sha'
-    # This is the folder you'd like to unannex
-    FOLDERTOUNANNEX='/tmp/badAnnex'
-
-
-
-
-    HASHLEN=128   # length of a SHA512 hash in hex digits
-
-    pushd .
-    cd \"$FOLDERTOUNANNEX\" || exit 1
-
-    find \"$FOLDERTOUNANNEX\" -type f ! -path '*.git*' -exec sha512sum \{\} \; > \"$HASHFILE\"
-
-    find -L \"$FOLDERTOUNANNEX\" -type l | while IFS= read -r BROKENFILE; do
-        POINTSTO=$(readlink \"$BROKENFILE\")
-
-        HASH=$(echo \"$POINTSTO\" | sed -r \"s/^.*--([^-\/.]{$HASHLEN}).*$/\1/g\")
-
-        EXT=$(echo \"$POINTSTO\" | sed -r \"s/^.*--[^-\/.]{$HASHLEN}(.[^.]+)?$/\1/g\")
-
-        echo \"-\"
-        echo \"FILE:$BROKENFILE\"
-        echo \"POINTSTO:$POINTSTO\"
-        echo \"HASH:$HASH\"
-        echo \"EXT:$EXT\"
-
-        SOURCEFILE=$(grep -F -- \"$HASH\" \"$HASHFILE\" | grep -F -m 1 -- \"$EXT\" | sed -r \"s/^.{$HASHLEN} +(.*)$/\1/\")
-
-        echo \"SOURCEFILE:$SOURCEFILE\"
-        if [ -f \"$SOURCEFILE\" ];
-        then
-            cp --backup --suffix=\"~GIT_ANNEX_IS_DANGEROUS~\" -a \"$SOURCEFILE\" \"$BROKENFILE\"
-        else
-            echo \"ERROR: Cannot find sourcefile\"
-        fi
-    done
-
-    popd
-
-I have not yet run this repair script on my rather large broken annex. I also cannot figure out how to restore file ownership and permissions, which seem to be lost when the second file is simply linked to the matching previously annexed file. This is visible after \"fixing\" the bad annex created by the first script above: file \"6\" ends up readable by others, whereas originally it was NOT readable by others, because the permissions of 6 have been copied from 5. Any thoughts or improvements on this are appreciated.
-"""]]
diff --git a/doc/bugs/Large_unannex_operations_result_in_stale_symlinks_and_data_loss/comment_1_fbb410a54bb0bd82d0953ef58a88600e._comment b/doc/bugs/Large_unannex_operations_result_in_stale_symlinks_and_data_loss/comment_1_fbb410a54bb0bd82d0953ef58a88600e._comment
deleted file mode 100644
index 14172a3e5..000000000
--- a/doc/bugs/Large_unannex_operations_result_in_stale_symlinks_and_data_loss/comment_1_fbb410a54bb0bd82d0953ef58a88600e._comment
+++ /dev/null
@@ -1,24 +0,0 @@
-[[!comment format=mdwn
- username="https://www.google.com/accounts/o8/id?id=AItOawlup4hyZo4eCjF8T85vfRXMKBxGj9bMdl0"
- nickname="Ben"
- subject="comment 1"
- date="2012-09-06T02:28:00Z"
- content="""
-
-Here is a quick script which reproduces the issue on another Ubuntu 12.04 machine:
-
-    mkdir hi
-    cd hi
-    wget \"http://downloads.sourceforge.net/project/free-cad/FreeCAD%20Source/freecad-0.11.3729.tar.gz\"
-
-    git init
-    git annex init
-    tar -zxf freecad-0.11.3729.tar.gz
-    git annex add FreeCAD-0.11.3729
-    git annex unannex FreeCAD-0.11.3729
-    echo \"The following links are broken:\"
-    find -L . -type l
-
-This results in dozens of dead symlinks.
-
-"""]]
diff --git a/doc/bugs/Large_unannex_operations_result_in_stale_symlinks_and_data_loss/comment_2_8007c9ba42a951a4426255ec3c37d961._comment b/doc/bugs/Large_unannex_operations_result_in_stale_symlinks_and_data_loss/comment_2_8007c9ba42a951a4426255ec3c37d961._comment
deleted file mode 100644
index e1f600d88..000000000
--- a/doc/bugs/Large_unannex_operations_result_in_stale_symlinks_and_data_loss/comment_2_8007c9ba42a951a4426255ec3c37d961._comment
+++ /dev/null
@@ -1,13 +0,0 @@
-[[!comment format=mdwn
- username="http://joeyh.name/"
- ip="4.152.108.236"
- subject="comment 2"
- date="2012-09-06T14:55:58Z"
- content="""
-What's going on here is you have multiple files with the same content, so the symlinks point to the same annexed file. When unannex processes the first symlink, it moves the annexed file to replace it. This breaks the other symlink that pointed to it. Notice that if you then re-add the file to the annex, the broken symlink automatically gets fixed -- there's no actual data loss going on here.
-
-This problem can be avoided by using `git annex unannex --fast`, which makes hardlinks to the annexed file.
-But then you are also left with the hard links in `.git/annex/objects`; `git annex unused` can find and remove them.
-
-It may make sense to make the current \"--fast\" behavior the default for unannex.
-"""]]
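Reviewer note on comment 2: the mechanism described there can be reproduced with plain symlinks, without git-annex. The sketch below is only an illustration under that assumption. The `objects/KEY` path and the file names are made up and do not reflect git-annex's real object layout.

```shell
#!/bin/sh
# Illustration only: two names share one content object via symlinks,
# the way two annexed files with identical content do.
set -e
demo=$(mktemp -d)
cd "$demo"
mkdir objects
echo "hi" > objects/KEY      # hypothetical stand-in for an annexed object
ln -s objects/KEY a.txt
ln -s objects/KEY b.txt

# "Unannex" the first name: replace its symlink with the object itself.
rm -f a.txt
mv objects/KEY a.txt

# a.txt is now a regular file, but b.txt still points at the moved object.
broken=$(find -L . -type l)
echo "broken links: $broken"
```

Copying the content back to `objects/KEY` makes `b.txt` resolve again, which matches the observation in comment 2 that re-adding the file repairs the broken symlink.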
diff --git a/doc/bugs/Large_unannex_operations_result_in_stale_symlinks_and_data_loss/comment_3_73ecd4cb8ee58a8dfe7cab0e893dbe5b._comment b/doc/bugs/Large_unannex_operations_result_in_stale_symlinks_and_data_loss/comment_3_73ecd4cb8ee58a8dfe7cab0e893dbe5b._comment
deleted file mode 100644
index 2a799fac0..000000000
--- a/doc/bugs/Large_unannex_operations_result_in_stale_symlinks_and_data_loss/comment_3_73ecd4cb8ee58a8dfe7cab0e893dbe5b._comment
+++ /dev/null
@@ -1,8 +0,0 @@
-[[!comment format=mdwn
- username="https://www.google.com/accounts/o8/id?id=AItOawlup4hyZo4eCjF8T85vfRXMKBxGj9bMdl0"
- nickname="Ben"
- subject="comment 3"
- date="2012-09-06T16:04:42Z"
- content="""
-Frankly, even the --fast behavior has an element of surprise to it. For example, one might have two files with identical content. Upon annexing and unannexing, they suddenly become hard links to the same file, correct? If this is the case, changes to one will result in changes to the other. I would consider this a very nasty sort of surprise.
-"""]]
diff --git a/doc/bugs/Large_unannex_operations_result_in_stale_symlinks_and_data_loss/comment_4_e8a10886a564f35414c30a04335d9d32._comment b/doc/bugs/Large_unannex_operations_result_in_stale_symlinks_and_data_loss/comment_4_e8a10886a564f35414c30a04335d9d32._comment
deleted file mode 100644
index 72b4c5c7f..000000000
--- a/doc/bugs/Large_unannex_operations_result_in_stale_symlinks_and_data_loss/comment_4_e8a10886a564f35414c30a04335d9d32._comment
+++ /dev/null
@@ -1,8 +0,0 @@
-[[!comment format=mdwn
- username="http://joeyh.name/"
- ip="4.153.8.30"
- subject="comment 4"
- date="2012-09-09T16:53:35Z"
- content="""
-Perhaps the solution is to make --fast the default and to make it copy files when the content in the annex already has a hard link to it.
-"""]]
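Reviewer note on comment 4: the proposal there (hard link by default, but copy when the annexed content already has another hard link) could look roughly like the sketch below. This is not git-annex code. `restore_file` and the `objects/KEY` path are hypothetical names, and the link count is read with `stat -c %h`, which assumes GNU coreutils.

```shell
#!/bin/sh
# Sketch only, not git-annex code. Assumes GNU coreutils `stat -c %h`.
set -e

# restore_file OBJECT DEST (hypothetical helper)
# Hard link OBJECT to DEST if nothing else links to it yet,
# otherwise make an independent copy so the two names cannot alias.
restore_file() {
    object=$1
    dest=$2
    links=$(stat -c %h "$object")   # current hard link count of the object
    rm -f "$dest"                   # drop the symlink being replaced
    if [ "$links" -gt 1 ]; then
        cp -p "$object" "$dest"
    else
        ln "$object" "$dest"
    fi
}

demo=$(mktemp -d)
cd "$demo"
mkdir objects
echo "hi" > objects/KEY
ln -s objects/KEY a.txt
ln -s objects/KEY b.txt

restore_file objects/KEY a.txt   # first name: hard link (count was 1)
restore_file objects/KEY b.txt   # second name: copy (count is now 2)

echo "changed" > b.txt           # editing the copy leaves a.txt alone
cat a.txt
```

In this sketch `a.txt` still shares its content with `objects/KEY`, so writing to `a.txt` would also change the stored object. Only the second name gets an independent copy.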
diff --git a/doc/bugs/Large_unannex_operations_result_in_stale_symlinks_and_data_loss/comment_5_6a318edfe45c80343d017dc7b4837acb._comment b/doc/bugs/Large_unannex_operations_result_in_stale_symlinks_and_data_loss/comment_5_6a318edfe45c80343d017dc7b4837acb._comment
deleted file mode 100644
index fbc30f17b..000000000
--- a/doc/bugs/Large_unannex_operations_result_in_stale_symlinks_and_data_loss/comment_5_6a318edfe45c80343d017dc7b4837acb._comment
+++ /dev/null
@@ -1,8 +0,0 @@
-[[!comment format=mdwn
- username="https://www.google.com/accounts/o8/id?id=AItOawlup4hyZo4eCjF8T85vfRXMKBxGj9bMdl0"
- nickname="Ben"
- subject="comment 5"
- date="2012-09-09T20:47:04Z"
- content="""
-That sounds far more reasonable.
-"""]]
diff --git a/doc/bugs/Large_unannex_operations_result_in_stale_symlinks_and_data_loss/comment_6_f7a1d9f9d40aff531d873a95d2196edd._comment b/doc/bugs/Large_unannex_operations_result_in_stale_symlinks_and_data_loss/comment_6_f7a1d9f9d40aff531d873a95d2196edd._comment
deleted file mode 100644
index 295411d25..000000000
--- a/doc/bugs/Large_unannex_operations_result_in_stale_symlinks_and_data_loss/comment_6_f7a1d9f9d40aff531d873a95d2196edd._comment
+++ /dev/null
@@ -1,8 +0,0 @@
-[[!comment format=mdwn
- username="https://www.google.com/accounts/o8/id?id=AItOawlup4hyZo4eCjF8T85vfRXMKBxGj9bMdl0"
- nickname="Ben"
- subject="comment 6"
- date="2012-09-19T23:32:35Z"
- content="""
-Has any progress been made here? While this issue may not result in data loss, the behavior documented in this bug is certainly surprising and does not instill confidence in new users.
-"""]]
diff --git a/doc/bugs/Large_unannex_operations_result_in_stale_symlinks_and_data_loss/comment_7_1724ffdf986301bf37ef7a6d16b6ea8a._comment b/doc/bugs/Large_unannex_operations_result_in_stale_symlinks_and_data_loss/comment_7_1724ffdf986301bf37ef7a6d16b6ea8a._comment
deleted file mode 100644
index fd235321a..000000000
--- a/doc/bugs/Large_unannex_operations_result_in_stale_symlinks_and_data_loss/comment_7_1724ffdf986301bf37ef7a6d16b6ea8a._comment
+++ /dev/null
@@ -1,10 +0,0 @@
-[[!comment format=mdwn
- username="http://joeyh.name/"
- ip="4.153.14.141"
- subject="comment 7"
- date="2012-09-23T18:02:45Z"
- content="""
-If unannex makes the file a hard link to the annexed content, it will be mode 444 or so. But if the user changes the permissions and modifies it, that will corrupt the content still in the annex!
-
-So the current --fast behavior seems no worse than the proposed behavior. And it's not at all clear to me that this would be a better default behavior for unannex than the current behavior, which at least ensures that data left in the annex (and referred to by another annexed file) cannot be corrupted.
-"""]]
diff --git a/doc/bugs/Large_unannex_operations_result_in_stale_symlinks_and_data_loss/comment_8_5470e2f50e6506139ecb1b342371c509._comment b/doc/bugs/Large_unannex_operations_result_in_stale_symlinks_and_data_loss/comment_8_5470e2f50e6506139ecb1b342371c509._comment
deleted file mode 100644
index 7ac71b6b8..000000000
--- a/doc/bugs/Large_unannex_operations_result_in_stale_symlinks_and_data_loss/comment_8_5470e2f50e6506139ecb1b342371c509._comment
+++ /dev/null
@@ -1,10 +0,0 @@
-[[!comment format=mdwn
- username="https://www.google.com/accounts/o8/id?id=AItOawnRai_qFYPVvEgC6i1nlM1bh-C__jbhqS0"
- nickname="Matthew"
- subject="comment 8"
- date="2013-07-31T14:17:19Z"
- content="""
-Filenames are the index which users use to find their data.
-
-Leaving a broken symlink may not result in technical data loss, but can quite possibly result in the user being unable to find the data which was referenced by that filename (symlink), so in that case that data _is_ lost, in the true sense of the word (the user cannot find it). Telling the user their data exists _somewhere_ is not actually making the situation any better.
-"""]]
diff --git a/doc/bugs/Large_unannex_operations_result_in_stale_symlinks_and_data_loss/comment_9_e53148a9efa061a825f668a9492182f7._comment b/doc/bugs/Large_unannex_operations_result_in_stale_symlinks_and_data_loss/comment_9_e53148a9efa061a825f668a9492182f7._comment
deleted file mode 100644
index 74aaa1e56..000000000
--- a/doc/bugs/Large_unannex_operations_result_in_stale_symlinks_and_data_loss/comment_9_e53148a9efa061a825f668a9492182f7._comment
+++ /dev/null
@@ -1,10 +0,0 @@
-[[!comment format=mdwn
- username="https://me.yahoo.com/a/2grhJvAC049fJnvALDXek.6MRZMTlg--#eec89"
- nickname="John"
- subject="comment 9"
- date="2013-08-30T05:59:28Z"
- content="""
-I'll chime in and say that the non-fast behavior being the default seems wrong, and making hard links invisibly seems wrong. What Joey proposed -- copying a file if there are multiple hard links -- seems like the right solution.
-
-Just recently I tried to unannex a large repository and was bitten by now-dangling symlinks to files that I couldn't locate anymore. The fact is that the current unannex operation is too dangerous to be useful.
-"""]]