path: root/doc/todo/Wishlist__58___Parity_files_on_all_files.mdwn
author git-annex@31849d241f10c295b30a9707352ae5c7d743adb7 <git-annex@web> 2017-01-24 17:15:32 +0000
committer admin <admin@branchable.com> 2017-01-24 17:15:32 +0000
commit 12da138eeb6aa94782d7e0bf35e350711b2836b6 (patch)
tree 570bde854a709fa3a3736102d2a4f7a033b909a5 /doc/todo/Wishlist__58___Parity_files_on_all_files.mdwn
parent ce54a4a0baa26d3246f933585aa273936a8e7171 (diff)
Diffstat (limited to 'doc/todo/Wishlist__58___Parity_files_on_all_files.mdwn')
-rw-r--r-- doc/todo/Wishlist__58___Parity_files_on_all_files.mdwn 65
1 files changed, 65 insertions, 0 deletions
diff --git a/doc/todo/Wishlist__58___Parity_files_on_all_files.mdwn b/doc/todo/Wishlist__58___Parity_files_on_all_files.mdwn
new file mode 100644
index 000000000..01ebf8ce5
--- /dev/null
+++ b/doc/todo/Wishlist__58___Parity_files_on_all_files.mdwn
@@ -0,0 +1,65 @@
+To make sure we can archive our data safely, we need to:
+
+1. Store revisions
+2. Allow files to be tracked while moved to archival spaces
+3. Be platform-agnostic
+4. Sync
+5. Protect against bit-rot
+
+
+Items 1 and 3 are handled by git itself: everything is a straightforward graph structure composed of plain-text pointers (accepting that some filesystems do not easily expose file metadata, but that's on them, as we can simply choose a different system if that matters).
+
+Items 2 and 4 seem to be handled by git-annex.
+
+**But item 5, protection against bit-rot, is missing.**
+
+
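+Thankfully, we already have a technology that can fill this gap elegantly: parity files, such as the PAR2 recovery files shown below, which store redundant data that can be used to detect and repair damaged files.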
+
+
+### Two potential user stories
+
+#### Put everything together
+
+- This user wants everything together and in the filesystem in case one of the tools she relies on disappears.
+- Might have a structure like this:
+    - Project
+        - documents
+            - contract.pdf
+            - contract.pdf.vol000+01.par2
+            - contract.pdf.vol001+02.par2
+            - contract.pdf.vol003+04.par2
+            - Client brochure.zip
+            - Client brochure.zip.vol000+01.par2
+            - Client brochure.zip.vol001+02.par2
+            - Client brochure.zip.vol003+04.par2
+
+- Or like this:
+    - Project
+        - documents
+            - contract.pdf
+            - Client brochure.zip
+        - documents.vol000+01.par2
+        - documents.vol001+02.par2
+        - documents.vol003+04.par2
+
+
+
+#### Keep everything clean
+
+- This user doesn't want to clutter folders with extra files. He would rather have only the data files themselves, in case they need to be zipped and sent to clients. With the first layout, he would delete `*.par2` before zipping, leading to potential data loss.
+- Might have a structure like this:
+    - Project
+        - documents
+            - contract.pdf
+            - Client brochure.zip
+        - [git-annex]
+            - contract.pdf.vol000+01.par2
+            - contract.pdf.vol001+02.par2
+            - contract.pdf.vol003+04.par2
+            - Client brochure.zip.vol000+01.par2
+            - Client brochure.zip.vol001+02.par2
+            - Client brochure.zip.vol003+04.par2
+
+
+This would also enhance the data-checking capabilities of git-annex: detected data loss could be repaired, and new parity files generated from the recovered files, transparently, making the archive self-healing.
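
The self-healing idea in the page above can be illustrated with a minimal sketch. It is not how git-annex or PAR2 work internally: PAR2 uses Reed-Solomon codes and can repair several damaged blocks, whereas this example assumes a single XOR parity block, equal-length data blocks, and an undamaged parity block. The function names `xor_blocks`, `protect`, and `heal` are hypothetical and exist only for this illustration.

```python
import hashlib
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, column by column."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def protect(blocks):
    """Record a SHA-256 digest per block plus one XOR parity block."""
    digests = [hashlib.sha256(b).digest() for b in blocks]
    return digests, xor_blocks(blocks)


def heal(blocks, digests, parity):
    """Return a repaired copy of blocks; at most one block may be damaged."""
    bad = [i for i, b in enumerate(blocks)
           if hashlib.sha256(b).digest() != digests[i]]
    if not bad:
        return list(blocks)
    if len(bad) > 1:
        # Assumption of this sketch: one parity block recovers one data block.
        raise ValueError("single XOR parity cannot repair more than one block")
    survivors = [b for i, b in enumerate(blocks) if i != bad[0]]
    repaired = list(blocks)
    # XOR of the parity with all intact blocks yields the missing block.
    repaired[bad[0]] = xor_blocks(survivors + [parity])
    return repaired


if __name__ == "__main__":
    original = [b"contract", b"brochure", b"invoices"]
    digests, parity = protect(original)
    damaged = [original[0], b"XXXXXXXX", original[2]]  # simulated bit-rot
    assert heal(damaged, digests, parity) == original
```

The checksums play the role of the verification step (detecting which block rotted), and the parity block plays the role of the recovery data. After a repair, calling `protect` again on the healed blocks would correspond to regenerating parity files from the recovered data.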