author Joey Hess <joey@kitenet.net> 2013-12-02 15:44:27 -0400
committer Joey Hess <joey@kitenet.net> 2013-12-02 15:44:27 -0400
commit 1bc39540ad74ce601b0e33f1d64345a8630daa1e
tree 134652888e20d8ce14581615571c2f235f2cf520
parent 7a3aafe76274a9e54395acb8ec2ce351888495de
parent eb996a59e7998dc61df6a1aac2e18028a1a71b94
Merge branch 'master' of ssh://git-annex.branchable.com
-rw-r--r-- doc/bugs/incremental_fsck_should_not_use_sticky_bit/comment_1_204f45a43cd10bcb45c4920a13d66e8d._comment 28
-rw-r--r-- doc/bugs/incremental_fsck_should_not_use_sticky_bit/comment_2_a8bb264cb2ceece72e0dd9191b2b566e._comment 8
-rw-r--r-- doc/bugs/incremental_fsck_should_not_use_sticky_bit/comment_3_c6c8d3c84afa497bfdfe25b492dac5b9._comment 12
-rw-r--r-- doc/forum/Can_Not_Sync_to_Git_Repo/comment_1_80344c54804ddee81d89c0b40731fb9c._comment 8
4 files changed, 56 insertions, 0 deletions
diff --git a/doc/bugs/incremental_fsck_should_not_use_sticky_bit/comment_1_204f45a43cd10bcb45c4920a13d66e8d._comment b/doc/bugs/incremental_fsck_should_not_use_sticky_bit/comment_1_204f45a43cd10bcb45c4920a13d66e8d._comment
new file mode 100644
index 000000000..b061224e9
--- /dev/null
+++ b/doc/bugs/incremental_fsck_should_not_use_sticky_bit/comment_1_204f45a43cd10bcb45c4920a13d66e8d._comment
@@ -0,0 +1,28 @@
+[[!comment format=mdwn
+ username="http://joeyh.name/"
+ ip="209.250.56.64"
+ subject="comment 1"
+ date="2013-12-02T17:58:55Z"
+ content="""
+See [[todo/incremental_fsck]] for background.
+
+Options:
+
+* using per-object files
+* using sqlite or another relational database
+* using git as the database
+* using a key/value store
+
+Per-object files have the problem that they clutter up the .git/annex/objects/ tree quite a lot with info only used by fsck. We have a similar clutter problem with direct mode mapping files, and for various reasons (including [[direct mode mappings scale badly with thousands of identical files|bugs/\"Adding_4923_files\"_is_really_slow]]) I have been hoping to move that data to better storage, perhaps a database, eventually. Switching fsck to use a database first might be a good first step toward using a database for the more important direct mode mapping storage. If the fsck database goes wrong, the worst that happens is some extra incremental fscking.
+
+Using git as the database is possible: just store the info in a separate git ref, like the git-annex branch, but one that is not synced. But it will use more disk over time. Probably not the best choice.
+
+There are plenty of sqlite interfaces for Haskell. They all have the problem that the C sqlite library has to be installed. This will make `cabal install git-annex` harder. It is currently possible to do a `cabal install git-annex` with flags that avoid needing to install any C libraries. This is useful for my sanity, since otherwise people want hand-holding on installing libraries on OSX and stuff.
+
+Perhaps there's a pure Haskell key/value store that would be a better choice than sqlite. I do not anticipate git-annex needing complex relational database storage; if we look at everything it needs to store so far, key/value is enough. (The more complicated stuff is stored in git anyway.)
+
+* <http://hackage.haskell.org/package/io-storage> is actually memory only, not suitable
+* <http://hackage.haskell.org/package/acid-state> is, I think, the gold standard in its area. Strongly typed. Has some Template Haskell stuff but it's optional. Packaged in Debian already, but only on TH-capable architectures, so that would need to be fixed. Seems to have the disadvantage that it's not really a key/value store, so it works by loading the *entire* data into RAM. This is a problem since git-annex wants to run in constant memory.
+* <http://hackage.haskell.org/package/keyvaluehash> has some caveats about not being able to remove old keys
+* <http://hackage.haskell.org/package/HongoDB> looks possible, though it has had only one release, in 2011, and is undocumented
+"""]]
diff --git a/doc/bugs/incremental_fsck_should_not_use_sticky_bit/comment_2_a8bb264cb2ceece72e0dd9191b2b566e._comment b/doc/bugs/incremental_fsck_should_not_use_sticky_bit/comment_2_a8bb264cb2ceece72e0dd9191b2b566e._comment
new file mode 100644
index 000000000..ec921ce8f
--- /dev/null
+++ b/doc/bugs/incremental_fsck_should_not_use_sticky_bit/comment_2_a8bb264cb2ceece72e0dd9191b2b566e._comment
@@ -0,0 +1,8 @@
+[[!comment format=mdwn
+ username="helmut"
+ ip="89.0.78.89"
+ subject="git as a database"
+ date="2013-12-02T18:24:12Z"
+ content="""
+You said that \"[...] it will use more disk over time. Probably not the best choice.\" But this does not have to be true. Since this git ref is never synced, an update can be forced. The tricky part is in finding good truncation points. Now for fsck, the log can be truncated whenever an incremental fsck is started.
+"""]]
diff --git a/doc/bugs/incremental_fsck_should_not_use_sticky_bit/comment_3_c6c8d3c84afa497bfdfe25b492dac5b9._comment b/doc/bugs/incremental_fsck_should_not_use_sticky_bit/comment_3_c6c8d3c84afa497bfdfe25b492dac5b9._comment
new file mode 100644
index 000000000..17d7a3aab
--- /dev/null
+++ b/doc/bugs/incremental_fsck_should_not_use_sticky_bit/comment_3_c6c8d3c84afa497bfdfe25b492dac5b9._comment
@@ -0,0 +1,12 @@
+[[!comment format=mdwn
+ username="http://joeyh.name/"
+ ip="209.250.56.64"
+ subject="comment 3"
+ date="2013-12-02T19:11:27Z"
+ content="""
+HongoDB doesn't build with current hackage.
+
+keyvaluehash passed some initial load etc. tests. The inability to delete old keys/values might not be a significant problem.
+
+However, I don't think it has any ACID guarantees; e.g., a crash at the wrong time could leave the database in some inconsistent state. <https://github.com/Peaker/keyvaluehash/issues/1>
+"""]]
diff --git a/doc/forum/Can_Not_Sync_to_Git_Repo/comment_1_80344c54804ddee81d89c0b40731fb9c._comment b/doc/forum/Can_Not_Sync_to_Git_Repo/comment_1_80344c54804ddee81d89c0b40731fb9c._comment
new file mode 100644
index 000000000..93c4ce779
--- /dev/null
+++ b/doc/forum/Can_Not_Sync_to_Git_Repo/comment_1_80344c54804ddee81d89c0b40731fb9c._comment
@@ -0,0 +1,8 @@
+[[!comment format=mdwn
+ username="Remy"
+ ip="83.87.167.98"
+ subject="comment 1"
+ date="2013-12-02T19:32:18Z"
+ content="""
+I have the same problem on one of my repos as well.
+"""]]