3 files changed, 95 insertions, 15 deletions
diff --git a/doc/bugs/incremental_fsck_should_not_use_sticky_bit/comment_8_9f5acbc79b631a93d7cdf4ae37c07cab._comment b/doc/bugs/incremental_fsck_should_not_use_sticky_bit/comment_8_9f5acbc79b631a93d7cdf4ae37c07cab._comment
new file mode 100644
index 000000000..a63381ed0
--- /dev/null
+++ b/doc/bugs/incremental_fsck_should_not_use_sticky_bit/comment_8_9f5acbc79b631a93d7cdf4ae37c07cab._comment
@@ -0,0 +1,19 @@
+[[!comment format=mdwn
+ username="joey"
+ subject="""vache"""
+ date="2015-02-15T16:35:12Z"
+ content="""
+<https://github.com/dmbarbour/haskell-vcache>
+
+It uses LMDB, which is a C library, and its PVar is a variable named by a
+bytestring, so it's essentially a key/value store where the values can be
+arbitrary Haskell data types. Since git-annex already has Keys, and most
+of the need for the database is to look up some cached value for a Key,
+this seems like a pretty good fit!
+
+Unfortunately, "A VCache file may be opened by only one process at a time,
+and only once within said process."
+
+But, git-annex needs concurrent read-only access, as many git-annex processes
+can run concurrently.
+"""]]
diff --git a/doc/design/caching_database.mdwn b/doc/design/caching_database.mdwn
index a6face00f..8678f8c3d 100644
--- a/doc/design/caching_database.mdwn
+++ b/doc/design/caching_database.mdwn
@@ -25,11 +25,11 @@ Store in the database the Ref of the branch that was used to construct it.
 
 ## implementation plan
 
-1. Implement for metadata, on a branch, with sqlite.
+1. Store incremental fsck info in db, on a branch, with sqlite.
 2. Make sure that builds on all platforms.
-3. Add associated file mappings support. This is needed to fully
+3. Implement for metadata, on a branch, with sqlite.
+4. Add associated file mappings support. This is needed to fully
    use the caching database to construct views.
-4. Store incremental fsck info in db.
-5. Replace .map files with 3. for direct mode.
+5. Replace .map files with 4. for direct mode.
 
 ## sqlite or not?
 
@@ -39,19 +39,20 @@ SQL.
 And even if that's hidden by a layer like persistent, it's still going
 to involve some technical debt (eg, database migrations). It would be
 great if there were some haskell thing like acid-state
-that I could use instead. But, acid-sate needs to load the whole
+that I could use instead. But, acid-state needs to load the whole
 DB into memory.
 
 In the comments of [[bugs/incremental_fsck_should_not_use_sticky_bit]]
 I examined several other haskell database-like things, and found them
 all wanting, except for
-possibly TCache.
+possibly TCache. (And TCache is backed by persistent/sqlite anyway.)
 
-TODO: This seems promising; investigate it:
-<https://awelonblue.wordpress.com/2014/12/19/vcache-an-acid-state-killer/>
-It uses LMDB, which is a C library, and its PVar is a variable named by a
-bytestring, so it's essentially a key/value store where the values can be
-arbitrary Haskell data types. Since git-annex already has Keys, and most
-of the need for the database is to look up some cached value for a Key,
-this seems like a pretty good fit!
+## one db or multiple?
+
+Using a single database will use less space. Eg, each Key will only need to
+appear in it once, with proper normalization.
+
+OTOH, it's more complicated, and harder to recover from problems.
+
+Currently leaning toward one database per purpose.
 
 ## case study: persistent with sqlite
 
@@ -128,15 +129,19 @@ With this, 1000 queries takes 0.406s.
 
 Note that persistent is probably not actually doing a join at the SQL
 level, so this could be sped up using eg, esqueleto.
 
-Update2: Using esqueleto to do a join got this down to 0.250s.
+Update2: Using esqueleto to do a join got this down to 0.109s.
+See `database` branch for code.
 
-Code: <http://lpaste.net/101141> <http://lpaste.net/101142>
+Update3: Converting to a single un-normalized table for AssociatedFiles
+avoids the join, and brought lookup time down to 0.087s. Of course, when
+a key has multiple associated files, this will use more disk space, due
+to not normalizing the key.
 
 Compare the above with 1000 calls to `associatedFiles`, which is
 approximately as fast as just opening and reading 1000 files, so will
 take well under 0.05s with a **hot** cache.
 
-So, we're looking at nearly an order of magnitude slowdown using sqlite and
+So, we're looking at maybe a 50% slowdown using sqlite and
 persistent for associated files. OTOH, the normalized schema should perform
 better when adding an associated file to a key that already has many.
diff --git a/doc/devblog/day_253__sqlite_for_incremental_fsck.mdwn b/doc/devblog/day_253__sqlite_for_incremental_fsck.mdwn
new file mode 100644
index 000000000..e29f2ebb1
--- /dev/null
+++ b/doc/devblog/day_253__sqlite_for_incremental_fsck.mdwn
@@ -0,0 +1,56 @@
+Yesterday I did a little more investigation of key/value stores.
+I'd love a pure haskell key/value store that didn't buffer everything in
+memory, and that allowed concurrent readers, and was ACID, and production
+quality. But so far, I have not found anything that meets all those
+criteria. It seems that sqlite is the best choice for now.
+
+Started working on the `database` branch today. The plan is to use
+sqlite for incremental fsck first, and if that works well, do the rest
+of what's planned in [[design/caching_database]].
+
+At least for now, I'm going to use a dedicated database file for each
+different thing. (This may not be as space-efficient due to lacking
+normalization, but it keeps things simple.)
+
+So, .git/annex/fsck.db will be used by incremental fsck, and it has
+a super simple Persistent database schema:
+
+[[!format haskell """
+Fscked
+  key SKey
+  UniqueKey key
+"""]]
+
+It was pretty easy to implement this and make incremental fsck use it. The
+hard part is making it both fast and robust.
+
+At first, I was doing everything inside a single `runSqlite` action,
+including creating the table. But, it turns out that runs as a single
+transaction, and if it was interrupted, this left the database in a
+state where it exists, but has no tables. Hard to recover from.
+
+So, I separated out creating the database, and made that be done in its
+own transaction, fully atomically. Now `fsck --incremental` could be
+ctrl-c'd and resumed with `fsck --more`, but it would lose the transaction
+and so not remember anything had been checked.
+
+To fix that, I tried making a separate transaction per file fscked. That
+worked, and it resumes nicely where it left off, but all those transactions
+made it much slower.
+
+To fix the speed, I made it commit just one transaction per minute. This
+seems like an ok balance. Having fsck re-do one minute's work when restarting
+an interrupted incremental fsck is perfectly reasonable, and now the speed,
+using the sqlite database, is nearly as fast as the old sticky bit hack was.
+(Specifically, 6m7s old vs 6m27s new, fscking 37000 files from cold cache
+in --fast mode.)
+
+There is still a problem with multiple concurrent `fsck --more` runs
+failing. Probably a concurrent writer problem? And, some porting will be
+required to get sqlite and persistent working on Windows and Android.
+So the branch isn't ready to merge yet, but it seems promising.
+
+In retrospect, while incremental fsck has the simplest database schema, it
+might be one of the harder things listed in [[design/caching_database]],
+just because it involves so many writes to the database. The other use
+cases are more read heavy.
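The failure mode described in the devblog, where table creation gets swept into one big interrupted transaction and leaves a database file with no tables, and its fix, committing schema creation separately up front, can be sketched language-neutrally. git-annex's actual code is Haskell with persistent; this stand-in uses stdlib sqlite3, and the table and function names are illustrative, not git-annex's:

```python
import sqlite3

def open_fsck_db(path):
    """Open the incremental-fsck database, committing schema creation
    immediately and in its own transaction. An interrupted run can then
    never leave a database file that exists but contains no tables."""
    conn = sqlite3.connect(path)
    with conn:  # this transaction commits as soon as the block exits
        conn.execute("CREATE TABLE IF NOT EXISTS fscked (key TEXT PRIMARY KEY)")
    return conn

conn = open_fsck_db(":memory:")
with conn:
    conn.execute("INSERT OR IGNORE INTO fscked VALUES (?)", ("SHA256E-s42--deadbeef",))
print(conn.execute("SELECT count(*) FROM fscked").fetchone()[0])  # prints 1
```

`INSERT OR IGNORE` against the primary key also gives the set-like semantics of the `UniqueKey` constraint in the persistent schema: re-fscking a key is a no-op.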
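The transaction-batching compromise from the devblog (one commit per minute rather than per file) amounts to checkpointing on a clock: an interruption loses at most one interval's work. A stdlib sqlite3 sketch of that idea, with the clock and interval injectable so the batching is visible; the names are hypothetical, not git-annex's:

```python
import sqlite3, time

def record_fscked(conn, keys, interval=60.0, clock=time.monotonic):
    """Mark each key as fscked, committing at most once per `interval`
    seconds. Restarting an interrupted run re-does at most one interval."""
    last_commit = clock()
    for key in keys:
        conn.execute("INSERT OR IGNORE INTO fscked VALUES (?)", (key,))
        if clock() - last_commit >= interval:
            conn.commit()
            last_commit = clock()
    conn.commit()  # flush the final partial batch

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE fscked (key TEXT PRIMARY KEY)")
record_fscked(conn, [f"key-{i}" for i in range(1000)])
print(conn.execute("SELECT count(*) FROM fscked").fetchone()[0])  # prints 1000
```

Setting `interval=0` degenerates to the slow transaction-per-file mode the post describes; a very large interval degenerates to the fragile single-transaction mode.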
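The Update2/Update3 trade-off in the case study, a normalized schema that needs a join versus a flat table that repeats the key text per associated file, can be seen in miniature with stdlib sqlite3. These schemas are illustrative stand-ins for the persistent/esqueleto models, not git-annex's actual ones:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Normalized: each key stored once; lookups join through keyid.
CREATE TABLE keys  (id INTEGER PRIMARY KEY, key TEXT UNIQUE);
CREATE TABLE files (keyid INTEGER REFERENCES keys(id), file TEXT);
-- Un-normalized: one row per (key, file); the key text is repeated,
-- costing disk space but avoiding the join on lookup.
CREATE TABLE associated (key TEXT, file TEXT);
CREATE INDEX associated_key ON associated (key);
""")
conn.execute("INSERT INTO keys VALUES (1, 'K1')")
conn.executemany("INSERT INTO files VALUES (1, ?)", [("a.txt",), ("b.txt",)])
conn.executemany("INSERT INTO associated VALUES ('K1', ?)", [("a.txt",), ("b.txt",)])

joined = sorted(f for (f,) in conn.execute(
    "SELECT file FROM files JOIN keys ON files.keyid = keys.id WHERE keys.key = 'K1'"))
flat = sorted(f for (f,) in conn.execute(
    "SELECT file FROM associated WHERE key = 'K1'"))
print(joined == flat)  # prints True
```

Both lookups return the same associated files; the flat table answers with a single index scan, which is the shape of the 0.109s-to-0.087s improvement, while the normalized one stays cheaper to update when a key already has many files.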