Have started converting lots of special remotes to the new API. Today, S3
and hook got chunking support. I also converted several remotes without
adding chunking support: bup, ddar, and glacier (which should support
chunking, but there were complications).
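
For a sense of why the conversions delete code, here is a rough sketch of
the shape of the new-style interface. The names and types are illustrative,
not git-annex's actual identifiers: the point is that a remote only has to
supply primitive whole-object operations, and shared code layers the
features on top.

    import qualified Data.ByteString.Lazy as L

    -- Illustrative stand-in for git-annex's Key type.
    newtype Key = Key String

    -- A remote implementation only supplies these primitive
    -- operations; generic code layers chunking (and encryption) on
    -- top, so each remote gets those features without its own code.
    data RemoteOps = RemoteOps
        { store        :: Key -> L.ByteString -> IO Bool
        , retrieve     :: Key -> IO (Maybe L.ByteString)
        , remove       :: Key -> IO Bool
        , checkPresent :: Key -> IO Bool
        }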

This removed 110 lines of code while adding features! And,
I seem to be able to convert them faster than `testremote` can test them. :)

Now that S3 supports chunks, they can be used to work around several
problems with S3 remotes, including file size limits and a memory leak in
the underlying S3 library.
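
Chunking helps with both problems for the same reason: only one chunk-sized
piece of the content is in flight per request, so each request stays under
any per-object size limit, and memory use is presumably bounded by the
chunk size rather than the file size. Continuing the illustrative types
above, the splitting could look like this:

    import Data.Int (Int64)

    -- Derive a per-chunk key from the object key and chunk number.
    chunkKey :: Key -> Integer -> Key
    chunkKey (Key k) n = Key (k ++ "/chunk" ++ show n)

    -- Store content as a series of fixed-size chunks. Laziness means
    -- only the current chunk is forced into memory at a time.
    storeChunked :: RemoteOps -> Int64 -> Key -> L.ByteString -> IO Bool
    storeChunked ops size k = go 1
      where
        go n b
            | L.null b  = return True
            | otherwise = do
                let (chunk, rest) = L.splitAt size b
                ok <- store ops (chunkKey k n) chunk
                if ok then go (n + 1) rest else return False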

The S3 conversion included caching of the S3 connection when
storing/retrieving chunks. [Update: Actually, it turns out it didn't;
the hS3 library doesn't support persistent connections. Another reason I
need to switch to a better S3 library!] 

But the API doesn't yet support caching a connection when removing chunks
or checking if chunks are present. I should probably expand the API, but I
got into some type checker messes when trying to use data types generic
enough to support everything. Should probably switch to `ResourceT`.
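
To illustrate the shape of the problem (a sketch, not git-annex's code):
each per-chunk operation wants to reuse one open connection, so the natural
move is a bracket that owns the connection for the whole loop over chunks.

    import Control.Exception (bracket)

    -- Illustrative connection type; for S3 this would wrap a
    -- connection to the endpoint.
    data Connection = Connection

    openConnection :: IO Connection
    openConnection = return Connection

    closeConnection :: Connection -> IO ()
    closeConnection _ = return ()

    -- Open one connection and hand it to an action that loops over
    -- all of a key's chunks, so removing or checking N chunks costs
    -- one connection instead of N.
    withConnection :: (Connection -> IO a) -> IO a
    withConnection = bracket openConnection closeConnection

The messes start when this has to be generic across remotes: each remote's
connection type is different, so it has to be hidden behind an existential
or threaded through a monad, which is where `ResourceT` would come in.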

Also, I tried, but failed, to make `testremote` check that storing a key
is done atomically. The best I could come up with was a test that stored a
key while another thread repeatedly checked if the object was present on
the remote, logging the results and timestamps. It then becomes a
statistical problem -- it's fine for the key to become present somewhere
toward the end of the log, but showing up too early might indicate that it
wasn't stored atomically. Perhaps it's my poor knowledge of statistics, but
I could not find a way to analyze the log that reliably detected non-atomic
storage. If someone would like to try to work on this, see the
`atomic-store-test` branch.
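
For anyone picking this up, the structure of the test is roughly as
follows. This is a sketch of the idea, not the code on the branch; the
`isPresent` and `storeAction` parameters stand in for the real remote
operations.

    import Control.Concurrent (forkIO, threadDelay)
    import Control.Concurrent.MVar
    import Data.Time.Clock (UTCTime, getCurrentTime)

    -- Store a key while a second thread polls for its presence,
    -- timestamping every answer. If storage is atomic, True should
    -- only appear at the very end of the log; a True showing up much
    -- earlier means the key was visible before it was fully stored.
    pollWhileStoring :: IO Bool -> IO () -> IO [(UTCTime, Bool)]
    pollWhileStoring isPresent storeAction = do
        stop <- newEmptyMVar   -- filled when the store finishes
        logv <- newEmptyMVar   -- filled by the poller with its log
        _ <- forkIO $ poll stop >>= putMVar logv
        storeAction
        putMVar stop ()
        readMVar logv
      where
        poll stop = do
            present <- isPresent
            now <- getCurrentTime
            finished <- fmap not (isEmptyMVar stop)
            if finished
                then return [(now, present)]
                else do
                    threadDelay 100000  -- 0.1 second between polls
                    rest <- poll stop
                    return ((now, present) : rest)

The hard part is the analysis: deciding, from where the first True falls
in that log, whether the store was atomic, given polling jitter and
however long the remote takes to make a stored object visible.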