[[!comment format=mdwn
 username="http://joeyh.name/"
 ip="209.250.56.96"
 subject="comment 3"
 date="2014-10-24T16:02:23Z"
 content="""
The OOM is [[S3_memory_leaks]]; fixed in the s3-aws branch.

Yeah, GET of a bucket is doable. Another problem with it, though: if the bucket has a lot of contents, such as many files, or large files split into many chunks, the whole listing has to be buffered in memory or processed as a stream. It would make sense in operations where git-annex knows it wants to check every key in a bucket; `git annex unused --from $s3remote` is the case that springs to mind where doing that could be quite useful. Integrating it with `get`, not so much.
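
For illustration, here's roughly what that bucket enumeration looks like from the command line. This is just a sketch of the request pattern, using the stock aws CLI and a placeholder bucket name, not anything git-annex does today:

    # One GET per page of up to 1000 keys; the CLI follows the
    # pagination markers itself, so each page can be handled as it
    # arrives rather than buffering the whole listing.
    aws s3api list-objects --bucket "$bucket" --query 'Contents[].Key' --output text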

I'd be inclined to demote this to a wishlist todo item: try to use bucket GET for `unused`. And/or rethink whether it makes sense for `copy --to` to run in `--fast` mode by default. I've gone back and forth on that question before, but only from a runtime perspective, not from a 13-cents perspective. ;)
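
For reference, the distinction in play there, as I understand the current behavior (the remote name is a placeholder):

    # default: asks the remote about each key before sending it,
    # one request per key -- that's where the 13 cents go
    git annex copy --to $s3remote
    # --fast: trust the location log and skip those per-key checks
    git annex copy --to $s3remote --fast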
"""]]