diff options
-rw-r--r-- | doc/bugs/added_branches_makes___39__git_annex_unused__39___slow/comment_5_7328bc51bd001f2b732a92a2ae175839._comment | 114 |
1 files changed, 114 insertions, 0 deletions
diff --git a/doc/bugs/added_branches_makes___39__git_annex_unused__39___slow/comment_5_7328bc51bd001f2b732a92a2ae175839._comment b/doc/bugs/added_branches_makes___39__git_annex_unused__39___slow/comment_5_7328bc51bd001f2b732a92a2ae175839._comment new file mode 100644 index 000000000..21890a4e1 --- /dev/null +++ b/doc/bugs/added_branches_makes___39__git_annex_unused__39___slow/comment_5_7328bc51bd001f2b732a92a2ae175839._comment @@ -0,0 +1,114 @@ +[[!comment format=mdwn + username="arand" + ip="130.243.226.21" + subject="comment 5" + date="2013-08-11T20:43:22Z" + content=""" +I've compared my bash/coreutils implementation mentioned above [annex-funused](https://gitorious.org/arand-scripts/arand-scripts/blobs/918cc79b99e22cbdca01ea4de10e2ca64abfc27a/annex-funused) with `git annex unused` in various situations, and from what I've seen `annex-funused` is pretty much always faster. + +In the case of no unused files they seem to be about the same. + +In all other cases there is a very considerable difference, for example, in my current main annex I get: + + $ time git annex unused + unused . (checking for unused data...) (checking master...) (checking synced/master...) (checking barracuda160G/master...) ok + + real 5m13.830s + user 2m0.444s + sys 0m28.344s + + +whereas + + $ time annex-funused + == WARNING == + This program should NOT be trusted to reliably find unused files in the + git annex. + + + real 0m1.569s + user 0m2.024s + sys 0m0.184s + +I tried to check memory usage via `/usr/bin/time -v` as well, and that showed (re-running in the same annex as above) + +annex-funused + + Maximum resident set size (kbytes): 13560 + +git annex unused + + Maximum resident set size (kbytes): 29120 + + +I've also written a comparison script [annex-testunused](https://gitorious.org/arand-scripts/arand-scripts/blobs/918cc79b99e22cbdca01ea4de10e2ca64abfc27a/annex-testunused) (needs annex-funused in $PATH) which creates an annex with a bunch of unused files and compares the running time for both versions: + +<pre> +$ annex-testunused +Initialized empty Git repository in /tmp/tmp.fmsAvsPTcd/.git/ +init ok +(Recording state in git...) +### +* b2840d7 (HEAD, master) delete ~1100 files +* c4a1e3a add 3000 files +* bc19777 (git-annex) update +* b3e6539 update +* bec2c8f branch created +annex unused +real 0m4.154s +real 0m2.029s +real 0m2.044s +annex funused +real 0m0.923s +real 0m0.933s +real 0m0.905s +Initialized empty Git repository in /tmp/tmp.7qFoCRWzB3/.git/ +init ok +(Recording state in git...) +### +* a5ff392 (HEAD, master) empty +* cca4810 (1) delete ~1100 files +* 587c406 add 3000 files +* de0afeb (git-annex) update +* 37b7881 update +* 1735062 branch created +annex unused +real 0m3.499s +real 0m3.443s +real 0m3.435s +annex funused +real 0m0.956s +real 0m0.956s +real 0m0.874s +Initialized empty Git repository in /tmp/tmp.L5fjdAgnFv/.git/ +init ok +(Recording state in git...) +### +* 94463a0 (HEAD, master) empty +* e115619 (10) empty +* 20686d4 (9) empty +* 2e01a3f (8) empty +* 043289d (7) empty +* 6a52966 (6) empty +* 0dc866d (5) empty +* 35db331 (4) empty +* 48504bc (3) empty +* e25cac7 (2) empty +* 655d026 (1) delete ~1100 files +* 91a07d1 add 3000 files +* 3c9ac62 (git-annex) update +* c5736e0 update +* 862d5b8 branch created +annex unused +real 0m16.242s +real 0m16.277s +real 0m16.246s +annex funused +real 0m0.960s +real 0m0.960s +real 0m0.927s +</pre> + +So, unless I've missed something fundamental (I keep thinking I might have...), it seems to be very consistently faster, and scale ok where `git annex unused` scales rather poorly. + +"""]] |