Diffstat (limited to 'doc/benchmarking/comment_10_1af4ac0d37c876912678522895c1656b._comment')
-rw-r--r-- doc/benchmarking/comment_10_1af4ac0d37c876912678522895c1656b._comment 61
1 files changed, 61 insertions, 0 deletions
diff --git a/doc/benchmarking/comment_10_1af4ac0d37c876912678522895c1656b._comment b/doc/benchmarking/comment_10_1af4ac0d37c876912678522895c1656b._comment
new file mode 100644
index 000000000..868b10364
--- /dev/null
+++ b/doc/benchmarking/comment_10_1af4ac0d37c876912678522895c1656b._comment
@@ -0,0 +1,61 @@
+[[!comment format=mdwn
+ username="joey"
+ subject="""comment 10"""
+ date="2016-09-29T18:33:33Z"
+ content="""
+* Optimised key2file and file2key. 18% scanning time speedup.
+* Optimised adjustGitEnv. 50% git-annex branch query speedup.
+* Optimised parsePOSIXTime. 10% git-annex branch query speedup.
+* Tried making catObjectDetails.receive use ByteString for parsing,
+ but that did not seem to speed it up significantly.
+ So its parsing is already fairly optimal; it's just that a
+ lot of data passes through it when querying the git-annex
+ branch.
+
+After all that, profiling `git-annex find`:
+
+    Thu Sep 29 16:51 2016 Time and Allocation Profiling Report  (Final)
+
+       git-annex.1 +RTS -p -RTS find
+
+    total time  =        1.73 secs   (1730 ticks @ 1000 us, 1 processor)
+    total alloc = 1,812,406,632 bytes  (excludes profiling overheads)
+
+    COST CENTRE             MODULE                  %time %alloc
+
+    md5                     Data.Hash.MD5            28.0   37.9
+    catchIO                 Utility.Exception        10.2   12.5
+    inAnnex'.checkindirect  Annex.Content             9.9    3.7
+    catches                 Control.Monad.Catch       8.7    5.7
+    readish                 Utility.PartialPrelude    5.7    3.0
+    isAnnexLink             Annex.Link                5.0    8.4
+    keyFile                 Annex.Locations           4.2    5.8
+    spanList                Data.List.Utils           4.0    6.3
+    startswith              Data.List.Utils           2.0    1.3
+
+And `git-annex find --not --in web`:
+
+    Thu Sep 29 16:35 2016 Time and Allocation Profiling Report  (Final)
+
+       git-annex +RTS -p -RTS find --not --in web
+
+    total time  =        5.24 secs   (5238 ticks @ 1000 us, 1 processor)
+    total alloc = 3,293,314,472 bytes  (excludes profiling overheads)
+
+    COST CENTRE               MODULE                  %time %alloc
+
+    catObjectDetails.receive  Git.CatFile              12.9    5.5
+    md5                       Data.Hash.MD5            10.6   20.8
+    readish                   Utility.PartialPrelude    7.3    8.2
+    catchIO                   Utility.Exception         6.7    7.3
+    spanList                  Data.List.Utils           4.1    7.4
+    readFileStrictAnyEncoding Utility.Misc              3.5    1.3
+    catches                   Control.Monad.Catch       3.3    3.2
+
+So, quite a large speedup overall!
+
+This leaves md5 still unoptimised at 10-28% of CPU use. I looked at switching
+it to cryptohash's implementation, but it would require quite a lot of
+bit-banging math to pull the used values out of the ByteString containing
+the md5sum.
+"""]]
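
The "bit-banging" mentioned at the end of the comment might look roughly like the following. This is only a minimal sketch, assuming the values needed are the four 32-bit words of the MD5 state, which a raw 16-byte digest stores in little-endian byte order (RFC 1321). `digestWords` is a hypothetical helper name, not part of git-annex or cryptohash, and the input is assumed to be the raw digest bytes rather than a hex string.

```haskell
import qualified Data.ByteString as B
import Data.Bits (shiftL, (.|.))
import Data.Word (Word8, Word32)

-- | Split a digest into 32-bit words, reading each group of four
-- bytes in little-endian order. Trailing bytes that do not fill a
-- whole word are dropped. (Hypothetical helper for illustration.)
digestWords :: B.ByteString -> [Word32]
digestWords bs
    | B.length bs < 4 = []
    | otherwise = word (B.unpack (B.take 4 bs)) : digestWords (B.drop 4 bs)
  where
    -- The first byte ends up in the lowest 8 bits of the word.
    word :: [Word8] -> Word32
    word = foldr (\byte acc -> (acc `shiftL` 8) .|. fromIntegral byte) 0

main :: IO ()
main = mapM_ print (digestWords (B.pack [0x01, 0x02, 0x03, 0x04, 0xff, 0, 0, 0]))
```

Whether this would be cheaper overall than the existing md5 cost centre is not something the profiles above can answer; it would need to be measured.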