author Joey Hess <joeyh@joeyh.name> 2016-09-26 15:49:42 -0400
committer Joey Hess <joeyh@joeyh.name> 2016-09-26 15:49:42 -0400
commit 17041a4e16931991c1794048d2574310df4af235
tree 665b6d6eb5f27b27f4a22bfb25946adc51fd1995
parent cebdbf41d8ff0c7d461173b72fb1cfb1f4d0e0ba
profiling
-rw-r--r-- doc/todo/make_copy_--fast__faster/comment_8_c1f99493f5e5c362d5c39f048280b11b._comment 45
1 files changed, 45 insertions, 0 deletions
diff --git a/doc/todo/make_copy_--fast__faster/comment_8_c1f99493f5e5c362d5c39f048280b11b._comment b/doc/todo/make_copy_--fast__faster/comment_8_c1f99493f5e5c362d5c39f048280b11b._comment
new file mode 100644
index 000000000..e0f4987a0
--- /dev/null
+++ b/doc/todo/make_copy_--fast__faster/comment_8_c1f99493f5e5c362d5c39f048280b11b._comment
@@ -0,0 +1,45 @@
+[[!comment format=mdwn
+ username="joey"
+ subject="""profiling"""
+ date="2016-09-26T19:20:36Z"
+ content="""
+Built git-annex with profiling, using `stack build --profile`
+
+(For reproducibility, running git-annex in a clone of the git-annex repo
+https://github.com/RichiH/conference_proceedings with rev
+2797a49023fc24aff6fcaec55421572e1eddcfa2 checked out. It has 9496 annexed
+objects.)
+
+Profiling `git-annex find +RTS -p`:
+
+    total time  =        3.53 secs   (3530 ticks @ 1000 us, 1 processor)
+    total alloc = 3,772,700,720 bytes  (excludes profiling overheads)
+
+    COST CENTRE             MODULE                   %time  %alloc
+
+    spanList                Data.List.Utils           32.6    37.7
+    startswith              Data.List.Utils           14.3     8.1
+    md5                     Data.Hash.MD5             12.4    18.2
+    join                    Data.List.Utils            6.9    13.7
+    catchIO                 Utility.Exception          5.9     6.0
+    catches                 Control.Monad.Catch        5.0     2.8
+    inAnnex'.checkindirect  Annex.Content              4.6     1.8
+    readish                 Utility.PartialPrelude     3.0     1.4
+    isAnnexLink             Annex.Link                 2.6     4.0
+    split                   Data.List.Utils            1.5     0.8
+    keyPath                 Annex.Locations            1.2     1.7
+
+
+This is interesting!
+
+More than half of CPU time and allocations go to list (really String) processing:
+`spanList`, `startswith`, and `join` total about 54% of time and 60% of allocations,
+and the detailed profiling report shows all three coming from calls to `replace` in
+`keyFile` and `fileKey`. Both functions nest several calls to `replace`, so perhaps they
+could be unwound into a single pass, and/or a ByteString used to do it more efficiently.
+
+12% of run time is spent calculating the md5 hashes for the hash
+directories for .git/annex/objects. Data.Hash.MD5 is from MissingH, and
+it is probably quite an unoptimised version. Switching to the version
+in cryptonite would probably speed it up a lot.
+"""]]
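
A rough sketch of the single-pass idea from the comment above, not the actual git-annex code. The comment does not show the bodies of `keyFile` and `fileKey`, so the escape table here (`&` to `&a`, `%` to `&s`, `:` to `&c`, `/` to `%`) is an assumption chosen for illustration, as are the names `escapeNested` and `escapeOnePass`. The local `replace` is a naive stand-in for MissingH's `Data.List.Utils.replace`, included only so the example builds without extra packages.

```haskell
module Main where

import Data.List (isPrefixOf)

-- Naive substring replacement, standing in for MissingH's `replace`.
-- Assumes `old` is non-empty.
replace :: String -> String -> String -> String
replace old new = go
  where
    go [] = []
    go s@(c:cs)
      | old `isPrefixOf` s = new ++ go (drop (length old) s)
      | otherwise          = c : go cs

-- Nested form: each `replace` walks the whole string, so this makes
-- four passes and builds three intermediate Strings.
-- (Hypothetical escape table; see the note above.)
escapeNested :: String -> String
escapeNested =
  replace "/" "%" . replace ":" "&c" . replace "%" "&s" . replace "&" "&a"

-- Single pass: every input character is inspected once and mapped
-- directly to its escaped form.
escapeOnePass :: String -> String
escapeOnePass = concatMap esc
  where
    esc '&' = "&a"
    esc '%' = "&s"
    esc ':' = "&c"
    esc '/' = "%"
    esc c   = [c]

main :: IO ()
main = do
  let samples = ["SHA256-s10--abc", "a/b:c%d&e", "&&//%%::", ""]
  -- Show the escaped form of each sample.
  mapM_ (\s -> putStrLn (show s ++ " -> " ++ show (escapeOnePass s))) samples
  -- Check that both forms agree on every sample.
  print (all (\s -> escapeNested s == escapeOnePass s) samples)
```

The two forms agree because each nested `replace` only introduces characters that earlier passes have already handled, so the chain collapses to a per-character mapping. Whether the same holds for the real `keyFile` depends on its actual escape rules, and the reverse direction (`fileKey`) would need a small parser for the two-character sequences rather than `concatMap`.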