author | Joey Hess <joeyh@joeyh.name> | 2016-09-26 15:49:42 -0400 |
---|---|---|
committer | Joey Hess <joeyh@joeyh.name> | 2016-09-26 15:49:42 -0400 |
commit | 17041a4e16931991c1794048d2574310df4af235 (patch) | |
tree | 665b6d6eb5f27b27f4a22bfb25946adc51fd1995 /doc/todo | |
parent | cebdbf41d8ff0c7d461173b72fb1cfb1f4d0e0ba (diff) |
profiling
Diffstat (limited to 'doc/todo')
-rw-r--r-- | doc/todo/make_copy_--fast__faster/comment_8_c1f99493f5e5c362d5c39f048280b11b._comment | 45 |
1 files changed, 45 insertions, 0 deletions
diff --git a/doc/todo/make_copy_--fast__faster/comment_8_c1f99493f5e5c362d5c39f048280b11b._comment b/doc/todo/make_copy_--fast__faster/comment_8_c1f99493f5e5c362d5c39f048280b11b._comment
new file mode 100644
index 000000000..e0f4987a0
--- /dev/null
+++ b/doc/todo/make_copy_--fast__faster/comment_8_c1f99493f5e5c362d5c39f048280b11b._comment
@@ -0,0 +1,45 @@
+[[!comment format=mdwn
+ username="joey"
+ subject="""profiling"""
+ date="2016-09-26T19:20:36Z"
+ content="""
+Built git-annex with profiling, using `stack build --profile`
+
+(For reproducibility, running git-annex in a clone of the git-annex repo
+https://github.com/RichiH/conference_proceedings with rev
+2797a49023fc24aff6fcaec55421572e1eddcfa2 checked out. It has 9496 annexed
+objects.)
+
+Profiling `git-annex find +RTS -p`:
+
+	total time  =        3.53 secs   (3530 ticks @ 1000 us, 1 processor)
+	total alloc = 3,772,700,720 bytes  (excludes profiling overheads)
+
+	COST CENTRE            MODULE                  %time %alloc
+
+	spanList               Data.List.Utils          32.6   37.7
+	startswith             Data.List.Utils          14.3    8.1
+	md5                    Data.Hash.MD5            12.4   18.2
+	join                   Data.List.Utils           6.9   13.7
+	catchIO                Utility.Exception         5.9    6.0
+	catches                Control.Monad.Catch       5.0    2.8
+	inAnnex'.checkindirect Annex.Content             4.6    1.8
+	readish                Utility.PartialPrelude    3.0    1.4
+	isAnnexLink            Annex.Link                2.6    4.0
+	split                  Data.List.Utils           1.5    0.8
+	keyPath                Annex.Locations           1.2    1.7
+
+
+This is interesting!
+
+Fully 40% of CPU time and allocations are in list (really String) processing,
+and the details of the profiling report show that `spanList` and `startswith`
+and `join` are all coming from calls to `replace` in `keyFile` and `fileKey`.
+Both functions nest several calls to replace, so perhaps that could be unwound
+into a single pass and/or a ByteString used to do it more efficiently.
+
+12% of run time is spent calculating the md5 hashes for the hash
+directories for .git/annex/objects. Data.Hash.MD5 is from missingh, and
+it is probably a quite unoptimised version. Switching to the version
+in cryptonite would probably speed it up a lot.
+"""]]
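
The comment suggests unwinding the nested `replace` calls in `keyFile` and `fileKey` into a single pass. A minimal sketch of that idea follows, using only `base`. The escape table and the names `escapeChar`, `escapeKey` and `unescapeKey` are assumptions made for illustration; they are not taken from the git-annex source, and the real mapping may differ.

```haskell
module Main where

-- Hypothetical escape table, for illustration only. The mapping that
-- git-annex's keyFile actually uses may differ.
escapeChar :: Char -> String
escapeChar '/' = "%"
escapeChar '%' = "&s"
escapeChar '&' = "&a"
escapeChar ':' = "&c"
escapeChar c   = [c]

-- One traversal of the input, instead of one traversal per nested
-- `replace` call.
escapeKey :: String -> FilePath
escapeKey = concatMap escapeChar

-- Inverse of escapeKey, also a single traversal.
unescapeKey :: FilePath -> String
unescapeKey ('&':'s':rest) = '%' : unescapeKey rest
unescapeKey ('&':'a':rest) = '&' : unescapeKey rest
unescapeKey ('&':'c':rest) = ':' : unescapeKey rest
unescapeKey ('%':rest)     = '/' : unescapeKey rest
unescapeKey (c:rest)       = c : unescapeKey rest
unescapeKey []             = []

main :: IO ()
main = do
  let key = "SHA256-s10--a/b%c&d:e"
  putStrLn (escapeKey key)
  -- Round trip: unescaping the escaped key should give the key back.
  print (unescapeKey (escapeKey key) == key)
```

This still works on `String`, so it only removes the repeated traversals and the intermediate lists built by each `replace`. Moving to `ByteString`, as the comment also suggests, would be a separate change.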