diff options
author | Pierre-Marie Pédrot <pierre-marie.pedrot@inria.fr> | 2017-02-16 12:26:13 +0100 |
---|---|---|
committer | Pierre-Marie Pédrot <pierre-marie.pedrot@inria.fr> | 2017-02-19 20:11:50 +0100 |
commit | 2b8ad7e04002ebe9fec5790da924673418f2fa7f (patch) | |
tree | b758d62d16e90d05fa56516fed81604c8d873ac3 /kernel/cClosure.ml | |
parent | 7707396c5010d88c3d0be6ecee816d8da7ed0ee0 (diff) |
Optimizing array mapping in the kernel.
We unroll the map operation by hand in two performance-critical cases so as
not to call the generic array allocation function in OCaml, and allocate
directly on the minor heap instead. The generic array function is slow because
it needs to discriminate between float and non-float arrays. The unrolling
replaces this by a simple increment to the minor heap pointer and moves from
the stack.
The quantity of unrolling was determined by experimental measures on various Coq
developments. It looks like most of the maps are for small arrays of size lesser
or equal to 4, so this is what is implemented here. We could probably extend it
to an even bigger number, but that would result in ugly code. From what I've seen,
virtually all maps are of size less than 16, so that we could probably be almost
optimal by going up to 16 unrollings, but the code tradeoffs are not obvious. Maybe
when we have PPX?
Diffstat (limited to 'kernel/cClosure.ml')
-rw-r--r-- | kernel/cClosure.ml | 11 |
1 files changed, 10 insertions, 1 deletions
diff --git a/kernel/cClosure.ml b/kernel/cClosure.ml index fe9ec5794..b1dd26119 100644 --- a/kernel/cClosure.ml +++ b/kernel/cClosure.ml @@ -540,7 +540,16 @@ let mk_clos e t = | (CoFix _|Lambda _|Fix _|Prod _|Evar _|App _|Case _|Cast _|LetIn _|Proj _) -> {norm = Red; term = FCLOS(t,e)} -let mk_clos_vect env v = CArray.Fun1.map mk_clos env v +(** Hand-unrolling of the map function to bypass the call to the generic array + allocation *) +let mk_clos_vect env v = match v with +| [||] -> [||] +| [|v0|] -> [|mk_clos env v0|] +| [|v0; v1|] -> [|mk_clos env v0; mk_clos env v1|] +| [|v0; v1; v2|] -> [|mk_clos env v0; mk_clos env v1; mk_clos env v2|] +| [|v0; v1; v2; v3|] -> + [|mk_clos env v0; mk_clos env v1; mk_clos env v2; mk_clos env v3|] +| v -> CArray.Fun1.map mk_clos env v (* Translate the head constructor of t from constr to fconstr. This function is parameterized by the function to apply on the direct |