From e5a7979d36a84831652abf3138c4d76797c002b5 Mon Sep 17 00:00:00 2001 From: Abseil Team Date: Mon, 28 Nov 2022 12:27:06 -0800 Subject: Reduce flat_hash_{set,map} generated code size. MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit This CL makes a bunch of changes (mostly to raw_hash_set which underlies flat_hash_set and flat_hash_map). Techniques used: * Extract code that does not depend on the specific hash table type into common (non-inlined) functions. * Place ABSL_ATTRIBUTE_NOINLINE directives judiciously. * Out-of-line some slow paths. Reduces sizes of some large binaries by ~0.5%. Has no significant performance impact on a few performance critical binaries. ## Speed of fleetbench micro-benchmarks Following is a histogram of %-age changes in [fleetbench](https://github.com/google/fleetbench) hot_swissmap_benchmark results. Negative numbers indicate a speedup caused by this change. Statistically insignificant changes are mapped to zero. XXX Also run and merge in cold_swissmap_benchmark Across all 351 benchmarks, the average speedup is 0.38%. The best speedup was -25%, worst slowdown was +6.81%. ``` Count: 351 Average: -0.382764 StdDev: 3.77807 Min: -25 Median: 0.435135 Max: 6.81 --------------------------------------------- [ -25, -10) 16 4.558% 4.558% # [ -9, -8) 2 0.570% 5.128% [ -8, -7) 1 0.285% 5.413% [ -7, -6) 1 0.285% 5.698% [ -6, -5) 2 0.570% 6.268% [ -5, -4) 5 1.425% 7.692% [ -4, -3) 13 3.704% 11.396% # [ -3, -2) 15 4.274% 15.670% # [ -2, -1) 26 7.407% 23.077% ## [ -1, 0) 14 3.989% 27.066% # [ 0, 1) 185 52.707% 79.772% ############ [ 1, 2) 14 3.989% 83.761% # [ 2, 3) 8 2.279% 86.040% # [ 3, 4) 7 1.994% 88.034% [ 4, 5) 32 9.117% 97.151% ## [ 5, 6) 6 1.709% 98.860% [ 6, 7) 4 1.140% 100.000% ``` We looked at the slowdowns and they do not seem worth worrying about. E.g., the worst one was: ``` BM_FindHit_Hot<::absl::node_hash_set,64>/set_size:4096/density:0 2.61ns ± 1% 2.79ns ± 1% +6.81% (p=0.008 n=5+5) ``` ## Detailed changes * Out-of-line slow paths in hash table sampler methods. * Explicitly unregister from sampler instead of from destructor. * Introduced a non-templated CommonFields struct that holds some of the hash table fields (infoz, ctrl, slots, size, capacity). This struct can be passed to new non-templated helpers. The struct is a private base class of raw_hash_set. * Made non-inlined InitializeSlots<> that is only templated on allocator and size/alignment of the slot type so that we can share instantiations across types that have the same size/alignment. * Moved some infrequently called code paths into non-inlined type-erased. functions. Pass a suite of type-specific function pointers to these routines for when they need to operate on slots. * Marked some methods as non-inlined. * Avoid unnecessary reinitialization in destructor. * Introduce UpdateSpine type-erased helper that is called from clear() and rehash(). PiperOrigin-RevId: 491413386 Change-Id: Ia5495c5a6ec73622a785a0d260e406ddb9085a7c --- absl/container/internal/raw_hash_set_test.cc | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) (limited to 'absl/container/internal/raw_hash_set_test.cc') diff --git a/absl/container/internal/raw_hash_set_test.cc b/absl/container/internal/raw_hash_set_test.cc index daa32814..74351c09 100644 --- a/absl/container/internal/raw_hash_set_test.cc +++ b/absl/container/internal/raw_hash_set_test.cc @@ -56,8 +56,8 @@ namespace container_internal { struct RawHashSetTestOnlyAccess { template - static auto GetSlots(const C& c) -> decltype(c.slots_) { - return c.slots_; + static auto GetSlots(const C& c) -> decltype(c.slot_array()) { + return c.slot_array(); } }; @@ -455,12 +455,12 @@ TEST(Table, EmptyFunctorOptimization) { static_assert(std::is_empty>::value, ""); struct MockTable { + void* infoz; void* ctrl; void* slots; size_t size; size_t capacity; size_t growth_left; - void* infoz; }; struct MockTableInfozDisabled { void* ctrl; -- cgit v1.2.3