summaryrefslogtreecommitdiff
path: root/absl/crc
Commit message (Collapse)AuthorAge
* Fix incorrect CRC returned by AcceleratedCrcMemcpyEngine when kRegions == 1Gravatar Abseil Team2023-08-31
| | | | | | | | | | | | This bug does not affect any users currently since AcceleratedCrcMemcpyEngine is never configured with a single region currently. Before this CL, if the number of regions for the AcceleratedCrcMemcpyEngine was set to one, the CRC for the sole region would be incorrectly concatenated onto itself and corrupted. PiperOrigin-RevId: 561663848 Change-Id: Ibfc596306ab07db906d2e3ecf6eea3f6cb9f1b2b
* Add CPU detection for Ampere SirynGravatar Abseil Team2023-08-30
| | | | | PiperOrigin-RevId: 561444259 Change-Id: I205ba9f11f4d41163ce74ae9cfa417fe500ccab3
* Enable non_temporal_store_memcpy for AMD Milan, Genoa, and Ryzen 3000Gravatar Abseil Team2023-08-29
| | | | | PiperOrigin-RevId: 561119886 Change-Id: Ia1483fdb237f4b211068c7ad1f780ab3e6b81eca
* Add CPU detection for AMD Genoa and Ryzen 3000Gravatar Abseil Team2023-08-29
| | | | | PiperOrigin-RevId: 561108037 Change-Id: Idff65e288384cb55ce69f789db2d9374ae781d3d
* Use fallback engine for as the non-temporal engine for unknown CPU typesGravatar Abseil Team2023-08-29
| | | | | | | | | Using the non-temporal AVX engine for unknown CPU types looks like a mistake to me, and the default built into the switch case is to use the fallback engine. I don't think this is causing issues now, but it might once we add ARM support. PiperOrigin-RevId: 561097994 Change-Id: I7f0edd447017c09acd49e4ea11476e32740d630a
* Include what you spellGravatar Dmitri Gribenko2023-08-08
| | | | | PiperOrigin-RevId: 554936252 Change-Id: Idb2ffbbc11aa6c98414fdd1ec38873d4687ab5e7
* Implement AbslStringify for crc32c_t in order to support absl::StrFormat ↵Gravatar Abseil Team2023-08-01
| | | | | | | natively PiperOrigin-RevId: 552940359 Change-Id: I925764757404c0c9f2a13ed729190d51f4ac46cf
* Changes absl::crc32c_t insertion operator (<<) to return value as 0-padded ↵Gravatar Abseil Team2023-08-01
| | | | | | | hex instead of dec PiperOrigin-RevId: 552927211 Change-Id: I0375d60a9df4cdfc694fe8d3b3d790f80fc614a1
* Remove deprecated function.Gravatar Abseil Team2023-07-31
| | | | | PiperOrigin-RevId: 552638642 Change-Id: I6b43289ca10ee9aecd6b848e78471863b22b01d1
* Removes unused methods CRC::Empty() and CRC::Concat() from the internalGravatar Abseil Team2023-06-12
| | | | | | | implementation. PiperOrigin-RevId: 539749773 Change-Id: Iec83431ffd360a077b153cea00427580ae287d1f
* Add a declaration for __cpuid for the IntelLLVM compiler.Gravatar niranjan-nilakantan2023-05-23
| | | | | | | | | | | | Imported from GitHub PR https://github.com/abseil/abseil-cpp/pull/1452 __cpuid is declared in intrin.h, but is excluded on non-Windows platforms. We add this declaration to compensate. Fixes #1358 PiperOrigin-RevId: 534449804 Change-Id: I91027f79d8d52c4da428d5c3a53e2cec00825c13
* Rollback of add a declaration for __cpuid for the IntelLLVM compiler.Gravatar Abseil Team2023-05-22
| | | | | PiperOrigin-RevId: 534213948 Change-Id: I56b897060b9afe9d3d338756c80e52f421653b55
* Merge pull request #1452 from niranjan-nilakantan:niranjan-nilakantan/issue1358Gravatar Copybara-Service2023-05-22
| | | | | PiperOrigin-RevId: 534179290 Change-Id: I9ad24518cc6a336fbaf602269fb01319491c8b60
* Fix spelling mistakesGravatar Vertexwahn2023-05-02
|
* Merge pull request #1434 from Vertexwahn:fix-spellingGravatar Copybara-Service2023-04-25
|\ | | | | | | | | PiperOrigin-RevId: 527066823 Change-Id: Ifa1e9a43c7490b34f9f4dbfa12d3acbed6b49777
| * Fix some spelling mistakesGravatar Vertexwahn2023-04-24
|/
* Workaround for MSVC warning that designated initializers are a C++20 featureGravatar Derek Mauro2023-03-15
| | | | | | | | | | | | https://google.github.io/styleguide/cppguide.html#Designated_initializers recommends using designated initializers as does https://abseil.io/tips/172, but apparently they are a non-standard extension prior to C++20. For maximum compatibility, avoid using them here. Fixes #1413 PiperOrigin-RevId: 516892890 Change-Id: Id7b7857891e39eb52132c3edf70e5bf4973755af
* Use const and static for member functionsGravatar Rose2023-03-07
| | | | This shows that these are member functions that do not modify a class's data.
* Merge pull request #1394 from AtariDreams:constructorsGravatar Copybara-Service2023-02-21
|\ | | | | | | | | PiperOrigin-RevId: 511271203 Change-Id: I1ed352e06265b705b62d401a50b4699d01f7f1d7
| * Convert empty constructors to default onesGravatar Rose2023-02-17
| | | | | | | | These make the changed constructors match closer to the other ones that are default.
* | Prefer emplace back over push_back where emplace_back is more appropriateGravatar Rose2023-02-16
|/ | | | This also helps a lot with dealing with conversions and data structure creation under the hood.
* Don't assume that AVX implies PCLMULQDQ when using LLVM on Windows.Gravatar Saran Tunyasuvunakool2023-02-07
| | | | | PiperOrigin-RevId: 507790741 Change-Id: I347357f9a2d698510f29b7d1b065ef73f9289292
* Replace absl::base_internal::Prefetch* calls with absl::Prefetch* callsGravatar Martijn Vels2023-01-27
| | | | | PiperOrigin-RevId: 505184961 Change-Id: I64482558a76abda6896bec4b2d323833b6cd7edf
* Optimize RemoveCrc32cSuffix.Gravatar Abseil Team2023-01-17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The implementation can be optimized to not having to perform an ExtendByZero operation. `RemoveCrc32cSuffix` can simply be implemented as uint32_t result = static_cast<uint32_t>(full_string_crc) ^ static_cast<uint32_t>(suffix_crc); CrcEngine()->UnextendByZeroes(&result, suffix_len); return crc32c_t{result}; Math proof that this change is correct: `ComputeCrc32c` actually computes the following: ConditionedCRC(data) = UnconditionedCRC(data) + StartValue(data) + ~0 with: StartValue(data) = ~0 * x**BitLength(data) mod P (with `+` being a carry-less add, ie an xor). ``UnconditionedCRC` in the context of this description means: no initial or final xor with ~0 and a starting value of zero - ie the result that `CrcEngine()->Extend` would give you with a starting value of 0. Given `full_string_crc` and `suffix_crc` (both conditioned CRCs), xoring them together results in: (1): full_string_crc + suffix_crc = UnconditionedCRC(full_string) + StartValue(full_string) + ~0 + UnconditionedCRC(suffix) + StartValue(suffix) + ~0 Since `+` is carry-less addition (ie an XOR), the two ~0 cancel each other out. (2) full_string_crc + suffix_crc = UnconditionedCRC(full_string) + StartValue(full_string) + UnconditionedCRC(suffix) + StartValue(suffix) We can make use of the fact that: (3) UnconditionedCRC(full_string) + UnConditionedCRC(suffix) = UnconditionedCRC(full_string_with_suffix_replaced_by_zeros). Ie, UnconditionedCRC("AABBB") + UnconditionedCRC("BBB") = UnconditionedCRC("AA\0\0\0") Putting (3) into (2) yields: (4) full_string_crc + suffix_crc = UnconditionedCRC(full_string_with_suffix_replaced_by_zeros) + StartValue(full_string) + StartValue(suffix) Using: (5) UnconditionedCRC(full_string_with_suffix_replaced_by_zeros) = UnconditionedCRC(full_string_without_suffix) * x**Bitlength(suffix) mod P and putting (5) into (4) (6) full_string_crc + suffix_crc = UnconditionedCRC(full_string_without_suffix) * x**Bitlength(suffix) mod P + StartValue(full_string) + StartValue(suffix) Using (7) StartValue(full_string) = ~0 * x ** Bitlength(full_string) mod P and (8) StartValue(suffix) = ~0 * x**BitLength(suffix) mod P Putting (7) and (8) in (6): (9): full_string_crc + suffix_crc = UnconditionedCRC(full_string_without_suffix) * x**(Bitlength(suffix)) mod P + ~0 * x ** Bitlength(full_string) mod P + ~0 * x ** BitLength(suffix) mod P Using: (10) Bitlength(full_string) = Bitlength(full_string_without_suffix) + Bitlength(suffix) And putting (10) in (9): (11) full_string_crc + suffix_crc = UnconditionedCRC(full_string_without_suffix) * x**(Bitlength(suffix)) mod P + ~0 * x ** (Bitlength(full_string_without_suffix) + Bitlength(suffix)) mod P + ~0 * x ** BitLength(suffix) mod P using x**(A+B) = x**A * x**B results in: (12) full_string_crc + suffix_crc = UnconditionedCRC(full_string_without_suffix) * x**(Bitlength(suffix)) mod P + [ ~0 * x ** Bitlength(full_string_without_suffix) * x**Bitlength(suffix)] mod P + ~0 * x ** BitLength(suffix) mod P using A mod P + B mod P + C mod P = (A + B + C) mod P: (this works in carry-less arithmetic) (13) full_string_crc + suffix_crc = [ UnconditionedCRC(full_string_without_suffix) * x**(Bitlength(suffix)) + [ ~0 * x ** Bitlength(full_string_without_suffix) * x**Bitlength(suffix)] + ~0 * x ** BitLength(suffix) ] mod P Factor out x**Bitlength(suffix): (14) full_string_crc + suffix_crc = [ x**(Bitlength(suffix)) * [ UnconditionedCRC(full_string_without_suffix) + ~0 * x ** Bitlength(full_string_without_suffix) + ~0 ] mod P Using: (15) ConditionedCRC(full_string_without_suffix) = [ UnconditionedCRC(full_string_without_suffix) + ~0 * x ** Bitlength(full_string_without_suffix) ] mod P + ~0 = [ UnconditionedCRC(full_string_without_suffix) + ~0 * x ** Bitlength(full_string_without_suffix) + ~0] mod P (~0 is less than x**32, so ~0 mod P = ~0) Putting (15) in (14) results in: full_string_crc + suffix_crc = [ x**(Bitlength(suffix)) * ConditionedCRC(full_string_without_suffix)] mod P Or: (16) ConditionedCRC(full_string_without_suffix) = (full_string_crc + suffix_crc) * x**(-Bitlength(suffix)) mod P A multiplication by x**(-8*bytelength) mod P is implemented by `CrcEngine()->UnextendByZeros`. PiperOrigin-RevId: 502659140 Change-Id: I66b0700d258f948be0885f691370b73d7fad56e3
* Don't use Arm vector intrinsics when compiling with CUDA in device mode.Gravatar Abseil Team2023-01-11
| | | | | PiperOrigin-RevId: 501464530 Change-Id: I5a0929a2b88c1c158b1696634a65ffda9c4b8590
* Require 64-bit builds on x86 to use AcceleratedCrcMemcpyEngineGravatar Derek Mauro2023-01-05
| | | | | | | | This also ensures that there is only one definition of GetArchSpecificEngines by moving the condition to a common place. PiperOrigin-RevId: 500038304 Change-Id: If0c55d701dfdc11a1a9c8c1b34eb220435529ffb
* Require 64-bit builds on x86 to use CRC32 hardware accelerationGravatar Derek Mauro2023-01-04
| | | | | | | | 32-bit builds with SSE 4.2 do exist, and these builds do not work without this patch. PiperOrigin-RevId: 499498979 Change-Id: I0ade09068804655652c07d0f1ef13554464a1558
* Add prefetch to crc32Gravatar Ilya Tokar2022-12-13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We already prefetch in case of large inputs, do the same for medium sized inputs as well. This is mostly neutral for performance in most cases, so this also adds a new bench with working size >> cache size to ensure that we are seeing performance benefits of prefetch. Main benefits are on AMD with hardware prefetchers turned off: AMD prefetchers on: name old time/op new time/op delta BM_Calculate/0 2.43ns ± 1% 2.43ns ± 1% ~ (p=0.814 n=40+40) BM_Calculate/1 2.50ns ± 2% 2.50ns ± 2% ~ (p=0.745 n=39+39) BM_Calculate/100 9.17ns ± 1% 9.17ns ± 2% ~ (p=0.747 n=40+40) BM_Calculate/10000 474ns ± 1% 474ns ± 2% ~ (p=0.749 n=40+40) BM_Calculate/500000 22.8µs ± 1% 22.9µs ± 2% ~ (p=0.298 n=39+40) BM_Extend/0 1.38ns ± 1% 1.38ns ± 1% ~ (p=0.651 n=40+40) BM_Extend/1 1.53ns ± 2% 1.53ns ± 1% ~ (p=0.957 n=40+39) BM_Extend/100 9.48ns ± 1% 9.48ns ± 2% ~ (p=1.000 n=40+40) BM_Extend/10000 474ns ± 2% 474ns ± 1% ~ (p=0.928 n=40+40) BM_Extend/500000 22.8µs ± 1% 22.9µs ± 2% ~ (p=0.331 n=40+40) BM_Extend/100000000 4.79ms ± 1% 4.79ms ± 1% ~ (p=0.753 n=38+38) BM_ExtendCacheMiss/10 25.5ms ± 2% 25.5ms ± 2% ~ (p=0.988 n=38+40) BM_ExtendCacheMiss/100 23.1ms ± 2% 23.1ms ± 2% ~ (p=0.792 n=40+40) BM_ExtendCacheMiss/1000 37.2ms ± 1% 28.6ms ± 2% -23.00% (p=0.000 n=38+40) BM_ExtendCacheMiss/100000 7.77ms ± 2% 7.74ms ± 2% -0.45% (p=0.006 n=40+40) AMD prefetchers off: name old time/op new time/op delta BM_Calculate/0 2.43ns ± 2% 2.43ns ± 2% ~ (p=0.351 n=40+39) BM_Calculate/1 2.51ns ± 2% 2.51ns ± 1% ~ (p=0.535 n=40+40) BM_Calculate/100 9.18ns ± 2% 9.15ns ± 2% ~ (p=0.120 n=38+39) BM_Calculate/10000 475ns ± 2% 475ns ± 2% ~ (p=0.852 n=40+40) BM_Calculate/500000 22.9µs ± 2% 22.8µs ± 2% ~ (p=0.396 n=40+40) BM_Extend/0 1.38ns ± 2% 1.38ns ± 2% ~ (p=0.466 n=40+40) BM_Extend/1 1.53ns ± 2% 1.53ns ± 2% ~ (p=0.914 n=40+39) BM_Extend/100 9.49ns ± 2% 9.49ns ± 2% ~ (p=0.802 n=40+40) BM_Extend/10000 475ns ± 2% 474ns ± 1% ~ (p=0.589 n=40+40) BM_Extend/500000 22.8µs ± 2% 22.8µs ± 2% ~ (p=0.872 n=39+40) BM_Extend/100000000 10.0ms ± 3% 10.0ms ± 4% ~ (p=0.355 n=40+40) BM_ExtendCacheMiss/10 196ms ± 2% 196ms ± 2% ~ (p=0.698 n=40+40) BM_ExtendCacheMiss/100 129ms ± 1% 129ms ± 1% ~ (p=0.602 n=36+37) BM_ExtendCacheMiss/1000 88.6ms ± 1% 57.2ms ± 1% -35.49% (p=0.000 n=36+38) BM_ExtendCacheMiss/100000 14.9ms ± 1% 14.9ms ± 1% ~ (p=0.888 n=39+40) Intel skylake: BM_Calculate/0 2.49ns ± 2% 2.44ns ± 4% -2.15% (p=0.001 n=31+34) BM_Calculate/1 3.04ns ± 2% 2.98ns ± 9% -1.95% (p=0.003 n=31+35) BM_Calculate/100 8.64ns ± 3% 8.53ns ± 5% ~ (p=0.065 n=31+35) BM_Calculate/10000 290ns ± 3% 285ns ± 7% -1.80% (p=0.004 n=28+34) BM_Calculate/500000 11.8µs ± 2% 11.6µs ± 8% -1.59% (p=0.003 n=26+34) BM_Extend/0 1.56ns ± 1% 1.52ns ± 3% -2.44% (p=0.000 n=26+35) BM_Extend/1 1.88ns ± 3% 1.83ns ± 6% -2.17% (p=0.001 n=27+35) BM_Extend/100 9.31ns ± 3% 9.13ns ± 7% -1.92% (p=0.000 n=33+38) BM_Extend/10000 290ns ± 3% 283ns ± 3% -2.45% (p=0.000 n=32+38) BM_Extend/500000 11.8µs ± 2% 11.5µs ± 8% -1.80% (p=0.001 n=35+37) BM_Extend/100000000 6.39ms ±10% 6.11ms ± 8% -4.34% (p=0.000 n=40+40) BM_ExtendCacheMiss/10 36.2ms ± 7% 35.8ms ±14% ~ (p=0.281 n=33+37) BM_ExtendCacheMiss/100 26.9ms ±15% 25.9ms ±12% -3.93% (p=0.000 n=40+40) BM_ExtendCacheMiss/1000 23.8ms ± 5% 23.4ms ± 5% -1.68% (p=0.001 n=39+40) BM_ExtendCacheMiss/100000 10.1ms ± 5% 10.0ms ± 4% ~ (p=0.051 n=39+39) PiperOrigin-RevId: 495119444 Change-Id: I67bcf3b0282b5e1c43122de2837a24c16b8aded7
* Add a define for HWCAP_CPUID on platforms that are missing itGravatar Derek Mauro2022-12-12
| | | | | PiperOrigin-RevId: 494749165 Change-Id: I8d855be9c508a9fdfb5f60e87471c0947057ecc9
* Allow Cord to store chunked checksumsGravatar Derek Mauro2022-12-11
| | | | | PiperOrigin-RevId: 494587777 Change-Id: I41504edca6fcf750d52602fa84a33bc7fe5fbb48
* Merge pull request #1338 from MBkkt:patch-5Gravatar Copybara-Service2022-12-06
|\ | | | | | | | | PiperOrigin-RevId: 493386604 Change-Id: I289cb38b4a3da5760ab7ef3976d402d165d7e10f
| * Update non_temporal_memcpy.hGravatar Valery Mironov2022-12-06
|/
* Fixes many compilation issues that come from having no external CIGravatar Derek Mauro2022-11-30
| | | | | | | | | | | | | | | | | | coverage of the accelerated CRC implementation and some differences bewteen the internal and external implementation. This change adds CI coverage to the linux_clang-latest_libstdcxx_bazel.sh script assuming this script always runs on machines of at least the Intel Haswell generation. Fixes include: * Remove the use of the deprecated xor operator on crc32c_t * Remove #pragma unroll_completely, which isn't known by GCC or Clang: https://godbolt.org/z/97j4vbacs * Fixes for -Wsign-compare, -Wsign-conversion and -Wshorten-64-to-32 PiperOrigin-RevId: 491965029 Change-Id: Ic5e1f3a20f69fcd35fe81ebef63443ad26bf7931
* Remove unused iostream include from crc32c.hGravatar Derek Mauro2022-11-29
| | | | | PiperOrigin-RevId: 491722639 Change-Id: Iff13661095d10c82599ad30f7220700825a78c9e
* Fix MSVC builds that reject C-style arrays of size 0Gravatar Derek Mauro2022-11-29
| | | | | | | | | | std::array has a special-case to allow this https://en.cppreference.com/w/cpp/container/array Fixes #1332 PiperOrigin-RevId: 491703960 Change-Id: Ib83a1f0865448314e463e8ebf39ae3b842f762ea
* Remove deprecated use of absl::ToCrc32c()Gravatar Derek Mauro2022-11-29
| | | | | PiperOrigin-RevId: 491681300 Change-Id: I4ecdd3bf359cda7592b6c392a2fbb61b8394f71b
* CRC: Make crc32c_t as a class for explicit control of operatorsGravatar Derek Mauro2022-11-29
| | | | | | | | | The motivation is to explicitly remove and document dangerous operations like adding crc32c_t to a set, because equality is not enough to guarantee uniqueness. PiperOrigin-RevId: 491656425 Change-Id: I7b4dadc1a59ea9861e6ec7a929d64b5746467832
* Avoid using the non-portable type __m128i_u.Gravatar Derek Mauro2022-11-28
| | | | | | | | | | | | According to https://stackoverflow.com/a/68939636 it is safe to use __m128i instead. https://learn.microsoft.com/en-us/cpp/intrinsics/x86-intrinsics-list?view=msvc-170 also uses this type instead Fixes #1330 PiperOrigin-RevId: 491427300 Change-Id: I4a1d44ac4d5e7c1e1ee063ff397935df118254a1
* Use ABSL_HAVE_BUILTIN to fix -Wundef __has_builtin warningGravatar Derek Mauro2022-11-28
| | | | | | | Fixes #1329 PiperOrigin-RevId: 491372279 Change-Id: I93c094b06ece9cb9bdb39fd4541353e0344a1a57
* Fix AMD cpu detection.Gravatar Ilya Tokar2022-11-23
| | | | | | | | | | | | | | Currently we take generic/default code-path on AMD due to misspelling. Mostly helps with crc+memcpy: name old speed new speed delta BM_Memcpy/1 156MB/s ± 1% 156MB/s ± 1% ~ (p=0.563 n=18+18) BM_Memcpy/100 6.38GB/s ± 1% 6.50GB/s ± 1% +1.89% (p=0.000 n=19+19) BM_Memcpy/10000 14.6GB/s ± 1% 21.7GB/s ± 0% +49.01% (p=0.000 n=20+19) BM_Memcpy/500000 13.5GB/s ± 1% 19.9GB/s ± 0% +47.35% (p=0.000 n=18+17) PiperOrigin-RevId: 490572650 Change-Id: Id7901321a23262c0ab62a2d82fae86cf42acf16d
* CRC: Get CPU detection and hardware acceleration working on MSVC x86(_64)Gravatar Derek Mauro2022-11-23
| | | | | | | Using /arch:AVX on MSVC now uses the accelerated implementation PiperOrigin-RevId: 490550573 Change-Id: I924259845f38ee41d15f23f95ad085ad664642b5
* Use the correct Bazel copts in crc targetsGravatar Derek Mauro2022-11-14
| | | | | PiperOrigin-RevId: 488373221 Change-Id: I1e30820188cc860ce4df8fddafa04de343ec46af
* CRC: Ensure SupportsArmCRC32PMULL() is definedGravatar Derek Mauro2022-11-10
| | | | | | | | | | | | This fixes the build on arm64 macOS. Note that hardware acceleration is not yet enabled on arm64 when not running under Linux. Addresses the report from https://github.com/abseil/abseil-cpp/commit/1687dbf814eceb93de2d93f91b31acaab404091c#commitcomment-89529264 PiperOrigin-RevId: 487655295 Change-Id: I168dfc863c960d0b694b26dfcb85ff0fd0e95a1e
* Release the CRC libraryGravatar Derek Mauro2022-11-09
This implementation can advantage of hardware acceleration available on common CPUs when using GCC and Clang. A future update may enable this on MSVC as well. PiperOrigin-RevId: 487327024 Change-Id: I99a8f1bcbdf25297e776537e23bd0a902e0818a1