| Commit message | Author | Age |
| |
For empty or single-column matrices, `PartialPivLU` currently
dereferences a `nullptr` or accesses memory out of bounds.
Here we adjust the checks to avoid this.
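As a rough illustration of the kind of guard involved, the following standalone sketch returns early for empty inputs and relies on loop bounds so that a single-column matrix never touches a trailing block. It is not Eigen's code: `lu_partial_pivot` and its column-major buffer layout are assumptions made for this example.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical helper, not Eigen's implementation: in-place LU with partial
// pivoting on a column-major rows x cols buffer. Returns the row permutation.
std::vector<std::size_t> lu_partial_pivot(std::vector<double>& a,
                                          std::size_t rows, std::size_t cols) {
  assert(a.size() == rows * cols);
  std::vector<std::size_t> perm(rows);
  for (std::size_t i = 0; i < rows; ++i) perm[i] = i;

  // Guard: nothing to factor, and a.data() may be null for an empty buffer.
  if (rows == 0 || cols == 0) return perm;

  const std::size_t steps = rows < cols ? rows : cols;
  for (std::size_t k = 0; k < steps; ++k) {
    // Pick the entry of largest magnitude in column k, rows k..rows-1.
    std::size_t p = k;
    for (std::size_t i = k + 1; i < rows; ++i)
      if (std::fabs(a[i + k * rows]) > std::fabs(a[p + k * rows])) p = i;
    if (p != k) {
      for (std::size_t j = 0; j < cols; ++j)
        std::swap(a[k + j * rows], a[p + j * rows]);
      std::swap(perm[k], perm[p]);
    }
    const double pivot = a[k + k * rows];
    if (pivot == 0.0) continue;  // column is entirely zero below the diagonal
    for (std::size_t i = k + 1; i < rows; ++i) a[i + k * rows] /= pivot;
    // Trailing update; this loop does not run at all when cols == 1.
    for (std::size_t j = k + 1; j < cols; ++j)
      for (std::size_t i = k + 1; i < rows; ++i)
        a[i + j * rows] -= a[i + k * rows] * a[k + j * rows];
  }
  return perm;
}
```

Calling it with `rows == 0` or `cols == 0` never indexes the buffer, and a 3x1 input only performs the pivot search and the column scaling.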
| |
We can't make guarantees on alignment for existing calls to `pset`,
so we should default to loading unaligned. But in that case, we should
just use `ploadu` directly. For loading constants, this load should hopefully
get optimized away.
This is causing segfaults in Google Maps.
| |
optimization changing sign with `-ffast-math` enabled.
| |
Also, remove unnecessary `pgather` operations.
| |
innerStride(), outerStride(), and size()""
This reverts commit 5f0b4a4010af4cbf6161a0d1a03a747addc44a5d.
| |
innerStride(), outerStride(), and size()"
This reverts commit 6cbb3038ac48cb5fe17eba4dfbf26e3e798041f1 because it
breaks clang-10 builds on x86 and aarch64 when C++11 is enabled.
| |
outerStride(), and size()
| |
Causing build breakages due to `-Wnewline-eof -Werror` that seems to be
common across Google.
| |
using PacketMath.
| |
Packet2d is not supported.
| |
Implemented fast size-4 matrix inverse (mimicking Inverse_SSE.h) using NEON intrinsics.
```
Benchmark      Time      CPU  Time Old  Time New  CPU Old  CPU New
------------------------------------------------------------------
BM_float    -0.1285  -0.1275       568       495      572      499
BM_double   -0.2265  -0.2254       638       494      641      496
```
| |
a /= b does a/b and not a * (1/b) as it was a long time ago...)
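To see why the two forms are not interchangeable, here is a small standalone sketch (the function names are made up for this example). `a / b` rounds once, while `a * (1/b)` rounds the reciprocal and then the product, so the results can differ in the last bit.

```cpp
#include <cassert>

// Single correctly rounded division.
double divide_exact(double a, double b) {
  assert(b != 0.0);
  return a / b;
}

// Two roundings: first the reciprocal, then the product.
double divide_via_reciprocal(double a, double b) {
  assert(b != 0.0);
  const double inv = 1.0 / b;
  return a * inv;
}
```

With `a = b = 49.0`, the exact division yields `1.0`, whereas the reciprocal form is only guaranteed to land within a rounding error of it. This assumes IEEE 754 double arithmetic and no value-changing optimizations such as `-ffast-math`.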
| |
Not sure that's so critical, but this does not complicate the code base much.
| |
available.
This change set also makes better use of Map<>+OuterStride and Ref<>, yielding a surprising speedup for small dynamic sizes as well.
The table below reports times in microseconds for 10 random matrices:
        | ------ float ------- | ------ double ------ |
size    | before  after  ratio | before  after  ratio |
fixed 1 |   0.34   0.11   2.93 |   0.35   0.11   3.06 |
fixed 2 |   0.81   0.24   3.38 |   0.91   0.25   3.60 |
fixed 3 |   1.49   0.49   3.04 |   1.68   0.55   3.01 |
fixed 4 |   2.31   0.70   3.28 |   2.45   1.08   2.27 |
fixed 5 |   3.49   1.11   3.13 |   3.84   2.24   1.71 |
fixed 6 |   4.76   1.64   2.88 |   4.87   2.84   1.71 |
dyn   1 |   0.50   0.40   1.23 |   0.51   0.40   1.26 |
dyn   2 |   1.08   0.85   1.27 |   1.04   0.69   1.49 |
dyn   3 |   1.76   1.26   1.40 |   1.84   1.14   1.60 |
dyn   4 |   2.57   1.75   1.46 |   2.67   1.66   1.60 |
dyn   5 |   3.80   2.64   1.43 |   4.00   2.48   1.61 |
dyn   6 |   5.06   3.43   1.47 |   5.15   3.21   1.60 |
| |
This changeset also includes:
* add HouseholderSequence::conjugateIf
* define int as the StorageIndex type for all dense solvers
* dedicated unit tests, including assertion checking
* _check_solve_assertion(): this method can be implemented in derived solver classes to implement custom checks
* CompleteOrthogonalDecomposition: add applyZOnTheLeftInPlace, fix scalar type in applyZAdjointOnTheLeftInPlace(), add missing assertions
* Cholesky: add missing assertions
* FullPivHouseholderQR: Corrected Scalar type in _solve_impl()
* BDCSVD: Unambiguous return type for ternary operator
* SVDBase: Corrected Scalar type in _solve_impl()
| |
and make PartialPivLU use it.
| |
differentiation in Ceres breaks because Scalar is a custom type that does not support multiplication by Index.
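The failure mode can be reproduced without Eigen or Ceres. In this sketch, `Dual` is a hypothetical stand-in for an autodiff scalar that only defines `Dual * Dual`, `Index` is assumed to be `std::ptrdiff_t`, and `scale_by_count` is an illustrative helper. Converting the count to `Scalar` first avoids the missing `Index * Scalar` overload.

```cpp
#include <cassert>
#include <cstddef>

using Index = std::ptrdiff_t;  // assumption: mirrors a signed index type

// Hypothetical autodiff-style scalar: carries a value and one derivative.
struct Dual {
  double value;
  double deriv;
  explicit Dual(double v = 0.0, double d = 0.0) : value(v), deriv(d) {}
};

// Only Dual * Dual is provided; there is no Index * Dual overload.
Dual operator*(const Dual& a, const Dual& b) {
  return Dual(a.value * b.value, a.value * b.deriv + a.deriv * b.value);
}

template <typename Scalar>
Scalar scale_by_count(const Scalar& x, Index n) {
  assert(n >= 0);
  // `return n * x;` would fail to compile for Scalar = Dual.
  // Converting the count to Scalar first works for any such scalar type.
  return Scalar(static_cast<double>(n)) * x;
}
```

The same template still works for built-in types such as `double`, since the conversion is then a no-op cast.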
| |
Found using `codespell` and `grep` from downstream FreeCAD
| |
* they're used consistently between the declaration and the definition of a function
* we avoid calling host-only methods from host-device methods.
| |
resizing of the destination after the creation of the evaluator of the source expression.
| |
partialPivLu(), and inverse() functions since they aren't ready to run on GPU
| |
00c29c2caef8fb0c6b1d2ba5ecdf6780c0c766d4
| |
(those used to break old nvcc versions that we probably don't care about anymore)
| |
install(DIRECTORY ...) command.
| |
decompositions.
| |
- Replace internal::scalar_product_traits<A,B> by Eigen::ScalarBinaryOpTraits<A,B,OP>
- Remove the "functor_is_product_like" helper (was pretty ugly)
- Currently, OP is not used, but it is available to the user for fine-grained tuning
- Currently, only the following operators have been generalized: *,/,+,-,=,*=,/=,+=,-=
- TODO: generalize all other binary operators (comparisons, pow, etc.)
- TODO: handle "scalar op array" operators (currently only * is handled)
- TODO: move the handling of the "void" scalar type to ScalarBinaryOpTraits
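The general shape of such a traits class can be sketched as follows. This is a simplified standalone illustration in a hypothetical `sketch` namespace, not Eigen's actual definition: the tag types `product_op` and `sum_op` and the helper `multiply` are assumptions made for the example, and only the `ReturnType` member is modeled.

```cpp
#include <complex>

namespace sketch {

// Placeholder operator tags standing in for the OP template argument.
struct product_op {};
struct sum_op {};

// Primary template is left undefined: mixing unrelated scalar types is a
// compile-time error unless a specialization says what the result is.
template <typename A, typename B, typename Op = product_op>
struct ScalarBinaryOpTraits;

// Same type on both sides: the result is that type, whatever the operator.
template <typename T, typename Op>
struct ScalarBinaryOpTraits<T, T, Op> {
  using ReturnType = T;
};

// Real mixed with complex promotes to complex, in either order.
template <typename T, typename Op>
struct ScalarBinaryOpTraits<T, std::complex<T>, Op> {
  using ReturnType = std::complex<T>;
};
template <typename T, typename Op>
struct ScalarBinaryOpTraits<std::complex<T>, T, Op> {
  using ReturnType = std::complex<T>;
};

// Example consumer: the return type is looked up through the traits.
template <typename A, typename B>
typename ScalarBinaryOpTraits<A, B>::ReturnType multiply(const A& a,
                                                         const B& b) {
  return a * b;
}

}  // namespace sketch
```

A user-defined scalar would opt in by adding its own specialization, and because OP is a template parameter, a specialization could in principle pick a different result type per operator.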
| |
This also fixes some long to float conversion warnings
| |
* Get rid of code duplication for real vs. complex matrices.
* Fix flipped arguments to select.
* Make the condition estimation functions free functions.
* Use Vector::Unit() to generate canonical unit vectors.
* Misc. cleanup.
|