path: root/Eigen/src/Core/util/BlasUtil.h
Commit message [Author, Age]
* Get rid of code duplication for conj_helper. [Rasmus Munk Larsen, 2021-06-24]
|   For packets where LhsType=RhsType a single generic implementation suffices.
|   For scalars, the generic implementation of pconj automatically forwards to
|   numext::conj, so much of the existing specialization can be avoided.
|   For mixed types we still need specializations.
* Eliminate boolean product warnings by factoring out a `combine_scalar_factors` helper function. [Christoph Hertzberg, 2021-01-05]
* MatrixProduct enhancements: [Everton Constantino, 2020-09-02]
|   - Changes to Altivec/MatrixProduct
|     Adapting code to gcc 10.
|     Generic code style and performance enhancements.
|     Adding PanelMode support.
|     Adding stride/offset support.
|     Enabling float64, std::complex and std::complex.
|     Fixing lack of symm_pack.
|     Enabling mixedtypes.
|   - Adding std::complex tests to blasutil.
|   - Adding an implementation of storePacketBlock when Incr != 1.
* - Vectorizing MMA packing. [Everton Constantino, 2020-05-19]
|   - Optimizing MMA kernel.
|   - Adding PacketBlock store to blas_data_mapper.
* bug #1741: fix C.noalias() = A*C; with C.innerStride()!=1 [Gael Guennebaud, 2019-09-10]
|
* Commas at the end of enumerator lists are not allowed in C++03 [Christoph Hertzberg, 2019-02-19]
|
* GEMM: catch all scalar-multiple variants when falling-back to a coeff-based product. [Gael Guennebaud, 2019-02-18]
|   Before only s*A*B was caught which was both inconsistent with GEMM,
|   sub-optimal, and could even lead to compilation-errors
|   (https://stackoverflow.com/questions/54738495).
* Fix gebp kernel for real+complex in case only reals are vectorized (e.g., AVX512). [Gael Guennebaud, 2018-09-20]
|   This commit also removes "half-packet" from data-mappers: it was not used
|   and conceptually broken anyways.
* Extend CUDA support to matrix inversion and selfadjointeigensolver [Andrea Bocci, 2018-06-11]
|
* Clean debugging code [Gael Guennebaud, 2016-12-05]
|
* Complete rewrite of column-major-matrix * vector product to deliver higher performance on modern CPUs. [Gael Guennebaud, 2016-12-03]
|   The previous code has been optimized for Intel core2 for which unaligned
|   loads/stores were prohibitively expensive. This new version exhibits much
|   higher instruction independence (better pipelining) and explicitly
|   leverages FMA. According to my benchmark, on Haswell this new kernel is
|   always faster than the previous one, and sometimes even twice as fast.
|   Even higher performance could be achieved with a better blocking size
|   heuristic and, perhaps, with explicit prefetching. We should also check
|   triangular product/solve to optimally exploit this new kernel (working on
|   vertical panel of 4 columns is probably not optimal anymore).
* Add generic implementation of conj_helper for custom complex types. [Gael Guennebaud, 2016-08-29]
|
* Fix compilation in check_for_aliasing due to ambiguous specializations [Gael Guennebaud, 2016-08-23]
|
* Permits call to explicit ctor. [Gael Guennebaud, 2016-07-18]
|
* Cleanup unused functors. [Gael Guennebaud, 2016-06-14]
|
* Implement scalar multiples and division by a scalar as a binary-expression with a constant expression. [Gael Guennebaud, 2016-06-14]
|   This slightly complexifies the type of the expressions and implies that we
|   now have to distinguish between scalar*expr and expr*scalar to catch
|   scalar-multiple expression (e.g., see BlasUtil.h), but this brings several
|   advantages:
|   - it makes it clear on which side the scalar is applied,
|   - it clearly reflects that we are dealing with a binary-expression,
|   - the complexity of the type is hidden through macros defined at the end of Macros.h,
|   - distinguishing between "scalar op expr" and "expr op scalar" is important
|     to support non-commutative fields (like quaternions)
|   - "scalar op expr" is now fully equivalent to "ConstantExpr(scalar) op expr"
|   - scalar_multiple_op, scalar_quotient1_op and scalar_quotient2_op are not
|     used anymore in officially supported modules (still used in Tensor)
* Introduce internal's UIntPtr and IntPtr types for pointer to integer conversions. [Gael Guennebaud, 2016-05-26]
|   This fixes "conversion from pointer to same-sized integral type" warnings
|   by ICC. Ideally, we would use the std::[u]intptr_t types all the time, but
|   since they are C99/C++11 only, let's be safe.
* Made the blas utils usable from within a cuda kernel [Benoit Steiner, 2016-01-11]
|
* Generalize first_aligned to take the requested alignment as a template parameter, and add a first_default_aligned variant calling first_aligned with the requirement of the largest packet for the given scalar type. [Gael Guennebaud, 2015-08-06]
* bug #923: fix EIGEN_USE_BLAS mode [Gael Guennebaud, 2015-06-23]
|
* Fix MSVC compilation: aligned type must be passed by reference [Gael Guennebaud, 2015-03-19]
|
* Packet must be passed by const reference and not by value to avoid alignment issue. [Gael Guennebaud, 2015-02-17]
* Pulled the latest changes from the trunk [Benoit Steiner, 2015-02-06]
|\
* | Generalized the matrix vector product code. [Benoit Steiner, 2014-10-31]
| |
* | Generalized the gebp apis [Benoit Steiner, 2014-10-02]
| |
| * Make constructors explicit if they could lead to unintended implicit conversion [Christoph Hertzberg, 2014-09-23]
|/
* merge with main branch [Gael Guennebaud, 2013-07-17]
|\
| * Fix bug #314: move remaining math functions from internal to numext namespace [Gael Guennebaud, 2013-06-10]
| |
* | Add nvcc support for normalize, initializers, and fuzzy comparisons [Gael Guennebaud, 2013-06-05]
|/
* Automatic relicensing to MPL2 using Keirs script. Manual fixup follows. [Benoit Jacob, 2012-07-13]
|
* Get rid of include directives inside namespace blocks (bug #339). [Jitse Niesen, 2012-04-15]
|
* fix conjugation in packet_lhs [Gael Guennebaud, 2012-02-05]
|
* fix several const qualifier issues: double ones, meaningless ones, some missing ones, etc. [Gael Guennebaud, 2012-02-03]
|   (note that const qualifiers are set by internal::nested)
* fix static inline versus inline static issues (the former is the correct order) [Gael Guennebaud, 2012-01-31]
|
* Intel(R) MKL support added. [karturov, 2011-12-05]
|   * * *
|   License disclaimer changed to BSD license for MKL_support.h
|   * * *
|   Pardiso support fixed, test added. blas/lapack tests fixed: Scalar
|   parameter was added in Cholesky, product_matrix_vector_triangular renamed
|   to triangular_matrix_vector_product.
|   * * *
|   PARDISO test was added physically.
* now gemv supports strides [Gael Guennebaud, 2011-01-30]
|
* third pass of const-correctness fixes (bug #54), hopefully the last one... [Benoit Jacob, 2011-01-07]
|
* bug #54 - really fix const correctness except in Sparse [Benoit Jacob, 2010-12-22]
|
* Initial fixes for bug #85. [Hauke Heibel, 2010-10-25]
|   Renamed meta_{true|false} to {true|false}_type, meta_if to conditional,
|   is_same_type to is_same, un{ref|pointer|const} to
|   remove_{reference|pointer|const} and makeconst to add_const.
|   Changed boolean type 'ret' member to 'value'.
|   Changed 'ret' members referring to types to 'type'.
|   Adapted all code occurrences.
* bug #86 : use internal:: namespace instead of ei_ prefix [Benoit Jacob, 2010-10-25]
|
* * fix SelfCwiseBinaryOp traits and handling of mixed types [Gael Guennebaud, 2010-07-19]
|   * improve compilation error in case of type mismatch
* * fix a couple of remaining issues with previous commit, [Gael Guennebaud, 2010-07-19]
|   * merge ei_product_blocking_traits into ei_gepb_traits
* wip: extend the gebp kernel to optimize complex and mixed products [Gael Guennebaud, 2010-07-19]
|
* mixing types step 3: [Gael Guennebaud, 2010-07-11]
|   - improve support of colmajor by vector and matrix - matrix
|   - now all configurations are well handled, but the perf are not always very good
* mixing types in product step 2: [Gael Guennebaud, 2010-07-11]
|   * pload* and pset1 are now templated on the packet type
|   * gemv routines are now embedded into a structure with a consistent API with respect to gemm
|   * some configurations of vector * matrix and matrix * matrix work fine, some need more work...
* sync [Gael Guennebaud, 2010-07-10]
|\
| * * generalize rowmajor by vector [Gael Guennebaud, 2010-07-10]
| |   * fix weird compilation error when constructing a matrix with a row by matrix product
| * move ei_conj_if to a more appropriate file [Gael Guennebaud, 2010-07-09]
| |
* | support for real * complex matrix product - step 1 (works for some special cases) [Gael Guennebaud, 2010-07-07]
|/
* add support for vectorized conjugated products [Gael Guennebaud, 2010-07-06]
|