Commit message | Author | Age | ||
---|---|---|---|---|
... | ||||
* | bug fix: forgot to conjugate the scalar factor when needed | 2010-07-06 | ||
| | ||||
* | reduce code generation and minor speed up | 2010-07-06 | ||
| | ||||
* | * extend the Has* packet traits and make all functors use it; * extend the packing routines to support conjugation | 2010-07-05 | ||
| | ||||
* | fix openmp for row major destination | 2010-07-03 | ||
| | ||||
* | fix bug with openmp | 2010-07-03 | ||
| | ||||
* | s/struct/class/g ; bug reported by Thomas | 2010-06-30 | ||
| | ||||
* | fix block size computation for fixed size objects | 2010-06-25 | ||
| | ||||
* | email change | 2010-06-24 | ||
| | ||||
* | - add a low level mechanism to provide preallocated memory to gemm; - ensure static allocation for the product of "large" fixed size matrices | 2010-06-24 | ||
| | ||||
* | * make all products use the new API to set the blocking sizes; * fix an issue preventing multithreading (now Dynamic = -1 ...) | 2010-06-22 | ||
| | ||||
* | simplify and optimize block sizes computation for matrix products. They are now automatically computed from the L1 and L2 cache sizes, which are themselves automatically determined at runtime. | 2010-06-21 | ||
| | ||||
* | add runtime API to control multithreading | 2010-06-10 | ||
| | ||||
* | make the cache size mechanism future proof by adding level 2 parameters | 2010-06-10 | ||
| | ||||
* | Made the suppression of unused variables portable. EIGEN_UNUSED is not supported on non GCC systems. | 2010-06-08 | ||
| | ||||
* | remove ei_ prefix of public global functions, and s/cpu/l1 | 2010-06-07 | ||
| | ||||
* | Add a proof of concept API to configure the blocking parameters at runtime. After validation of the final API I'll update the other products to use it. | 2010-06-07 | ||
| | ||||
* | clean old stuff used to support precompilation inside a binary lib | 2010-06-07 | ||
| | ||||
* | the Index types change. As discussed on the list (too long to explain here). | 2010-05-30 | ||
| | ||||
* | clang chokes on this. According to people on #llvm, this is indeed not allowed by the C++ standard: [01:33] <coppro> what good would mutable do on a reference? [01:33] <dgregor> orzel: gcc is wrong to allow "mutable" on references [01:33] <coppro> just remove mutable; it won't damage the code at all [01:34] <dgregor> "The mutable specifier can be applied only to names of class data members (9.2) and cannot be applied to [01:34] <dgregor> names declared const or static, and cannot be applied to reference members." [01:34] <coppro> constness is not passed from an object to the referents of its members anyways | 2010-05-21 | ||
| | ||||
* | stride() => inner/outerStride() | 2010-03-06 | ||
| | ||||
* | fix openmp version for scalar types different than float | 2010-03-05 | ||
| | ||||
* | remove the 1D and 2D parallelizer, keep only the GEMM specialized one | 2010-03-05 | ||
| | ||||
* | minor cleaning | 2010-03-05 | ||
| | ||||
* | remove Qt's atomic dependency, I don't know what I was doing wrong... | 2010-03-01 | ||
| | ||||
* | GEMM: move the first packing of A' before the packing of B' | 2010-03-01 | ||
| | ||||
* | make Aron's idea work using Qt's atomic implementation for the synchronisation | 2010-03-01 | ||
| | ||||
* | implement Aron's idea of interleaving the packing with the first computations | 2010-02-26 | ||
| | ||||
* | fix compilation without openmp | 2010-02-26 | ||
| | ||||
* | implement a smarter parallelization strategy for gemm avoiding multiple packing of the same data | 2010-02-26 | ||
| | ||||
* | add a 2D parallelizer | 2010-02-23 | ||
| | ||||
* | fix typo | 2010-02-23 | ||
| | ||||
* | significant speedup in the matrix-matrix products | 2010-02-23 | ||
| | ||||
* | clean a bit the parallelizer | 2010-02-22 | ||
| | ||||
* | add initial openmp support for matrix-matrix products => x1.9 speedup on my core2 duo | 2010-02-22 | ||
| | ||||
* | update mixingtype unit test to reflect current status, but it is still clear we should allow matrix products between complex and real ? | 2009-09-03 | ||
| | ||||
* | Fix serious bug discovered with gcc 4.2 | 2009-09-03 | ||
| | ||||
* | overload operator* with a ProductBase such that "scalar * (mat * mat)" is optimized as one could naturally expect | 2009-08-11 | ||
| | ||||
* | more product refactoring | 2009-08-06 | ||
| | ||||
* | fix vs.net compilation issue | 2009-08-06 | ||
| | ||||
* | add explicit "on the right" triangular solving => no temporary when the rhs/unknowns is row major | 2009-07-30 | ||
| | ||||
* | trmm is now working in all storage order configurations | 2009-07-27 | ||
| | ||||
* | add WIP trsm | 2009-07-24 | ||
| | ||||
* | some cleaning | 2009-07-24 | ||
| | ||||
* | Implement efficient selfadjoint product (aka SYRK): C += alpha * U U^T. It is currently available via SelfAdjointView::rankKupdate. TODO: allow writing SelfAdjointView += u * u.adjoint() | 2009-07-23 | ||
| | ||||
* | implement high level API for SYMM and fix a couple of bugs related to complex | 2009-07-22 | ||
| | ||||
* | * GEMM enhancement: no need to pre-transpose the rhs => faster a * b.transpose() product => this also fixes a bug in a so far untested situation; * SYMM is now ready for use => still have to write the high level stuff to convert natural expressions into a call to SYMM | 2009-07-22 | ||
| | ||||
* | more refactoring in the level3 products | 2009-07-22 | ||
| | ||||
* | * refactoring of the matrix product into multiple small kernels; * started an efficient selfadjoint matrix * general matrix product based on the generic kernels (=> needs very little LOC) | 2009-07-21 | ||
| | ||||
* | Add an efficient rank2 update function (like the level2 blas xSYR2 routine). Note that it is already used in Tridiagonalization. | 2009-07-11 | ||
| | ||||
* | finally directly calling the low-level products is faster | 2009-07-10 | ||
| |