eigen - C++ library for linear algebra

	Commit message	Author	Age
*	Re-enable detection of min/max parentheses protection, and re-enable ↵	Gael Guennebaud	2015-02-27
\| \| \| \|	mpreal_support unit test.
*	Reimplement the selection between rotating and non-rotating kernels	Benoit Jacob	2015-02-27
\| \| \| \| \| \|	using templates instead of macros and if()'s. That was needed to fix the build of unit tests on ARM, which I had broken. My bad for not testing earlier.
*	Pulled latest updates from trunk	Benoit Steiner	2015-02-27
\|\
* \|	Fixed off-by-one error that prevented the evaluation of small tensor ↵	Benoit Steiner	2015-02-27
\| \| \| \| \| \| \| \|	expressions from being vectorized
\| *	remove trailing comma	Benoit Jacob	2015-02-27
\| \|
\| *	Disable Packet2f/2i halfpacket support in NEON.	Benoit Jacob	2015-02-27
\| \| \| \| \| \| \| \| \| \| \| \|	I believe that it was erroneously turned on, since Packet2f/2i intrinsics are unimplemented, and code trying to use halfpackets just fails to compile on NEON, as it tries to use the default implementation of pload/pstore and the types don't match.
\| *	Fix NEON build flags: in the current NDK, at least with the clang-3.5 toolchain,	Benoit Jacob	2015-02-27
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	-mfpu=neon is not enough to activate NEON, since it's incompatible with the default float ABI, and I have to pass -mfloat-abi=softfp (which is what everyone does in practice). In fact, it would be a good idea to pass -mfloat-abi=softfp all the time, regardless of NEON. Also removing the -mcpu=cortex-a8, as 1) it's not needed and 2) if we really wanted to pass a specific -mcpu flag, that would presumably be to tune performance for benchmarks, and it would then not really make sense to tune for the very old cortex-a8 (it reflects ARM CPUs from 5 years ago).
\| *	Replace a static assert by a runtime one, fixes the build of unit tests on ARM	Benoit Jacob	2015-02-27
\|/ \| \| \| \|	Also safely assert in the non-implemented path that should never be taken in practice, and would return wrong results.
*	Fixed another compilation problem with TensorIntDiv.h	Benoit Steiner	2015-02-26
\|
*	Can now use the tensor 'reverse' operation as a lvalue	Benoit Steiner	2015-02-26
\|
*	Added missing copy constructor	Benoit Steiner	2015-02-26
\|
*	Avoid packing rhs multiple times when blocking on the lhs only.	Gael Guennebaud	2015-02-26
\|
*	Make sure that the block size computation is tested by our unit test.	Gael Guennebaud	2015-02-26
\|
*	Update changeset list to be checked by perf_monitoring/gemm.	Gael Guennebaud	2015-02-26
\|
*	Make perf_monitoring/gemm script more flexible:	Gael Guennebaud	2015-02-26
\| \| \| \| \| \|	- skip existing dataset - add a "-up" option to recompute the dataset (see script header) - allow to specify a filename prefix
*	Implement a more generic blocking-size selection algorithm. See explanations ↵	Gael Guennebaud	2015-02-26
\| \| \| \| \| \| \|	inline. It performs extremely well on Haswell. The main issue is to reliably and quickly find the actual cache size to be used for our 2nd level of blocking, that is: max(l2,l3/nb_core_sharing_l3)
*	Fix typos in block-size testing code, and set peeling on k to 8.	Gael Guennebaud	2015-02-26
\|
*	Made TensorIntDiv.h compile with MSVC	Benoit Steiner	2015-02-25
\|
*	Fixed another clang warning	Benoit Steiner	2015-02-25
\|
*	Fixed several compilation warnings reported by clang	Benoit Steiner	2015-02-25
\|
*	Silenced a few more compilation warnings generated by nvcc	Benoit Steiner	2015-02-25
\|
*	Added more tests to validate support for tensors laid out in RowMajor order.	Benoit Steiner	2015-02-25
\|
*	Added support for RowMajor layout to the tensor patch extraction code.	Benoit Steiner	2015-02-25
\|
*	Pulled latest changes from trunk	Benoit Steiner	2015-02-25
\|\
* \|	Added support for RowMajor layout to the image patch extraction code	Benoit Steiner	2015-02-25
\| \| \| \| \| \| \| \|	Speeded up the unsupported_cxx11_tensor_image_patch test and reduced its memory footprint
\| *	So I extensively measured the impact of the offset in this prefetch. I tried ↵	Benoit Jacob	2015-02-25
\|/ \| \| \| \| \| \| \| \| \| \| \| \| \|	offset values from 0 to 128 (on this float* pointer, so implicitly times 4 bytes). On x86, I tested a Sandy Bridge with AVX with 12M cache and a Haswell with AVX+FMA with 6M cache on MatrixXf sizes up to 2400. I could not see any significant impact of this offset. On Nexus 5, the offset has a slight effect: values around 32 (times sizeof float) are worst. Anything else is the same: the current 64 (8*pk), or... 0. So let's just go with 0! Note that we needed a fix anyway for not accounting for the value of RhsProgress. 0 nicely avoids the issue altogether!
*	bug #970: Add EIGEN_DEVICE_FUNC to RValue functions, in case Cuda supports ↵	Christoph Hertzberg	2015-02-24
\| \| \| \|	RValue-references.
*	Fix my recent prefetch changes:	Benoit Jacob	2015-02-23
\| \| \| \| \| \| \| \| \| \| \|	- the first prefetch is actually harmful on Haswell with FMA, but it is the most beneficial on ARM. - the second prefetch... I was very stupid and multiplied by sizeof(scalar) an offset of a scalar* pointer. The old offset was 64; pk = 8, so 64=pk*8. So this effectively restores the older offset. Actually, there were two prefetches here, one with offset 48 and one with offset 64. I could not confirm any benefit from this strange 48 offset on either the Haswell or my ARM device.
*	Add analyze-blocking-sizes program under bench/ to analyze multiple logs	Benoit Jacob	2015-02-23
\| \| \| \|	generated by benchmark-blocking-sizes.
*	Fix two trivial warnings	Christoph Hertzberg	2015-02-22
\|
*	log1p is defined only for real Scalars in C++11	Christoph Hertzberg	2015-02-21
\|
*	I cannot reproduce any problems that justified this hack. However it makes ↵	Christoph Hertzberg	2015-02-21
\| \| \| \|	builds fail in C++11 mode.
*	Fix compilation of unit tests disabling assertion checking	Gael Guennebaud	2015-02-21
\|
*	Add benchmark-blocking-sizes.cpp to bench/ per mailing list discussion.	Benoit Jacob	2015-02-20
\|
*	Initial version of a small script to help tracking performance regressions	Gael Guennebaud	2015-02-20
\|
*	update bench_gemm	Gael Guennebaud	2015-02-20
\|
*	Fix doc of Ref<>	Gael Guennebaud	2015-02-20
\|
*	With C++11 Matrix<float> + Matrix<complex<float>> does not even compile	Gael Guennebaud	2015-02-20
\|
*	Remove EIGEN_TEST_C++0x option and let EIGEN_TEST_CXX11 add the -std=c++11 flag	Gael Guennebaud	2015-02-20
\|
*	In C++11 destructors do not throw by default (fix CommaInitializer unit test)	Gael Guennebaud	2015-02-20
\|
*	Pulled latest changes from trunk	Benoit Steiner	2015-02-19
\|\
* \|	Marked the CUDA packet primitives as EIGEN_DEVICE_FUNC since they'll end up ↵	Benoit Steiner	2015-02-19
\| \| \| \| \| \| \| \|	being executed on the GPU device.
\| *	Fix regression with C++11 support of lambda: now internal::result_of falls ↵	Gael Guennebaud	2015-02-19
\| \| \| \| \| \| \| \|	back to std::result_of in C++11.
\| *	Fix a C++11 compilation issue in unit test	Gael Guennebaud	2015-02-19
\| \|
\| *	Fix some calls to result_of on binary functors as unary ones.	Gael Guennebaud	2015-02-19
\| \|
\| *	Declare const some const variables	Gael Guennebaud	2015-02-19
\|/
*	Pulled latest updates from trunk	Benoit Steiner	2015-02-19
\|\
* \|	Improved the documentation	Benoit Steiner	2015-02-19
\| \|
\| *	Add support for C++11 result_of/lambdas	Gael Guennebaud	2015-02-19
\| \|
\| *	rotating kernel: avoid compiling anything outside of ARM	Benoit Jacob	2015-02-18
\| \|
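
The 2015-02-26 entry "Implement a more generic blocking-size selection algorithm" gives the cache size used for the second level of blocking as max(l2,l3/nb_core_sharing_l3). The sketch below only illustrates that arithmetic. The function name second_level_cache_budget and the clamping of the core count to at least 1 are assumptions made for this example; they are not taken from Eigen's sources, and how Eigen actually queries the cache sizes is not shown here.

```cpp
#include <algorithm>
#include <cstddef>

// Hypothetical helper (not part of Eigen's API): cache budget in bytes for
// the second level of blocking, computed as max(l2, l3 / nb_core_sharing_l3).
inline std::ptrdiff_t second_level_cache_budget(std::ptrdiff_t l2,
                                                std::ptrdiff_t l3,
                                                std::ptrdiff_t nb_core_sharing_l3)
{
  // Assumption for this sketch: a core count below 1 is treated as 1
  // so that the division is always well defined.
  const std::ptrdiff_t cores = std::max<std::ptrdiff_t>(nb_core_sharing_l3, 1);
  // Each core's share of the L3 cache, compared against its private L2.
  return std::max(l2, l3 / cores);
}
```

For instance, with a 256 KiB L2 and a 6 MiB L3 shared by 4 cores, the per-core share of L3 (1.5 MiB) exceeds L2 and is selected; with a 2 MiB L3 shared by 16 cores, the share (128 KiB) is smaller than L2, so L2 is selected instead.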