Commit message  Author  Date
* Made the CUDA architecture level a build setting.  Benoit Steiner  2016-02-25
|
* Fixed a typo in the reduction code that could prevent large full reductions from running properly on old cuda devices.  Benoit Steiner  2016-02-24
|
* Marked the And and Or reducers as stateless.  Benoit Steiner  2016-02-24
|
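The stateless And/Or reducers above are the functors driving boolean tensor reductions. A minimal sketch of what a stateless reducer looks like, with illustrative trait and member names (not necessarily Eigen's exact spelling): all reduction state lives in the caller-supplied accumulator, so the functor can be copied freely, including onto a CUDA device.

    // Illustrative stateless reducer: no member data, so copies are cheap and safe.
    struct AndReducer {
      static const bool IsStateful = false;  // assumed marker consulted by the evaluator
      bool initialize() const { return true; }                         // identity of logical AND
      void reduce(bool t, bool* accum) const { *accum = *accum && t; }
      bool finalize(bool accum) const { return accum; }
    };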
* merge  Gael Guennebaud  2016-02-23
|\
* | Fix startRow()/startCol() for dense Block with direct access: the initial implementation failed for empty rows/columns, for which these offsets are ambiguous.  Gael Guennebaud  2016-02-23
| |
| * Updated the padding code to work with half floats  Benoit Steiner  2016-02-23
| |
| * Extended the tensor benchmark suite to support types other than floats  Benoit Steiner  2016-02-23
| |
| * Updated the tensor benchmarking code to work with compilers that don't support cxx11.  Benoit Steiner  2016-02-23
| |
| * Deleted the coordinate based evaluation of tensor expressions, since it's hardly ever used and started to cause some issues with some versions of xcode.  Benoit Steiner  2016-02-22
| |
| * Declare the half float type as arithmetic.  Benoit Steiner  2016-02-22
| |
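Declaring half as arithmetic plugs it into the library's scalar type checks. A standalone sketch of the idea, using a local trait rather than Eigen's internal one:

    #include <type_traits>

    struct half { unsigned short x; };  // stand-in for Eigen's half type

    // Library-local arithmetic trait, defaulting to the standard one.
    template <typename T> struct is_arithmetic : std::is_arithmetic<T> {};

    // The essence of the commit: opt the half float type into the trait.
    template <> struct is_arithmetic<half> : std::true_type {};

    static_assert(is_arithmetic<half>::value, "half now counts as arithmetic");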
| * include <iostream> in the tensor header since we now use it to better report cuda initialization errors  Benoit Steiner  2016-02-22
| |
| * Fixed compilation warning generated by clang  Benoit Steiner  2016-02-21
| |
| * Implemented the ptranspose function on half floats  Benoit Steiner  2016-02-21
| |
| * Pulled latest updates from trunk  Benoit Steiner  2016-02-21
| |\
| * | Added the ability to compute the absolute value of a half float  Benoit Steiner  2016-02-21
| | |
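Absolute value on an IEEE 754 binary16 value needs no float round-trip; assuming a raw 16-bit storage field named x (as Eigen's half has), it is a single bit mask. A hedged sketch:

    #include <cstdint>

    struct half { std::uint16_t x; };  // binary16 bits, sign in bit 15

    inline half abs(half a) {
      a.x &= 0x7FFF;  // clear the sign bit; exponent and mantissa untouched
      return a;
    }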
| | * Added some debugging information to the test to figure out why it fails sometimes  Benoit Steiner  2016-02-21
| | |
| | * Optimized casting of tensors in the case where the casting happens to be a no-op  Benoit Steiner  2016-02-21
| |/
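The no-op cast optimization above avoids building a conversion when source and target scalar types coincide. A loose standalone illustration of the idea (names and container are mine, not Eigen's):

    #include <type_traits>
    #include <vector>

    template <typename Target, typename Source>
    std::vector<Target> cast_values(const std::vector<Source>& in) {
      if constexpr (std::is_same_v<Target, Source>) {
        return in;                                         // no-op cast: plain copy, no per-element conversion
      } else {
        return std::vector<Target>(in.begin(), in.end());  // genuine element-wise conversion
      }
    }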
| * Prevent unnecessary Index to int conversions  Benoit Steiner  2016-02-21
|/
* Moved some of the fp16 operators outside the Eigen namespace to work around some nvcc limitations.  Benoit Steiner  2016-02-20
|
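The namespace move above likely works around overload resolution trouble: __half lives at global scope, so argument-dependent lookup does not search namespace Eigen for its operators. A sketch of the resulting pattern (the float round-trip is needed because native half arithmetic only exists on compute capability 5.3+):

    #include <cuda_fp16.h>  // CUDA 7.5+: defines __half

    // Declared at global scope, next to __half itself, so nvcc finds it.
    __device__ inline __half operator+(const __half& a, const __half& b) {
      return __float2half(__half2float(a) + __half2float(b));
    }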
* Fixed the float16 tensor test.  Benoit Steiner  2016-02-20
|
* Get rid of duplicate code.  Rasmus Munk Larsen  2016-02-19
|
* Speed up tensor FFT by up to ~25-50%.  Rasmus Munk Larsen  2016-02-19
|
|   Benchmark                             Base (ns)    New (ns)  Improvement
|   ------------------------------------------------------------------------
|   BM_tensor_fft_single_1D_cpu/8               132         134        -1.5%
|   BM_tensor_fft_single_1D_cpu/9              1162        1229        -5.8%
|   BM_tensor_fft_single_1D_cpu/16              199         195        +2.0%
|   BM_tensor_fft_single_1D_cpu/17             2587        2267       +12.4%
|   BM_tensor_fft_single_1D_cpu/32              373         341        +8.6%
|   BM_tensor_fft_single_1D_cpu/33             5922        4879       +17.6%
|   BM_tensor_fft_single_1D_cpu/64              797         675       +15.3%
|   BM_tensor_fft_single_1D_cpu/65            13580       10481       +22.8%
|   BM_tensor_fft_single_1D_cpu/128            1753        1375       +21.6%
|   BM_tensor_fft_single_1D_cpu/129           31426       22789       +27.5%
|   BM_tensor_fft_single_1D_cpu/256            4005        3008       +24.9%
|   BM_tensor_fft_single_1D_cpu/257           70910       49549       +30.1%
|   BM_tensor_fft_single_1D_cpu/512            8989        6524       +27.4%
|   BM_tensor_fft_single_1D_cpu/513          165402      107751       +34.9%
|   BM_tensor_fft_single_1D_cpu/999          198293      115909       +41.5%
|   BM_tensor_fft_single_1D_cpu/1ki           21289       14143       +33.6%
|   BM_tensor_fft_single_1D_cpu/1k           361980      233355       +35.5%
|   BM_tensor_fft_double_1D_cpu/8               138         131        +5.1%
|   BM_tensor_fft_double_1D_cpu/9              1253        1133        +9.6%
|   BM_tensor_fft_double_1D_cpu/16              218         200        +8.3%
|   BM_tensor_fft_double_1D_cpu/17             2770        2392       +13.6%
|   BM_tensor_fft_double_1D_cpu/32              406         368        +9.4%
|   BM_tensor_fft_double_1D_cpu/33             6418        5153       +19.7%
|   BM_tensor_fft_double_1D_cpu/64              856         728       +15.0%
|   BM_tensor_fft_double_1D_cpu/65            14666       11148       +24.0%
|   BM_tensor_fft_double_1D_cpu/128            1913        1502       +21.5%
|   BM_tensor_fft_double_1D_cpu/129           36414       24072       +33.9%
|   BM_tensor_fft_double_1D_cpu/256            4226        3216       +23.9%
|   BM_tensor_fft_double_1D_cpu/257           86638       52059       +39.9%
|   BM_tensor_fft_double_1D_cpu/512            9397        6939       +26.2%
|   BM_tensor_fft_double_1D_cpu/513          203208      114090       +43.9%
|   BM_tensor_fft_double_1D_cpu/999          237841      125583       +47.2%
|   BM_tensor_fft_double_1D_cpu/1ki           20921       15392       +26.4%
|   BM_tensor_fft_double_1D_cpu/1k           455183      250763       +44.9%
|   BM_tensor_fft_single_2D_cpu/8              1051        1005        +4.4%
|   BM_tensor_fft_single_2D_cpu/9             16784       14837       +11.6%
|   BM_tensor_fft_single_2D_cpu/16             4074        3772        +7.4%
|   BM_tensor_fft_single_2D_cpu/17            75802       63884       +15.7%
|   BM_tensor_fft_single_2D_cpu/32            20580       16931       +17.7%
|   BM_tensor_fft_single_2D_cpu/33           345798      278579       +19.4%
|   BM_tensor_fft_single_2D_cpu/64            97548       81237       +16.7%
|   BM_tensor_fft_single_2D_cpu/65          1592701     1227048       +23.0%
|   BM_tensor_fft_single_2D_cpu/128          472318      384303       +18.6%
|   BM_tensor_fft_single_2D_cpu/129         7038351     5445308       +22.6%
|   BM_tensor_fft_single_2D_cpu/256         2309474     1850969       +19.9%
|   BM_tensor_fft_single_2D_cpu/257        31849182    23797538       +25.3%
|   BM_tensor_fft_single_2D_cpu/512        10395194     8077499       +22.3%
|   BM_tensor_fft_single_2D_cpu/513       144053843   104242541       +27.6%
|   BM_tensor_fft_single_2D_cpu/999       279885833   208389718       +25.5%
|   BM_tensor_fft_single_2D_cpu/1ki        45967677    36070985       +21.5%
|   BM_tensor_fft_single_2D_cpu/1k        619727095   456489500       +26.3%
|   BM_tensor_fft_double_2D_cpu/8              1110        1016        +8.5%
|   BM_tensor_fft_double_2D_cpu/9             17957       15768       +12.2%
|   BM_tensor_fft_double_2D_cpu/16             4558        4000       +12.2%
|   BM_tensor_fft_double_2D_cpu/17            79237       66901       +15.6%
|   BM_tensor_fft_double_2D_cpu/32            21494       17699       +17.7%
|   BM_tensor_fft_double_2D_cpu/33           357962      290357       +18.9%
|   BM_tensor_fft_double_2D_cpu/64           105179       87435       +16.9%
|   BM_tensor_fft_double_2D_cpu/65          1617143     1288006       +20.4%
|   BM_tensor_fft_double_2D_cpu/128          512848      419397       +18.2%
|   BM_tensor_fft_double_2D_cpu/129         7271322     5636884       +22.5%
|   BM_tensor_fft_double_2D_cpu/256         2415529     1922032       +20.4%
|   BM_tensor_fft_double_2D_cpu/257        32517952    24462177       +24.8%
|   BM_tensor_fft_double_2D_cpu/512        10724898     8287617       +22.7%
|   BM_tensor_fft_double_2D_cpu/513       146007419   108603266       +25.6%
|   BM_tensor_fft_double_2D_cpu/999       296351330   221885776       +25.1%
|   BM_tensor_fft_double_2D_cpu/1ki        59334166    48357539       +18.5%
|   BM_tensor_fft_double_2D_cpu/1k        666660132   483840349       +27.4%
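For context, a usage sketch of the tensor FFT being benchmarked; the template parameters follow the unsupported Tensor module's FFT API of this period, so treat the exact spellings as assumptions:

    #include <unsupported/Eigen/CXX11/Tensor>
    #include <complex>

    int main() {
      Eigen::Tensor<float, 1> input(513);  // odd, non-power-of-two length: among the most improved cases
      input.setRandom();
      Eigen::array<int, 1> dims = {{0}};   // transform along dimension 0
      Eigen::Tensor<std::complex<float>, 1> spectrum =
          input.fft<Eigen::BothParts, Eigen::FFT_FORWARD>(dims);
      return spectrum.size() == input.size() ? 0 : 1;
    }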
* merge  Gael Guennebaud  2016-02-19
|\
* | Add COD and BDCSVD in list of benched solvers.  Gael Guennebaud  2016-02-19
| |
* | Extend unit test to stress smart_copy with empty input/output.  Gael Guennebaud  2016-02-19
| |
* | bug #1170: skip calls to memcpy/memmove for empty input.  Gael Guennebaud  2016-02-19
| |
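The guard matters because passing a null pointer to memcpy/memmove is undefined behavior even with a byte count of zero, and an empty range's pointers may legitimately be null. A sketch in the spirit of Eigen's internal smart_copy (the real one also dispatches on whether T is trivially copyable):

    #include <cstddef>
    #include <cstring>

    template <typename T>
    void smart_copy(const T* start, const T* end, T* target) {
      if (start == end) return;  // empty input: skip the memcpy call entirely
      std::memcpy(target, start, std::size_t(end - start) * sizeof(T));
    }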
| * Print an error message to stderr when the initialization of the CUDA runtime fails. This helps when debugging setup issues.  Benoit Steiner  2016-02-19
| |
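The kind of diagnostic this adds, shown standalone with the raw CUDA runtime API: surface the runtime's own error string on stderr instead of failing silently.

    #include <cuda_runtime.h>
    #include <cstdio>

    int checked_device_count() {
      int count = 0;
      cudaError_t status = cudaGetDeviceCount(&count);
      if (status != cudaSuccess) {
        std::fprintf(stderr, "CUDA initialization failed: %s\n",
                     cudaGetErrorString(status));
        return 0;
      }
      return count;
    }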
* | Fix nesting type and complete reflection methods of Block expressions.  Gael Guennebaud  2016-02-19
| |
* | Add typedefs for the return type of all block methods.  Gael Guennebaud  2016-02-19
| |
| * Updated the contraction code to make it compatible with half floats.  Benoit Steiner  2016-02-19
| |
| * Added support for tensor reductions on half floats  Benoit Steiner  2016-02-19
| |
| * Implemented the scalar division of 2 half floats  Benoit Steiner  2016-02-19
| |
| * Added the ability to query the minor version of a cuda device  Benoit Steiner  2016-02-19
| |
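Both compute-capability digits come from the same runtime call; a standalone illustration:

    #include <cuda_runtime.h>
    #include <cstdio>

    int main() {
      cudaDeviceProp prop;
      if (cudaGetDeviceProperties(&prop, /*device=*/0) == cudaSuccess) {
        std::printf("compute capability %d.%d\n", prop.major, prop.minor);
      }
      return 0;
    }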
| * Started to work on contractions and reductions using half floats  Benoit Steiner  2016-02-19
| |
| * Don't make the array constructors explicit  Benoit Steiner  2016-02-19
| |
| * Added support for operators +=, -=, *= and /= on CUDA half floats  Benoit Steiner  2016-02-19
| |
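A sketch of the likely pattern, assuming the binary +, -, *, / overloads for the CUDA half type that sibling commits in this series add: compound assignment simply defers to them.

    #include <cuda_fp16.h>

    __device__ inline __half& operator+=(__half& a, const __half& b) { a = a + b; return a; }
    __device__ inline __half& operator-=(__half& a, const __half& b) { a = a - b; return a; }
    __device__ inline __half& operator*=(__half& a, const __half& b) { a = a * b; return a; }
    __device__ inline __half& operator/=(__half& a, const __half& b) { a = a / b; return a; }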
| * Implemented protate() for CUDA  Benoit Steiner  2016-02-19
| |
| * Fixed a bug in the tensor type converter  Benoit Steiner  2016-02-19
| |
| * Added support for simple coefficient wise tensor expression using half floats on CUDA devices  Benoit Steiner  2016-02-19
| |
| * FP16 on CUDA is only available starting with CUDA 7.5. Disable it when using an older version of CUDA  Benoit Steiner  2016-02-18
| |
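A version gate in the spirit of this commit: cuda_fp16.h and __half only ship with CUDA 7.5+, whose CUDA_VERSION macro value is 7050. The EIGEN_HAS_CUDA_FP16 macro name is how later Eigen spells this, so treat the exact spelling here as illustrative.

    #if defined(__CUDACC__) && defined(CUDA_VERSION) && CUDA_VERSION >= 7050
      #define EIGEN_HAS_CUDA_FP16 1
      #include <cuda_fp16.h>  // only safe to include on new enough toolkits
    #endif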
| * Added regression test for float16  Benoit Steiner  2016-02-19
| |
| * Reverted unintended changes introduced by a bad merge  Benoit Steiner  2016-02-19
| |
| * Pulled latest updates from trunk  Benoit Steiner  2016-02-19
| |\
| * | Added preliminary support for half floats on CUDA GPUs. For now we can simply convert floats into half floats and vice versa  Benoit Steiner  2016-02-19
| | |
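The float-to-half conversions described above map directly onto CUDA device intrinsics (available since CUDA 7.5); a standalone kernel sketch:

    #include <cuda_fp16.h>

    __global__ void roundtrip(const float* in, float* out, int n) {
      int i = blockIdx.x * blockDim.x + threadIdx.x;
      if (i < n) {
        __half h = __float2half(in[i]);  // float -> half, round to nearest even
        out[i]   = __half2float(h);      // half -> float, exact
      }
    }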
| * | Improved implementation of ptanh for SSE and AVX  Benoit Steiner  2016-02-18
| | |
| * | Merged eigen/eigen into default  Eugene Brevdo  2016-02-17
| |\ \
| |/ /
|/| |
| * | Tiny bugfix in SpecialFunctions: some compilers don't like doubles implicitly downcast to floats in an array constructor.  Eugene Brevdo  2016-02-17
| | |
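An illustration of the warning class being fixed (the coefficient values here are made up): an unsuffixed literal is a double, and narrowing it inside a braced initializer is ill-formed in C++11 and draws -Wnarrowing/-Wconversion complaints from several compilers.

    // const float coeffs[] = {2.67073782e-01, 3.02391850e-01};    // double -> float: narrowing complaint
    const float coeffs[]    = {2.67073782e-01f, 3.02391850e-01f};  // 'f' suffix keeps the literals float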
* | | bug #1166: fix shortcoming in gemv when the destination is not a vector at compile-time.  Gael Guennebaud  2016-02-15
| | |
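A guess at the shape class bug #1166 covers, for illustration only: a destination that is a vector at run time but not at compile time, such as a dynamic-size block, so the gemv path cannot assume a Vector type.

    #include <Eigen/Dense>

    int main() {
      Eigen::MatrixXd A = Eigen::MatrixXd::Random(4, 4);
      Eigen::MatrixXd B = Eigen::MatrixXd::Random(4, 4);
      Eigen::VectorXd v = Eigen::VectorXd::Random(4);
      B.block(0, 0, 4, 1) = A * v;  // destination block has dynamic rows and cols
      return 0;
    }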
* | | Import wiki's paragraph: "I disabled vectorization, but I'm still getting annoyed about alignment issues"  Gael Guennebaud  2016-02-12
| | |
* | | bug #795: mention allocate_shared as a candidate for aligned_allocator.  Gael Guennebaud  2016-02-12
| | |
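What the documentation addition points at: std::allocate_shared lets shared_ptr use Eigen::aligned_allocator, so the control block and the fixed-size vectorizable object are both allocated with suitable alignment.

    #include <memory>
    #include <Eigen/Dense>

    int main() {
      std::shared_ptr<Eigen::Vector4f> v =
          std::allocate_shared<Eigen::Vector4f>(
              Eigen::aligned_allocator<Eigen::Vector4f>(),
              Eigen::Vector4f::Zero());
      return v->size() == 4 ? 0 : 1;
    }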