Commit message | Age
---|---
* Add TernaryFunctors and the betainc SpecialFunction. TernaryFunctors and their executors allow operations on 3-tuples of inputs. API fully implemented for Arrays and Tensors based on binary functors. Ported the cephes betainc function (regularized incomplete beta integral) to Eigen, with support for CPU and GPU, floats, doubles, and half types. Added unit tests in array.cpp and cxx11_tensor_cuda.cu.<br>Collapsed revision:<br>* Merged helper methods for betainc across floats and doubles.<br>* Added TensorGlobalFunctions with betainc(). Removed betainc() from TensorBase.<br>* Clean up CwiseTernaryOp checks, change igamma_helper to cephes_helper.<br>* betainc: merge incbcf and incbd into incbeta_cfe, and more cleanup.<br>* Update TernaryOp and SpecialFunctions (betainc) based on review comments. | 2016-06-02
* Use array_prod to compute the number of elements contained in the input tensor expression | 2016-06-04
* Merged in ibab/eigen (pull request PR-192): Add generic scan method | 2016-06-03
* Improved the performance of full reductions.<br>AFTER:<br>BM_fullReduction/10 4541 4543 154017 21.0M items/s<br>BM_fullReduction/64 5191 5193 100000 752.5M items/s<br>BM_fullReduction/512 9588 9588 71361 25.5G items/s<br>BM_fullReduction/4k 244314 244281 2863 64.0G items/s<br>BM_fullReduction/5k 359382 359363 1946 64.8G items/s<br>BEFORE:<br>BM_fullReduction/10 9085 9087 74395 10.5M items/s<br>BM_fullReduction/64 9478 9478 72014 412.1M items/s<br>BM_fullReduction/512 14643 14646 46902 16.7G items/s<br>BM_fullReduction/4k 260338 260384 2678 60.0G items/s<br>BM_fullReduction/5k 385076 385178 1818 60.5G items/s | 2016-06-03
* Add generic scan method | 2016-06-03
* Align the first element of the Waiter struct instead of padding it. This reduces its memory footprint a bit while achieving the goal of preventing false sharing. | 2016-06-02
* Add syntactic sugar to Eigen tensors to allow more natural syntax. Specifically, this enables expressions involving: scalar + tensor, scalar * tensor, scalar / tensor, scalar - tensor. | 2016-06-02
* Add tensor scan op. This is the initial implementation of a generic scan operation. Based on this, cumsum and cumprod methods have been added to TensorBase. | 2016-06-02
* Use a single PacketSize variable | 2016-06-01
* Fixed compilation warning | 2016-06-01
* Silenced compilation warning generated by nvcc. | 2016-06-01
* Added support for mean reductions on fp16 | 2016-06-01
* Only enable optimized reductions of fp16 if the reduction functor supports them | 2016-05-31
* Reimplement clamp as a static function. | 2016-05-27
* Use NULL instead of nullptr to preserve compatibility with cxx03 | 2016-05-27
* Added a new operation to enable more powerful tensor indexing. | 2016-05-27
* Fixed some compilation warnings | 2016-05-26
* Preserve the ability to vectorize the evaluation of an expression even when it involves a cast that isn't vectorized (e.g. fp16 to float) | 2016-05-26
* Resolved merge conflicts | 2016-05-26
* Merged latest reduction improvements | 2016-05-26
* Improved the performance of inner reductions. | 2016-05-26
* Code cleanup. | 2016-05-26
* Made the static storage class qualifier come first. | 2016-05-25
* Deleted unnecessary explicit qualifiers. | 2016-05-25
* Don't mark inline functions as static since it confuses the ICC compiler | 2016-05-25
* Marked unused variables as such | 2016-05-25
* Made the IndexPair code compile in non-cxx11 mode | 2016-05-25
* Made the index pair list code more portable across various compilers | 2016-05-25
* Improved the performance of tensor padding | 2016-05-25
* Added support for statically known lists of pairs of indices | 2016-05-25
* There is no need to make the fp16 full reduction kernel a static function. | 2016-05-24
* Fixed compilation warning | 2016-05-24
* Merged in rmlarsen/eigen (pull request PR-188): Minor cleanups: 1. Get rid of a few unused variables. 2. Get rid of last uses of EIGEN_USE_COST_MODEL. | 2016-05-23
* Fix some sign-compare warnings | 2016-05-22
* Make EIGEN_HAS_CONSTEXPR user configurable | 2016-05-20
* Make EIGEN_HAS_VARIADIC_TEMPLATES user configurable | 2016-05-20
* Make EIGEN_HAS_RVALUE_REFERENCES user configurable | 2016-05-20
* Rename EIGEN_HAVE_RVALUE_REFERENCES to EIGEN_HAS_RVALUE_REFERENCES | 2016-05-20
* Merged eigen/eigen into default | 2016-05-18
* Merge. | 2016-05-18
* Minor cleanups: 1. Get rid of unused variables. 2. Get rid of last uses of EIGEN_USE_COST_MODEL. | 2016-05-18
* Reduce overhead for small tensors and cheap ops by short-circuiting the cost computation and block size calculation in parallelFor. | 2016-05-17
* Allow vectorized padding on GPU. This helps speed things up a little.<br>Before:<br>BM_padding/10 5000000 460 217.03 MFlops/s<br>BM_padding/80 5000000 460 13899.40 MFlops/s<br>BM_padding/640 5000000 461 888421.17 MFlops/s<br>BM_padding/4K 5000000 460 54316322.55 MFlops/s<br>After:<br>BM_padding/10 5000000 454 220.20 MFlops/s<br>BM_padding/80 5000000 455 14039.86 MFlops/s<br>BM_padding/640 5000000 452 904968.83 MFlops/s<br>BM_padding/4K 5000000 411 60750049.21 MFlops/s | 2016-05-17
* Advertise the packet API of the tensor reducers iff the corresponding packet primitives are available. | 2016-05-18
* #if defined(EIGEN_USE_NONBLOCKING_THREAD_POOL) is now #if !defined(EIGEN_USE_SIMPLE_THREAD_POOL): the non-blocking thread pool is the default since it's more scalable, and one needs to request the old thread pool explicitly. | 2016-05-17
* Fixed compilation error | 2016-05-17
* Fixed compilation error in the tensor thread pool | 2016-05-17
* Merge upstream. | 2016-05-17
* Roll back changes to core. Move include of TensorFunctors.h up to satisfy dependence in TensorCostModel.h. | 2016-05-17
* Merged eigen/eigen into default | 2016-05-17
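The 2016-06-02 betainc entry above ports the cephes regularized incomplete beta integral to Eigen. As a point of reference for what that function computes, here is a minimal Python sketch that approximates I_x(a, b) = B(x; a, b) / B(a, b) by midpoint quadrature of the defining integral. It is purely illustrative: Eigen's port uses a continued-fraction expansion (incbeta_cfe), not this quadrature, and the function name `betainc` here is just chosen to match.

```python
import math

def betainc(a, b, x, steps=20000):
    """Regularized incomplete beta integral I_x(a, b) = B(x; a, b) / B(a, b),
    approximated by midpoint integration of t^(a-1) * (1-t)^(b-1) on [0, x]."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    h = x / steps
    acc = 0.0
    for i in range(steps):
        t = (i + 0.5) * h  # midpoint of each sub-interval
        acc += t ** (a - 1.0) * (1.0 - t) ** (b - 1.0)
    # Complete beta function B(a, b) computed via log-gamma for stability.
    log_beta = math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)
    return acc * h / math.exp(log_beta)

print(betainc(2.0, 2.0, 0.5))  # symmetry of I_x(a, a) gives 0.5
```

A handy sanity check, which the unit tests mentioned in the commit presumably also exploit, is that I_x(a, a) = 0.5 at x = 0.5 and I_x(1, 1) = x.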
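The 2016-06-03 "Improved the performance of full reductions" entry reports the speedup but not the mechanism. A common structure for fast full reductions (on GPU and multi-core alike) is a two-pass blocked reduction: each block reduces a contiguous slice to a partial result, then the partials are reduced. This sequential Python sketch only illustrates that shape; it is not Eigen's implementation, and `block_size` is an arbitrary illustrative parameter.

```python
def full_reduce(values, op, identity, block_size=256):
    """Two-pass reduction: per-block partial results first, then a final
    reduction over the partials. On a GPU each block would map to one
    thread block; here both passes are plain loops."""
    partials = []
    for start in range(0, len(values), block_size):
        acc = identity
        for v in values[start:start + block_size]:
            acc = op(acc, v)
        partials.append(acc)
    result = identity
    for p in partials:
        result = op(result, p)
    return result

print(full_reduce(list(range(1000)), lambda a, b: a + b, 0))  # 499500
```

The two-pass shape matters for performance because the first pass is embarrassingly parallel and keeps each block's accumulator in fast local storage.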
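The 2016-06-02 "syntactic sugar" entry enables scalar-on-the-left expressions such as `scalar + tensor` and `scalar - tensor`. In C++ this requires free operator overloads; the same idea can be shown in Python through the reflected operators (`__radd__` and friends) on a toy `Tensor` class. This class is entirely hypothetical and unrelated to Eigen's actual expression templates.

```python
class Tensor:
    """Toy 1-D tensor. The reflected operators are what make
    scalar-first expressions like `2 - t` work."""
    def __init__(self, data):
        self.data = list(data)

    def __radd__(self, scalar):      # scalar + tensor
        return Tensor(scalar + v for v in self.data)

    def __rmul__(self, scalar):      # scalar * tensor
        return Tensor(scalar * v for v in self.data)

    def __rsub__(self, scalar):      # scalar - tensor
        return Tensor(scalar - v for v in self.data)

    def __rtruediv__(self, scalar):  # scalar / tensor
        return Tensor(scalar / v for v in self.data)

    def __repr__(self):
        return f"Tensor({self.data})"

print(2 + Tensor([1, 2, 3]))   # Tensor([3, 4, 5])
print(2 - Tensor([1, 2, 3]))   # Tensor([1, 0, -1])
print(12 / Tensor([2, 3, 4]))  # Tensor([6.0, 4.0, 3.0])
```

Note that `scalar - tensor` and `scalar / tensor` are not commutative rewrites of the tensor-first forms, which is why they need dedicated overloads in both languages.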
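The 2016-06-02/06-03 scan entries add a generic scan operation, from which cumsum and cumprod are derived on TensorBase. The derivation is easy to show in a short Python sketch: a generic inclusive scan parameterized by a binary op and its identity, with cumsum and cumprod as two instantiations. Function names here are illustrative, not Eigen's API.

```python
def scan(values, op, identity):
    """Inclusive scan: out[i] = op(values[0], ..., values[i])."""
    out = []
    acc = identity
    for v in values:
        acc = op(acc, v)
        out.append(acc)
    return out

def cumsum(values):
    return scan(values, lambda a, b: a + b, 0)

def cumprod(values):
    return scan(values, lambda a, b: a * b, 1)

print(cumsum([1, 2, 3, 4]))   # [1, 3, 6, 10]
print(cumprod([1, 2, 3, 4]))  # [1, 2, 6, 24]
```

Any associative op with an identity fits the same template, which is exactly what makes the scan "generic".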
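The 2016-05-17 entry about short-circuiting the cost computation in parallelFor describes a simple but effective pattern: when the estimated total cost of the loop is tiny, skip block-size computation and thread dispatch entirely and run inline. This Python sketch shows only the control flow; the `threshold` value and the return tags are arbitrary illustrative devices, and the parallel branch is elided to a plain loop rather than a real thread pool.

```python
def parallel_for(n, cost_per_item, body, threshold=10000.0):
    """Sketch of the short-circuit: cheap ops on small tensors run inline,
    avoiding block-size computation and thread-pool dispatch overhead."""
    total_cost = n * cost_per_item
    if total_cost < threshold:
        for i in range(n):   # sequential fast path, zero dispatch overhead
            body(i)
        return "sequential"
    # A real implementation would derive a block size from the cost model
    # and hand blocks to a thread pool; elided here to a plain loop.
    for i in range(n):
        body(i)
    return "parallel"

hits = []
print(parallel_for(8, 1.0, hits.append))  # sequential
```

The win is that the overhead saved (cost modeling plus dispatch) can exceed the work itself for small tensors, which is precisely the case the commit targets.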