eigen - C++ library for linear algebra

	Commit message	Author	Age
*	fix clang compilation	Gael Guennebaud	2016-07-04
\|
*	Workaround compilation issue with msvc	Gael Guennebaud	2016-07-04
\|
*	Made it possible to compile reductions for an old cuda architecture and run them on a recent gpu.	Benoit Steiner	2016-06-29
\|
*	Made the code compile when using CUDA architecture < 300	Benoit Steiner	2016-06-29
\|
*	Add missing CUDA kernel to tensor scan op	Igor Babuschkin	2016-06-29
\|	The TensorScanOp implementation was missing a CUDA kernel launch. This adds a simple placeholder implementation.
*	Don't store the scan axis in the evaluator of the tensor scan operation since it's only used in the constructor.	Benoit Steiner	2016-06-27
\|	Also avoid taking references to values that may become stale after a copy construction.
*	Return -1 from CurrentThreadId when called by thread outside the pool.	Rasmus Munk Larsen	2016-06-23
\|
*	Resolve merge.	Rasmus Munk Larsen	2016-06-23
\|\
\| *	bug #1241: does not emit anything for empty tensors	Gael Guennebaud	2016-06-23
\| \|
\| *	merge PR 194	Gael Guennebaud	2016-06-23
\| \|\
\| * \|	Silenced a couple of compilation warnings generated by xcode	Benoit Steiner	2016-06-22
\| \| \|
\| * \|	Turned the constructor of the PerThread struct into what is effectively a constant expression to make the code compatible with a wider range of compilers	Benoit Steiner	2016-06-22
\| \| \|
\| * \|	Handle empty tensors in the print functions	Benoit Steiner	2016-06-21
\| \| \|
\| * \|	Fixed the printing of rank-0 tensors	Benoit Steiner	2016-06-20
\| \| \|
\| * \|	Merged in ibab/eigen (pull request PR-197)	Benoit Steiner	2016-06-14
\| \|\ \	Implement exclusive scan option for Tensor library
\| * \| \|	Avoid generating pseudo random numbers that are multiples of 5: this helps	Benoit Steiner	2016-06-14
\| \| \| \|	spread the load over multiple cpus without having to rely on work stealing.
\| \| * \|	Implement exclusive scan option	Igor Babuschkin	2016-06-14
\| \|/ /
\| \| *	merge	Gael Guennebaud	2016-06-14
\| \| \|\
\| \| \|/
\| \|/\|
\| \| *	Update Tensor module to use bind1st_op and bind2nd_op	Gael Guennebaud	2016-06-14
\| \| \|
\| * \|	Merged in ibab/eigen (pull request PR-195)	Benoit Steiner	2016-06-10
\| \|\ \	Add small fixes to TensorScanOp
\| * \| \|	Don't refer to the half2 type unless it's been defined	Benoit Steiner	2016-06-10
\| \| \| \|
\| \| * \|	Add small fixes to TensorScanOp	Igor Babuschkin	2016-06-07
\| \| \| \|
* \| \| \|	size_t -> int	Rasmus Munk Larsen	2016-06-03
\| \| \| \|
* \| \| \|	Add CurrentThreadId and NumThreads methods to Eigen threadpools and TensorDeviceThreadPool.	Rasmus Munk Larsen	2016-06-03
\| \| \| \|
\| * \| \|	Simplified the code that dispatches vectorized reductions on GPU	Benoit Steiner	2016-06-09
\| \| \| \|
\| * \| \|	Fixed definition of some of the reducer_traits	Benoit Steiner	2016-06-09
\| \| \| \|
\| * \| \|	Use signed integers more consistently to encode the number of threads to use to evaluate a tensor expression.	Benoit Steiner	2016-06-09
\| \| \| \|
\| * \| \|	Improved code formatting	Benoit Steiner	2016-06-09
\| \| \| \|
\| * \| \|	Improved support for vectorization of 16-bit floats	Benoit Steiner	2016-06-09
\| \| \| \|
\| * \| \|	Added missing EIGEN_DEVICE_FUNC	Benoit Steiner	2016-06-07
\| \|/ /
\| * \|	Fixed compilation error with gcc 4.4	Benoit Steiner	2016-06-06
\| \| \|
\| * \|	Misc small improvements to the reduction code.	Benoit Steiner	2016-06-06
\| \| \|
\| * \|	Moved assertions to the constructor to make the code more portable	Benoit Steiner	2016-06-06
\| \|/
\| *	Add TernaryFunctors and the betainc SpecialFunction.	Eugene Brevdo	2016-06-02
\| \|	TernaryFunctors and their executors allow operations on 3-tuples of inputs. API fully implemented for Arrays and Tensors based on binary functors.
\| \|	Ported the cephes betainc function (regularized incomplete beta integral) to Eigen, with support for CPU and GPU, floats, doubles, and half types.
\| \|	Added unit tests in array.cpp and cxx11_tensor_cuda.cu
\| \|	Collapsed revision
\| \|	* Merged helper methods for betainc across floats and doubles.
\| \|	* Added TensorGlobalFunctions with betainc(). Removed betainc() from TensorBase.
\| \|	* Clean up CwiseTernaryOp checks, change igamma_helper to cephes_helper.
\| \|	* betainc: merge incbcf and incbd into incbeta_cfe. and more cleanup.
\| \|	* Update TernaryOp and SpecialFunctions (betainc) based on review comments.
\| *	Use array_prod to compute the number of elements contained in the input tensor expression	Benoit Steiner	2016-06-04
\| \|
\| *	Merged in ibab/eigen (pull request PR-192)	Benoit Steiner	2016-06-03
\| \|\	Add generic scan method
\| * \|	Improved the performance of full reductions.	Benoit Steiner	2016-06-03
\|/ /	AFTER:
\| \|	BM_fullReduction/10	4541	4543	154017	21.0M items/s
\| \|	BM_fullReduction/64	5191	5193	100000	752.5M items/s
\| \|	BM_fullReduction/512	9588	9588	71361	25.5G items/s
\| \|	BM_fullReduction/4k	244314	244281	2863	64.0G items/s
\| \|	BM_fullReduction/5k	359382	359363	1946	64.8G items/s
\| \|	BEFORE:
\| \|	BM_fullReduction/10	9085	9087	74395	10.5M items/s
\| \|	BM_fullReduction/64	9478	9478	72014	412.1M items/s
\| \|	BM_fullReduction/512	14643	14646	46902	16.7G items/s
\| \|	BM_fullReduction/4k	260338	260384	2678	60.0G items/s
\| \|	BM_fullReduction/5k	385076	385178	1818	60.5G items/s
\| *	Add generic scan method	Igor Babuschkin	2016-06-03
\|/
*	Align the first element of the Waiter struct instead of padding it. This reduces its memory footprint a bit while achieving the goal of preventing false sharing	Benoit Steiner	2016-06-02
\|
*	Add syntactic sugar to Eigen tensors to allow more natural syntax.	Rasmus Munk Larsen	2016-06-02
\|	Specifically, this enables expressions involving: scalar + tensor, scalar * tensor, scalar / tensor, scalar - tensor
*	Add tensor scan op	Igor Babuschkin	2016-06-02
\|	This is the initial implementation of a generic scan operation. Based on this, cumsum and cumprod methods have been added to TensorBase.
*	Use a single PacketSize variable	Benoit Steiner	2016-06-01
\|
*	Fixed compilation warning	Benoit Steiner	2016-06-01
\|
*	Silenced compilation warning generated by nvcc.	Benoit Steiner	2016-06-01
\|
*	Added support for mean reductions on fp16	Benoit Steiner	2016-06-01
\|
*	Only enable optimized reductions of fp16 if the reduction functor supports them	Benoit Steiner	2016-05-31
\|
*	Reimplement clamp as a static function.	Benoit Steiner	2016-05-27
\|
*	Use NULL instead of nullptr to preserve the compatibility with cxx03	Benoit Steiner	2016-05-27
\|
*	Added a new operation to enable more powerful tensor indexing.	Benoit Steiner	2016-05-27
\|
*	Fixed some compilation warnings	Benoit Steiner	2016-05-26
\|
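
Several commits in this log add a generic scan operation to the Tensor module, with cumsum and cumprod built on it and a later exclusive option. The sketch below illustrates what inclusive and exclusive scans compute, using only the C++ standard library on a flat std::vector. It is not Eigen's implementation or API: the names scan, cumsum and cumprod as free functions, and the init and exclusive parameters, are assumptions made for illustration.

```cpp
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical helper (not part of Eigen): scans `in` with the binary op `op`.
// Inclusive: out[i] = init op in[0] op ... op in[i].
// Exclusive: out[i] = init op in[0] op ... op in[i-1], so out[0] == init.
template <typename T, typename Op>
std::vector<T> scan(const std::vector<T>& in, T init, Op op, bool exclusive) {
  std::vector<T> out(in.size());
  T accum = init;
  for (std::size_t i = 0; i < in.size(); ++i) {
    if (exclusive) {
      out[i] = accum;            // emit the running value before folding in in[i]
      accum = op(accum, in[i]);
    } else {
      accum = op(accum, in[i]);  // fold in in[i] first, then emit
      out[i] = accum;
    }
  }
  return out;
}

// Cumulative sum: a scan with + and identity 0.
std::vector<int> cumsum(const std::vector<int>& in, bool exclusive = false) {
  return scan(in, 0, std::plus<int>(), exclusive);
}

// Cumulative product: a scan with * and identity 1.
std::vector<int> cumprod(const std::vector<int>& in, bool exclusive = false) {
  return scan(in, 1, std::multiplies<int>(), exclusive);
}
```

For the input {1, 2, 3, 4}, cumsum gives {1, 3, 6, 10} in inclusive mode and {0, 1, 3, 6} in exclusive mode; cumprod gives {1, 2, 6, 24} and {1, 1, 2, 6} respectively. A tensor scan applies the same recurrence along one chosen axis.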