eigen - C++ library for linear algebra

	Commit message	Author	Age
...
*	Add an EIGEN_NO_CUDA option, and introduce EIGEN_CUDACC and EIGEN_CUDA_ARCH ↵	Gael Guennebaud	2017-07-17
\| \| \| \|	aliases
*	Pull the latest updates from trunk	Benoit Steiner	2016-10-05
\|\
\| *	Cleanup the cuda executor code.	Benoit Steiner	2016-10-04
\| \|
* \|	Partial OpenCL support via SYCL compatible with ComputeCpp CE.	Luke Iwanski	2016-09-19
\|/
*	Deleted dead code.	Benoit Steiner	2016-07-25
\|
*	bug #1255: comment out broken and unused line.	Gael Guennebaud	2016-07-25
\|
*	Use a single PacketSize variable	Benoit Steiner	2016-06-01
\|
*	Merge.	Rasmus Munk Larsen	2016-05-18
\|\
* \|	Minor cleanups: 1. Get rid of unused variables. 2. Get rid of last uses of ↵	Rasmus Munk Larsen	2016-05-18
\| \| \| \| \| \| \| \|	EIGEN_USE_COST_MODEL.
\| *	Reduce overhead for small tensors and cheap ops by short-circuiting the ↵	Rasmus Munk Larsen	2016-05-17
\|/ \| \| \|	cost computation and block size calculation in parallelFor.
*	#if defined(EIGEN_USE_NONBLOCKING_THREAD_POOL) is now #if ↵	Benoit Steiner	2016-05-17
\| \| \| \|	!defined(EIGEN_USE_SIMPLE_THREAD_POOL): the non blocking thread pool is the default since it's more scalable, and one needs to request the old thread pool explicitly.
*	Fixed compilation error	Benoit Steiner	2016-05-17
\|
*	Address comments by bsteiner.	Rasmus Munk Larsen	2016-05-12
\|
*	Improvements to parallelFor.	Rasmus Munk Larsen	2016-05-12
\| \| \| \|	Move some scalar functors from TensorFunctors. to Eigen core.
*	Strongly hint but don't force the compiler to unroll some loops in the ↵	Benoit Steiner	2016-05-05
\| \| \| \|	tensor executor. This results in up to 27% faster code.
*	Fixed several compilation warnings	Benoit Steiner	2016-04-21
\|
*	Don't crash when attempting to reduce empty tensors.	Benoit Steiner	2016-04-20
\|
*	Simplified the code that launches cuda kernels.	Benoit Steiner	2016-04-19
\|
*	Avoid an unnecessary copy of the evaluator.	Benoit Steiner	2016-04-19
\|
*	Get rid of void* casting when calling EvalRange::run.	Rasmus Munk Larsen	2016-04-15
\|
*	Eigen Tensor cost model part 2: Thread scheduling for standard evaluators ↵	Rasmus Munk Larsen	2016-04-14
\| \| \| \|	and reductions. The cost model is turned off by default.
*	Defer the decision to vectorize tensor CUDA code to the meta kernel. This ↵	Benoit Steiner	2016-04-12
\| \| \| \|	makes it possible to decide to vectorize or not depending on the capability of the target cuda architecture. In particular, this enables us to vectorize the processing of fp16 when running on devices of capability >= 5.3.
*	Prevent potential overflow.	Benoit Steiner	2016-03-28
\|
*	Avoid unnecessary conversions	Benoit Steiner	2016-03-23
\|
*	Fixed compilation warning	Benoit Steiner	2016-03-23
\|
*	Use a single Barrier instead of a collection of Notifications to reduce the ↵	Benoit Steiner	2016-03-22
\| \| \| \|	thread synchronization overhead
*	Replace std::vector with our own implementation, as using the stl when ↵	Benoit Steiner	2016-03-08
\| \| \| \|	compiling with nvcc and avx enabled leads to many issues.
*	Fix a couple of typos in the code.	Benoit Steiner	2016-03-07
\|
*	Made it possible to limit the number of blocks that will be used to evaluate ↵	Benoit Steiner	2016-02-01
\| \| \| \|	a tensor expression on a CUDA device. This makes it possible to set aside streaming multiprocessors for other computations.
*	Silenced several compilation warnings triggered by nvcc.	Benoit Steiner	2016-01-11
\|
*	Prevent nvcc from miscompiling the cuda metakernel. Unfortunately this ↵	Benoit Steiner	2016-01-08
\| \| \| \|	reintroduces some compilation warnings but it's much better than having to deal with random assertion failures.
*	Silenced some compilation warnings triggered by nvcc	Benoit Steiner	2015-12-17
\|
*	Don't create more cuda blocks than necessary	Benoit Steiner	2015-11-23
\|
*	Make it possible for a vectorized tensor expression to be executed in a CUDA ↵	Benoit Steiner	2015-11-11
\| \| \| \|	kernel.
*	Fixed CUDA compilation errors	Benoit Steiner	2015-11-11
\|
*	Refined the #ifdef __CUDACC__ guard to ensure that trying to compile ↵	Benoit Steiner	2015-10-23
\| \| \| \|	gpu code with a non cuda compiler results in a linking error instead of bogus code.
*	Use numext::mini/numext::maxi instead of std::min/std::max in the tensor code	Benoit Steiner	2015-08-28
\|
*	Avoid relying on a default value for the Vectorizable template parameter of ↵	Benoit Steiner	2015-07-15
\| \| \| \|	the EvalRange functor
*	Added support for multi gpu configuration to the GpuDevice class	Benoit Steiner	2015-07-15
\|
*	Enabled the vectorized evaluation of several tensor expressions that was ↵	Benoit Steiner	2015-07-01
\| \| \| \|	previously disabled by mistake
*	Moved away from std::async and std::future as the underlying mechanism for ↵	Benoit Steiner	2015-05-20
\| \| \| \| \| \|	the thread pool device. On several platforms, the functions passed to std::async are not scheduled in the order in which they are given to std::async, which leads to massive performance issues in the contraction code. Instead we now have a custom thread pool that ensures that the functions are picked up by the threads in the pool in the order in which they are enqueued in the pool.
*	Make sure that the copy constructor of the evaluator is always called before ↵	Benoit Steiner	2015-04-21
\| \| \| \|	launching the evaluation of a tensor expression on a cuda device.
*	Fixed off-by-one error that prevented the evaluation of small tensor ↵	Benoit Steiner	2015-02-27
\| \| \| \|	expressions from being vectorized
*	Fixed several compilation warnings reported by clang	Benoit Steiner	2015-02-25
\|
*	Fixed compilation error triggered when trying to vectorize a non ↵	Benoit Steiner	2015-02-10
\| \| \| \|	vectorizable cuda kernel.
*	Ensured that each thread has its own copy of the TensorEvaluator: this ↵	Benoit Steiner	2015-01-14
\| \| \| \|	avoids race conditions when the evaluator calls a non thread safe functor, e.g. when generating random numbers.
*	Fixed the evaluation of expressions involving tensors of 2 or 3 elements on ↵	Benoit Steiner	2014-11-18
\| \| \| \|	CUDA devices.
*	Use the proper index type	Benoit Steiner	2014-10-30
\|
*	Misc improvements and cleanups	Benoit Steiner	2014-10-13
\|
*	Fixed the tensor shuffling test	Benoit Steiner	2014-10-10
\|