| Commit message (Collapse) | Author | Age |
| |
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Confirm that the contents of the install directory are identical
before and after this simplifying patch.
```bash
hg clone <<Eigen>>
mkdir eigen-bld
cd eigen-bld
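# Install the unpatched tree and record its file list
cmake ../Eigen -DCMAKE_INSTALL_PREFIX:PATH=/tmp/pre_eigen_modernize
make install
find /tmp/pre_eigen_modernize | sort >/tmp/bef
# Apply this patch
cmake ../Eigen -DCMAKE_INSTALL_PREFIX:PATH=/tmp/post_eigen_modernize
make install
# Rewrite the prefix so both lists are directly comparable
find /tmp/post_eigen_modernize | sed 's/post_e/pre_e/g' | sort >/tmp/aft
diff /tmp/bef /tmp/aft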
```
|
|
|
|
|
|
|
|
|
| |
Features committed in 2016 already require CMake version 2.8.11.
`sergiu Tue Nov 22 12:25:06 2016 +0100: target_compile_definitions`
Set the minimum cmake version to the minimum version that
is capable of compiling or installing the code base.
|
| |
|
|
|
|
|
| |
Ancient CMake versions required upper-case commands. Later command names
became case-insensitive. Now the preferred style is lower-case.
|
|
|
|
|
|
| |
Ancient versions of CMake required else(), endif(), and similar block
termination commands to have arguments matching the command starting the block.
This is no longer the preferred style.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
2. Simplify handling of special cases by taking advantage of the fact that the
builtin vrsqrt approximation handles negative, zero and +inf arguments correctly.
This speeds up the SSE and AVX implementations by ~20%.
3. Make the Newton-Raphson formula used for rsqrt more numerically robust:
Before: y = y * (1.5 - x/2 * y^2)
After: y = y * (1.5 - y * (x/2) * y)
Forming y^2 can overflow for very small (denormalized) values of x or underflow for very large ones, while the intermediate product y * (x/2) ~= sqrt(x)/2 stays in range. For AVX512, this makes it possible to compute accurate results for denormal inputs down to ~1e-42 in single precision.
4. Add a faster double precision implementation for Knights Landing using the vrsqrt28 instruction and a single Newton-Raphson iteration.
Benchmark results: https://bitbucket.org/snippets/rmlarsen/5LBq9o
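The reordering in item 3 can be illustrated with a minimal scalar sketch. This is not the vectorized Eigen code: `refine_before`, `refine_after`, and `rough_rsqrt` are hypothetical names, and `rough_rsqrt` only imitates a hardware estimate by adding a 0.1% error to the exact value. It assumes IEEE-754 single precision with denormals enabled.

```cpp
#include <cassert>
#include <cmath>

// One Newton-Raphson step for 1/sqrt(x), written in the "before" order.
// y * y is formed first and overflows to +inf once y exceeds ~1.8e19.
inline float refine_before(float x, float y) {
  return y * (1.5f - (x * 0.5f) * (y * y));
}

// The "after" order: y * (x/2) is formed first, which is roughly
// sqrt(x)/2 and therefore stays representable; multiplying by y
// again gives a value close to 0.5.
inline float refine_after(float x, float y) {
  return y * (1.5f - (y * (x * 0.5f)) * y);
}

// Hypothetical stand-in for a hardware rsqrt estimate:
// the exact value perturbed by a 0.1% relative error.
inline float rough_rsqrt(float x) {
  return (1.0f / std::sqrt(x)) * 1.001f;
}
```

With a denormal input such as `x = 1e-40f`, the estimate is about 1e20, so `y * y` exceeds the largest finite float and `refine_before` returns a non-finite value, while `refine_after` returns a finite result close to 1/sqrt(x).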
|
|
|
|
| |
plog/pexp, but the latter was disabled on some compilers
|
| |
|
|
|
|
| |
constant ADs.
|
| |
|
| |
|
| |
|
|
|
|
| |
fixed-size matrices.
|
| |
|
| |
|
|
|
|
| |
Patch adapted from Hans Johnson's PR 748.
|
|
|
|
| |
device.deallocate() / evaluator.cleanup() complete, since the device might be destroyed after on_done() runs.
|
| |
|
| |
|
|
|
|
| |
Add a new EIGEN_HAS_INTRINSIC_INT128 macro, and use this instead of __SIZEOF_INT128__. This fixes related issues with TensorIntDiv.h when building with Clang for Windows, where support for 128-bit integer arithmetic is advertised but broken in practice.
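A feature macro of this kind might look like the sketch below. `DEMO_HAS_INTRINSIC_INT128`, `mul_high`, and `mul_high_portable` are hypothetical names, and the exact condition (excluding Clang on Windows) is an assumption based on the description above, not a copy of Eigen's definition. The 64x64 high-word multiply is only an illustrative consumer of the macro.

```cpp
#include <cassert>
#include <cstdint>

// Assumed condition: trust __SIZEOF_INT128__ except for Clang on Windows,
// where the type is advertised but (per the commit message) unusable.
#if defined(__SIZEOF_INT128__) && !(defined(__clang__) && defined(_WIN32))
#define DEMO_HAS_INTRINSIC_INT128 1
#else
#define DEMO_HAS_INTRINSIC_INT128 0
#endif

// Portable fallback: upper 64 bits of a 64x64-bit product,
// built from four 32x32-bit partial products.
inline std::uint64_t mul_high_portable(std::uint64_t a, std::uint64_t b) {
  const std::uint64_t a_lo = a & 0xFFFFFFFFu, a_hi = a >> 32;
  const std::uint64_t b_lo = b & 0xFFFFFFFFu, b_hi = b >> 32;
  const std::uint64_t lo_lo = a_lo * b_lo;
  const std::uint64_t hi_lo = a_hi * b_lo;
  const std::uint64_t lo_hi = a_lo * b_hi;
  const std::uint64_t hi_hi = a_hi * b_hi;
  // Cannot overflow: the three terms sum to at most 2^64 - 1.
  const std::uint64_t cross = (lo_lo >> 32) + (hi_lo & 0xFFFFFFFFu) + lo_hi;
  return hi_hi + (hi_lo >> 32) + (cross >> 32);
}

// Use the 128-bit intrinsic type only when the macro says it is safe.
inline std::uint64_t mul_high(std::uint64_t a, std::uint64_t b) {
#if DEMO_HAS_INTRINSIC_INT128
  return static_cast<std::uint64_t>(
      (static_cast<unsigned __int128>(a) * b) >> 64);
#else
  return mul_high_portable(a, b);
#endif
}
```

Routing every use through one macro means a broken toolchain only needs to be excluded in a single place, and callers get the portable path automatically.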
|
|
|
|
|
|
|
|
| |
https://bitbucket.org/eigen/eigen/commits/668ab3fc474e54c7919eda4fbaf11f3a99246494
.
std::array is still not supported in CUDA device code on Windows.
|
|\
| |
| |
| | |
Remove internal::smart_copy and replace with std::copy
|
| | |
|
|/ |
|
| |
|
| |
|
|
|
|
|
| |
* The specialization of the array class in a different namespace for GCC <= 6.4
* The implicit call to the `std::array` constructor using an initializer list for GCC <= 6.1
|
|\
| |
| |
| | |
Fix for the HIP build+test errors.
|
| |
| |
| |
| | |
Add async evaluation to a number of ops.
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
The errors were introduced by this commit:
After the above-mentioned commit, some of the tests started failing with the following error:
```
Built target cxx11_tensor_reduction
Building HIPCC object unsupported/test/CMakeFiles/cxx11_tensor_reduction_gpu_5.dir/cxx11_tensor_reduction_gpu_5_generated_cxx11_tensor_reduction_gpu.cu.o
In file included from /home/rocm-user/eigen/unsupported/test/cxx11_tensor_reduction_gpu.cu:16:
In file included from /home/rocm-user/eigen/unsupported/Eigen/CXX11/Tensor:117:
/home/rocm-user/eigen/unsupported/Eigen/CXX11/src/Tensor/TensorBlockV2.h:155:5: error: the field type is not amp-compatible
DestinationBufferKind m_kind;
^
/home/rocm-user/eigen/unsupported/Eigen/CXX11/src/Tensor/TensorBlockV2.h:211:3: error: the field type is not amp-compatible
DestinationBuffer m_destination;
^
```
For some reason, HIPCC does not accept device code containing enum types whose base type is not explicitly declared. The fix is trivial: explicitly state "int" as the base type.
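The shape of that fix can be sketched in plain C++11 as follows. `BufferKind`, its enumerators, and `Destination` are hypothetical names chosen for illustration and are not taken from TensorBlockV2.h.

```cpp
#include <cassert>
#include <type_traits>

// Form described as problematic: the underlying type is left to the compiler.
//   enum BufferKind { kEmpty, kContiguous, kStrided };

// Form described as the fix: the base type is spelled out as int.
enum BufferKind : int { kEmpty, kContiguous, kStrided };

// A struct holding the enum as a field, mirroring the failing pattern
// in the error log (an enum-typed data member).
struct Destination {
  BufferKind kind;
};

// With an explicit base type the underlying type is guaranteed to be int.
static_assert(
    std::is_same<std::underlying_type<BufferKind>::type, int>::value,
    "BufferKind must be backed by int");
```

Enumerator names and values are unchanged by the explicit base type, so existing host code that uses the enum should compile as before.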
|
|/
|
|
| |
c++11 functionality with older compilers.
|
| |
|
| |
|
| |
|
| |
|
| |
|
|
|
|
| |
reverse op
|
|\
| |
| |
| | |
Block evaluation for TensorChipping + fixed bugs in TensorPadding and TensorSlicing
|
| |
| |
| |
| | |
fallback to std::is_convertible when c++11 is enabled.
|
| | |
|
|/
|
|
| |
TensorSlicing
|
|
|
|
| |
7a43af1a335da2c0489b4119a33ee1cbff0c15d6
|
|
|
|
| |
number of elements available.
|
| |
|
| |
|
| |
|
| |
|
| |
|