| Commit message (Collapse) | Author | Age |
|
|
|
| |
Also, document LinSpaced only where it is implemented
|
| |
|
| |
|
|
|
|
| |
runtime
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Not having this attribute results in the following failures in the `--config=rocm` TF build.
```
In file included from tensorflow/core/kernels/cross_op_gpu.cu.cc:20:
In file included from ./tensorflow/core/framework/register_types.h:20:
In file included from ./tensorflow/core/framework/numeric_types.h:20:
In file included from ./third_party/eigen3/unsupported/Eigen/CXX11/Tensor:1:
In file included from external/eigen_archive/unsupported/Eigen/CXX11/Tensor:140:
external/eigen_archive/unsupported/Eigen/CXX11/src/Tensor/TensorChipping.h:356:37: error: 'Eigen::constCast': no overloaded function has restriction specifiers that are compatible with the ambient context 'data'
typename Storage::Type result = constCast(m_impl.data());
^
external/eigen_archive/unsupported/Eigen/CXX11/src/Tensor/TensorChipping.h:356:37: error: 'Eigen::constCast': no overloaded function has restriction specifiers that are compatible with the ambient context 'data'
external/eigen_archive/unsupported/Eigen/CXX11/src/Tensor/TensorAssign.h:148:56: note: in instantiation of member function 'Eigen::TensorEvaluator<const Eigen::TensorChippingOp<1, Eigen::TensorMap<Eigen::Tensor<int, 2, 1, long>, 16, MakePointer> >, Eigen::GpuDevice>::data' requested here
return m_rightImpl.evalSubExprsIfNeeded(m_leftImpl.data());
```
Adding the `EIGEN_DEVICE_FUNC` attribute resolves those errors.
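The error above arises when a function callable from device code calls a helper that is only declared for the host. A minimal sketch of the pattern follows, assuming a hypothetical macro `SKETCH_DEVICE_FUNC` and helper `sketch_const_cast` that stand in for `EIGEN_DEVICE_FUNC` and `constCast`; this is not Eigen's actual source.

```cpp
// Hypothetical stand-in for EIGEN_DEVICE_FUNC: under a CUDA or HIP compiler
// it marks the function as callable from both host and device code, and it
// expands to nothing for a plain host compiler.
#if defined(__CUDACC__) || defined(__HIPCC__)
#define SKETCH_DEVICE_FUNC __host__ __device__
#else
#define SKETCH_DEVICE_FUNC
#endif

// Hypothetical helper in the spirit of constCast: strips const from a pointer.
// Without the attribute, a device-side caller has no compatible overload.
template <typename T>
SKETCH_DEVICE_FUNC inline T* sketch_const_cast(const T* data) {
  return const_cast<T*>(data);
}
```

On a host-only build the macro is empty, so the helper behaves like an ordinary inline function.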
|
|\
| |
| |
| |
| |
| |
| | |
[SYCL] :
Approved-by: Gael Guennebaud <g.gael@free.fr>
Approved-by: Rasmus Larsen <rmlarsen@google.com>
|
| | |
|
| |
| |
| |
| |
| |
| |
| | |
* Modifying TensorDeviceSYCL to use `EIGEN_THROW_X`.
* Modifying TensorMacro to use `EIGEN_TRY/CATCH(X)` macro.
* Modifying TensorReverse.h to use `EIGEN_DEVICE_REF` instead of `&`.
* Fixing the SYCL device macro in SpecialFunctionsImpl.h.
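The try/catch macros mentioned above let the same source compile with or without exception support. The sketch below illustrates the idea with hypothetical `SKETCH_*` names; it is an assumption about the general technique, not a copy of Eigen's `EIGEN_TRY`, `EIGEN_CATCH(X)`, or `EIGEN_THROW_X` definitions.

```cpp
#include <cstdlib>
#include <stdexcept>

// Hypothetical macros: real try/catch when exceptions are enabled,
// plain control flow plus abort() when they are not.
#if defined(__cpp_exceptions) || defined(_CPPUNWIND)
#define SKETCH_TRY try
#define SKETCH_CATCH(X) catch (X)
#define SKETCH_THROW_X(X) throw X
#else
#define SKETCH_TRY if (true)
#define SKETCH_CATCH(X) else
#define SKETCH_THROW_X(X) std::abort()
#endif

// Returns a / b, or 0 when b is zero and exceptions are available.
inline int checked_divide(int a, int b) {
  SKETCH_TRY {
    if (b == 0) SKETCH_THROW_X(std::runtime_error("division by zero"));
    return a / b;
  }
  SKETCH_CATCH(const std::runtime_error&) {
    return 0;
  }
}
```

With exceptions disabled, the handler body becomes an unreachable `else` branch and a throw turns into `std::abort()`.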
|
| | |
|
|/
|
|
|
| |
* an interface for SYCL buffers to behave as a non-dereferenceable pointer
* an interface for placeholder accessor to behave like a pointer on both host and device
|
| |
|
| |
|
|
|
|
| |
Eigen::GpuDevice::synchronize() from device code, but not when calling from a non-GPU compilation unit.
|
| |
|
|\ |
|
| |
| |
| |
| | |
block access when preferred
|
|/|
| |
| |
| |
| |
| |
| |
| |
| |
| | |
module required to run it on devices supporting SYCL.
* Abstracting the pointer type so that both SYCL memory and pointer can be captured.
* Converting SYCL virtual pointer to SYCL device memory in Eigen evaluator class.
* Binding SYCL placeholder accessor to command group handler by using bind method in Eigen evaluator node.
* Adding SYCL macro for controlling loop unrolling.
* Modifying the TensorDeviceSycl.h and SYCL executor method to adopt the above changes.
|
|/
|
|
|
|
|
|
|
|
| |
module required to run it on devices supporting SYCL.
* Abstracting the pointer type so that both SYCL memory and pointer can be captured.
* Converting SYCL virtual pointer to SYCL device memory in Eigen evaluator class.
* Binding SYCL placeholder accessor to command group handler by using bind method in Eigen evaluator node.
* Adding SYCL macro for controlling loop unrolling.
* Modifying the TensorDeviceSycl.h and SYCL executor method to adopt the above changes.
|
|
|
|
|
|
|
|
| |
Eigen unsupported modules on devices supporting SYCL.
* Adding SYCL memory model
* Enabling/Disabling SYCL backend in Core
* Supporting Vectorization
|
| |
|
| |
|
| |
|
| |
|
| |
|
| |
|
|
|
|
| |
clause.
|
| |
|
|
|
|
|
|
| |
1. Fix buggy pcmp_eq and unit test for half types.
2. Add unit test for pselect and add specializations for SSE 4.1, AVX512, and half types.
3. Get rid of FIXME: Implement faster pnegate for half by XOR'ing with a sign bit mask.
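Item 3 relies on the fact that an IEEE 754 half-precision value keeps its sign in the top bit, so negation is a single XOR. A scalar sketch of that idea, assuming a hypothetical `HalfBits` wrapper around the raw 16-bit pattern (the packet implementation in the commit is not shown here):

```cpp
#include <cstdint>

// Hypothetical wrapper around the raw binary16 bit pattern.
struct HalfBits {
  std::uint16_t x;
};

// Negate by flipping the sign bit (bit 15); exponent and mantissa are untouched.
inline HalfBits negate_half(HalfBits a) {
  const std::uint16_t kSignMask = 0x8000u;
  return HalfBits{static_cast<std::uint16_t>(a.x ^ kSignMask)};
}
```

For instance, `0x3C00` encodes 1.0 in binary16, and flipping the sign bit gives `0xBC00`, which encodes -1.0.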
|
|
|
|
|
| |
(grafted from 427f2f66d69ae9b124c2f8bcd927fb6e19e07e91)
|
|
|
|
| |
EIGEN_HAS_TYPE_TRAITS is off.
|
|
|
|
| |
clang.
|
|\
| |
| |
| |
| |
| | |
Minor build improvements
Approved-by: Rasmus Larsen <rmlarsen@google.com>
|
| |
| |
| |
| | |
CUDA build failures.
|
|/
|
|
|
|
|
|
| |
* Allow specifying multiple GPU architectures, e.g.:
  cmake -DEIGEN_CUDA_COMPUTE_ARCH="60;70"
* Pass the CUDA SDK path to clang. Without it, clang defaults to /usr/local/cuda,
  which may not be the right location if cmake was invoked with
  -DCUDA_TOOLKIT_ROOT_DIR=/some/other/CUDA/path
|
|
|
|
| |
Problem reported on https://stackoverflow.com/questions/56395899
|
|\
| |
| |
| | |
Fix for HIP build errors that were introduced by a commit earlier this week
|
| | |
|
|/ |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
That broke builds for users whose compilers refuse to proceed with
that, for example when -Wshadow is combined with -Werror:
"""
./Eigen/src/Core/products/GeneralMatrixVector.h:356:10: error: declaration shadows a static data member of 'general_matrix_vector_product<type-parameter-0-0, type-parameter-0-1, type-parameter-0-2, 1, ConjugateLhs, type-parameter-0-4, type-parameter-0-5, ConjugateRhs, Version>' [-Werror,-Wshadow]
LhsPacketSize = Traits::LhsPacketSize,
^
./Eigen/src/Core/products/GeneralMatrixVector.h:307:22: note: previous declaration is here
static const Index LhsPacketSize = Traits::LhsPacketSize;
"""
|
|
|
|
|
| |
This fixes compilation issues with RealScalar types that are not implicitly castable from Index (e.g. ceres Jet types).
Reported by Peter Anderson-Sprecher via email.
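A small sketch of the kind of problem described, assuming a hypothetical scalar type `ExplicitScalar` whose constructor is `explicit` (as a stand-in for types such as ceres Jets) and a hypothetical function `scale_by_count`; the actual change in the commit is not shown in this log.

```cpp
#include <cstddef>

// Hypothetical scalar type that cannot be implicitly created from an integer.
struct ExplicitScalar {
  double value;
  explicit ExplicitScalar(double v) : value(v) {}
};

inline ExplicitScalar operator*(const ExplicitScalar& a, const ExplicitScalar& b) {
  return ExplicitScalar(a.value * b.value);
}

// Writing `x * n` directly would not compile for ExplicitScalar, because no
// implicit conversion from the index type exists. Constructing the scalar
// explicitly works for built-in and user-defined scalar types alike.
template <typename RealScalar>
RealScalar scale_by_count(const RealScalar& x, std::ptrdiff_t n) {
  return x * RealScalar(static_cast<double>(n));
}
```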
|
|
|
|
|
|
| |
https://reviews.llvm.org/D16177
and are part of LLVM 3.8.0.
|
|\
| |
| |
| |
| |
| | |
Make Eigen build with CUDA 10 and clang.
Approved-by: Justin Lebar <justin.lebar@gmail.com>
|
| |\ |
|
|\ \ \
| | | |
| | | |
| | | | |
Eigen: Fix MSVC C++17 language standard detection logic
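For context, MSVC reports `__cplusplus` as `199711L` unless `/Zc:__cplusplus` is passed, and exposes the selected standard through `_MSVC_LANG` instead. A sketch of detection logic along those lines, using hypothetical `SKETCH_*` macro names rather than Eigen's own:

```cpp
// Pick the macro that reflects the language standard actually in use.
#if defined(_MSVC_LANG)
#define SKETCH_CPLUSPLUS _MSVC_LANG
#else
#define SKETCH_CPLUSPLUS __cplusplus
#endif

// Hypothetical feature flag derived from the detected standard.
#if SKETCH_CPLUSPLUS >= 201703L
#define SKETCH_HAS_CXX17 1
#else
#define SKETCH_HAS_CXX17 0
#endif

constexpr bool sketch_has_cxx17() { return SKETCH_HAS_CXX17 != 0; }
```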
|
| | | | |
|
|\ \ \ \
| | | | |
| | | | |
| | | | | |
Always evaluate Tensor expressions with broadcasting via tiled evaluation code path
|
|\ \ \ \ \
| | | | | |
| | | | | |
| | | | | |
| | | | | |
| | | | | | |
Speed up GEMV on AVX-512 builds, just as done for GEBP previously.
Approved-by: Rasmus Larsen <rmlarsen@google.com>
|
| |/ / / /
|/| | | |
| | | | |
| | | | | |
code path
|
| |_|/ /
|/| | | |
|
| |_|/
|/| | |
|