| Commit message (Collapse) | Author | Age |
Derivative of the incomplete Gamma function and the sample of a Gamma random variable
Approved-by: Benoit Steiner <benoit.steiner.goog@gmail.com>
|
variable.
In addition to igamma(a, x), this code implements:
* igamma_der_a(a, x) = d igamma(a, x) / da -- derivative of igamma with respect to the parameter
* gamma_sample_der_alpha(alpha, sample) -- reparameterization derivative of a Gamma(alpha, 1) random variable sample with respect to the alpha parameter
The derivatives are computed by forward mode differentiation of the igamma(a, x) code. Although gamma_sample_der_alpha can be implemented via igamma_der_a, a separate function is more accurate and efficient due to analytical cancellation of some terms. All three functions are implemented by a method parameterized with "mode" that always computes the derivatives, but does not return them unless required by the mode. The compiler is expected to (and, based on benchmarks, does) skip the unnecessary computations depending on the mode.
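The "mode" pattern described above can be illustrated with a minimal sketch. It does not use igamma itself: the function f(a) = a^2 exp(-a), the `Mode` enum, and the name `eval` are all hypothetical stand-ins chosen to keep the example short. The routine always carries value and derivative together (forward mode) and the compile-time mode selects what is returned, so an optimizer is free to drop the unused half.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical mode tag; the names are illustrative, not Eigen's.
enum class Mode { Value, Derivative };

// Evaluates f(a) = a^2 * exp(-a) and df/da in one pass.
// Each intermediate carries its value and its derivative w.r.t. a.
template <Mode mode>
double eval(double a) {
  const double u = std::exp(-a);
  const double du = -u;         // d/da exp(-a)
  const double v = a * a;
  const double dv = 2.0 * a;    // d/da a^2
  const double f = u * v;
  const double df = du * v + u * dv;  // product rule
  // The mode is a compile-time constant, so the unused branch
  // (and the computations feeding only it) can be eliminated.
  return mode == Mode::Value ? f : df;
}

// Usage check: f(1) = exp(-1) and f'(a) = (2a - a^2) exp(-a), so f'(2) = 0.
inline void check_eval() {
  assert(std::fabs(eval<Mode::Value>(1.0) - std::exp(-1.0)) < 1e-12);
  assert(std::fabs(eval<Mode::Derivative>(2.0)) < 1e-12);
}
```

Whether a given compiler actually removes the unused derivative arithmetic has to be confirmed by benchmarks, as the commit message notes for the real code.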
|
This commit enables the use of Eigen on HIP kernels / AMD GPUs. Support has been added along the same lines as what already exists for using Eigen in CUDA kernels / NVidia GPUs.
Application code needs to explicitly define EIGEN_USE_HIP when using Eigen in HIP kernels. This is because some of the CUDA headers get picked up by default when Eigen is compiled, irrespective of whether the underlying compiler is CUDACC/NVCC (e.g. Eigen/src/Core/arch/CUDA/Half.h). In order to maintain this behavior, the EIGEN_USE_HIP macro is used to switch to the HIP versions of those header files (see Eigen/Core and unsupported/Eigen/CXX11/Tensor).
Use the "-DEIGEN_TEST_HIP" cmake option to enable the HIP specific unit tests.
|
specializations. Otherwise causes problems with small fixed size matrix multiplication (call to
0x00 in call_assignment_no_alias in debug mode or trap in release with CUDA 9.1).
|
indicate a function is not __declspec(nothrow)
|
bug #1548
The macro EIGEN_IDEAL_MAX_ALIGN_BYTES is being incorrectly set to 32
on AVX512 builds. It should be set to 64. In the current code it is
only set to 64 if the macro EIGEN_VECTORIZE_AVX512 is defined. This
macro does get defined in AVX512 builds in Core, but only after Macros.h,
the file that defines EIGEN_IDEAL_MAX_ALIGN_BYTES, has been included.
This commit fixes the issue by setting EIGEN_IDEAL_MAX_ALIGN_BYTES to
64 if __AVX512F__ is defined.
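The idea of the fix, keying the alignment on a compiler-provided macro rather than on a library macro that is defined later, can be sketched as follows. The macro name `SKETCH_IDEAL_MAX_ALIGN_BYTES`, the 32-byte and 16-byte fallbacks, and the helper types are assumptions made for illustration; they are not copied from Macros.h.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// __AVX512F__ is set by the compiler itself, so it is visible no matter
// which header is included first. The fallback values are assumptions.
#if defined(__AVX512F__)
  #define SKETCH_IDEAL_MAX_ALIGN_BYTES 64
#elif defined(__AVX__)
  #define SKETCH_IDEAL_MAX_ALIGN_BYTES 32
#else
  #define SKETCH_IDEAL_MAX_ALIGN_BYTES 16
#endif

// A 64-byte payload aligned to the selected boundary.
struct alignas(SKETCH_IDEAL_MAX_ALIGN_BYTES) AlignedBlock {
  float data[16];
};

// True if p sits on the ideal alignment boundary.
inline bool is_ideally_aligned(const void* p) {
  return reinterpret_cast<std::uintptr_t>(p) % alignof(AlignedBlock) == 0;
}

// Usage check: a stack object of the aligned type honors the boundary.
inline void check_alignment() {
  AlignedBlock b = {};
  assert(alignof(AlignedBlock) == SKETCH_IDEAL_MAX_ALIGN_BYTES);
  assert(is_ideally_aligned(&b));
}
```

Because the selection depends only on predefined compiler macros, it gives the same answer regardless of include order, which is the property the original check lacked.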
|
PGI (the latter being the one that has the wrong prototype).
|
The functions are conventionally called i0e and i1e. The exponentially scaled version is more numerically stable. The standard Bessel functions can be obtained from the scaled ones, e.g. i0(x) = exp(|x|) i0e(x).
The code is ported from Cephes and tested against SciPy.
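The relation between the scaled and unscaled functions can be checked with a small sketch. It evaluates I0 by its power series, which is an assumption made for brevity and is only suitable for moderate |x|; it is not the Cephes algorithm used by the commit, and the function names are hypothetical.

```cpp
#include <cassert>
#include <cmath>

// I0(x) = sum_{k>=0} ((x/2)^2)^k / (k!)^2, adequate for moderate |x| only.
inline double i0_series(double x) {
  const double q = 0.25 * x * x;  // (x/2)^2
  double term = 1.0;
  double sum = 1.0;
  for (int k = 1; k < 500; ++k) {
    term *= q / (static_cast<double>(k) * static_cast<double>(k));
    sum += term;
    if (term < 1e-17 * sum) break;  // remaining terms are negligible
  }
  return sum;
}

// Exponentially scaled variant: i0e(x) = exp(-|x|) * I0(x).
inline double i0e_sketch(double x) {
  return std::exp(-std::fabs(x)) * i0_series(x);
}

// Recover the unscaled function as stated above: i0(x) = exp(|x|) * i0e(x).
inline double i0_from_i0e(double x) {
  return std::exp(std::fabs(x)) * i0e_sketch(x);
}

// Usage check: the round trip reproduces the series value, and I0 is even.
inline void check_i0e() {
  assert(i0e_sketch(0.0) == 1.0);
  assert(std::fabs(i0_from_i0e(1.0) - i0_series(1.0)) < 1e-12);
  assert(i0e_sketch(-2.0) == i0e_sketch(2.0));
}
```

A production implementation would compute i0e directly, without forming I0 first, since I0 overflows for large |x| while i0e stays bounded.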
|
silently reinterpreted as int instead of being converted)
|
rather use __is_enum(T) for old MSVC versions
|
1) Q is always square
2) Q*R*P' is valid and recovers the original matrix
This implies that Q is square, with size equal to the number of rows of the original matrix,
and that R has the same size as the original matrix.
|
Jeff Trull in PR-386.
|
The workaround is to wrap NEON packet types to make them different C++ types.
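A minimal sketch of the wrapping idea follows. It assumes the underlying problem is that several vector types are aliases of one type on some toolchain, which would make overloads on them collide; the log does not state this explicitly. The stand-in type `raw128`, the template `packet_wrapper`, and the function `kind` are all hypothetical names.

```cpp
#include <cassert>

// Stand-in for a toolchain where several vector types alias one type.
typedef long long raw128;

// Thin wrapper: same storage, but each Id yields a distinct C++ type.
template <typename T, int Id>
struct packet_wrapper {
  T v;
  packet_wrapper() : v() {}
  explicit packet_wrapper(const T& x) : v(x) {}
  operator T&() { return v; }              // usable where the raw type is expected
  operator const T&() const { return v; }
};

// Two packet types that would otherwise be the same typedef.
typedef packet_wrapper<raw128, 0> Packet4f;
typedef packet_wrapper<raw128, 1> Packet4i;

// Overloading now works because the wrapped types differ.
inline int kind(const Packet4f&) { return 0; }
inline int kind(const Packet4i&) { return 1; }

// Usage check: each wrapper selects its own overload and converts back.
inline void check_wrapper() {
  Packet4f f(raw128(7));
  Packet4i i(raw128(7));
  assert(kind(f) == 0);
  assert(kind(i) == 1);
  const raw128& raw = f;  // implicit conversion back to the raw type
  assert(raw == 7);
}
```

The conversion operators keep the wrapper transparent to code that expects the raw type, while template specializations and overloads can tell the packets apart.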
|
the fix in commit 12efc7d41b80259b996be5781bf596c249c90d3f)
|
.setReverseFlag()
|
when compiling a CUDA kernel. This fixes the compilation of TensorFlow 1.4 with clang 6.0 used as CUDA compiler with libc++.
This follows the previous change in https://bitbucket.org/eigen/eigen/commits/2a69290ddb165b7103c87ba8f5b257eca23f62aa , which mentions OSX (presumably because it uses libc++ too).
|
for complex numbers. Made the corresponding unit test actually test that. Also simplified the implementation of the QR decompositions.
|
boost::multiprecision)
|
d820ab9edc0b38af4cdb3d545714a0c9083e5a78
|
a few long-to-int conversion issues.
|
Add interface to umfpack_*l_* functions
|