author | Deven Desai <deven.desai.amd@gmail.com> | 2020-08-19 20:06:39 +0000 |
---|---|---|
committer | Deven Desai <deven.desai.amd@gmail.com> | 2020-08-20 00:29:57 +0000 |
commit | 603e213d13311af286c8c1abd4ea14a8bd3d204e (patch) | |
tree | fce713b0de190f4ee9d5be162a7efb83d0f8754c /test/gpu_common.h | |
parent | c060114a259af3460dc40b388df47c86944f2600 (diff) |
Fixing a CUDA / P100 regression introduced by PR 181
PR 181 ( https://gitlab.com/libeigen/eigen/-/merge_requests/181 ) adds the `__launch_bounds__(1024)` attribute to GPU kernels that did not have that attribute explicitly specified.
That PR seems to cause regressions on the CUDA platform. This PR/commit restricts the changes made in PR 181 so that they apply to HIP only.
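One way to make a launch-bounds attribute apply to HIP only is to hide it behind a macro that expands to `__launch_bounds__(1024)` under the HIP compiler and to nothing elsewhere. The sketch below illustrates that pattern. The `SKETCH_*` names, the `sketch_scale_kernel` function, and the use of `__HIPCC__` / `__CUDACC__` as detection macros are assumptions made for illustration; the definition of `EIGEN_HIP_LAUNCH_BOUNDS_1024` itself is not part of this diff and may differ.

```cpp
// Illustrative sketch only: all SKETCH_* names are hypothetical and are not
// taken from Eigen. The definition of EIGEN_HIP_LAUNCH_BOUNDS_1024 is not
// shown in this commit's diff.
#if defined(__HIPCC__)
  #include <hip/hip_runtime.h>
  // HIP compiler: keep the explicit bound of 1024 threads per block.
  #define SKETCH_HIP_LAUNCH_BOUNDS_1024 __launch_bounds__(1024)
#else
  // CUDA or a plain host compiler: the macro expands to nothing, so the
  // kernel is compiled without an explicit launch bound.
  #define SKETCH_HIP_LAUNCH_BOUNDS_1024
#endif

#if defined(__HIPCC__) || defined(__CUDACC__)
  #define SKETCH_GLOBAL __global__
#else
  // Host-only build: compile the "kernel" as an ordinary function.
  #define SKETCH_GLOBAL
#endif

// Reports whether the launch bound is active in this translation unit.
constexpr bool sketch_launch_bounds_active() {
#if defined(__HIPCC__)
  return true;
#else
  return false;
#endif
}

// Doubles each input element. The attribute macro sits in the same position
// as in the patched run_on_gpu_meta_kernel declaration.
template <typename T>
SKETCH_GLOBAL
SKETCH_HIP_LAUNCH_BOUNDS_1024
void sketch_scale_kernel(const T* in, T* out, int n)
{
#if defined(__HIPCC__) || defined(__CUDACC__)
  int i = threadIdx.x + blockIdx.x * blockDim.x;
  if (i < n) out[i] = in[i] * T(2);
#else
  // Host fallback so the sketch can be exercised without a GPU toolchain.
  for (int i = 0; i < n; ++i) out[i] = in[i] * T(2);
#endif
}
```

With this arrangement the kernel declaration stays identical on every platform, and only the macro definition decides whether the compiler sees the attribute.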
Diffstat (limited to 'test/gpu_common.h')
-rw-r--r-- | test/gpu_common.h | 2 |
1 files changed, 1 insertions, 1 deletions
diff --git a/test/gpu_common.h b/test/gpu_common.h
index 509be5942..049e7aade 100644
--- a/test/gpu_common.h
+++ b/test/gpu_common.h
@@ -29,7 +29,7 @@ void run_on_cpu(const Kernel& ker, int n, const Input& in, Output& out)
 template<typename Kernel, typename Input, typename Output>
 __global__
-__launch_bounds__(1024)
+EIGEN_HIP_LAUNCH_BOUNDS_1024
 void run_on_gpu_meta_kernel(const Kernel ker, int n, const Input* in, Output* out)
 {
   int i = threadIdx.x + blockIdx.x*blockDim.x;