author     Benoit Steiner <benoit.steiner.goog@gmail.com>  2016-05-16 08:55:21 -0700
committer  Benoit Steiner <benoit.steiner.goog@gmail.com>  2016-05-16 08:55:21 -0700
commit     83ef39e055dda0d6c1b0c84924a0733a68577aa3
tree       121c327b3fd3dc0bfa808fb161ad77023c5a53eb /unsupported/Eigen/CXX11/src/Tensor/TensorCostModel.h
parent     b789a26804e48c03e8cac3aae70d09557b6d8d8b
Turn on the cost model by default. This results in some significant speedups for smaller tensors. For example, below are the results for the various tensor reductions.

Before:
BM_colReduction_12T/10     1000000      1949     51.29 MFlops/s
BM_colReduction_12T/80      100000     15636    409.29 MFlops/s
BM_colReduction_12T/640      20000     95100   4307.01 MFlops/s
BM_colReduction_12T/4K         500   4573423   5466.36 MFlops/s
BM_colReduction_4T/10      1000000      1867     53.56 MFlops/s
BM_colReduction_4T/80       500000      5288   1210.11 MFlops/s
BM_colReduction_4T/640       10000    106924   3830.75 MFlops/s
BM_colReduction_4T/4K          500   9946374   2513.48 MFlops/s
BM_colReduction_8T/10      1000000      1912     52.30 MFlops/s
BM_colReduction_8T/80       200000      8354    766.09 MFlops/s
BM_colReduction_8T/640       20000     85063   4815.22 MFlops/s
BM_colReduction_8T/4K          500   5445216   4591.19 MFlops/s
BM_rowReduction_12T/10     1000000      2041     48.99 MFlops/s
BM_rowReduction_12T/80      100000     15426    414.87 MFlops/s
BM_rowReduction_12T/640      50000     39117  10470.98 MFlops/s
BM_rowReduction_12T/4K         500   3034298   8239.14 MFlops/s
BM_rowReduction_4T/10      1000000      1834     54.51 MFlops/s
BM_rowReduction_4T/80       500000      5406   1183.81 MFlops/s
BM_rowReduction_4T/640       50000     35017  11697.16 MFlops/s
BM_rowReduction_4T/4K          500   3428527   7291.76 MFlops/s
BM_rowReduction_8T/10      1000000      1925     51.95 MFlops/s
BM_rowReduction_8T/80       200000      8519    751.23 MFlops/s
BM_rowReduction_8T/640       50000     33441  12248.42 MFlops/s
BM_rowReduction_8T/4K         1000   2852841   8763.19 MFlops/s

After:
BM_colReduction_12T/10    50000000        59   1678.30 MFlops/s
BM_colReduction_12T/80     5000000       725   8822.71 MFlops/s
BM_colReduction_12T/640      20000     90882   4506.93 MFlops/s
BM_colReduction_12T/4K         500   4668855   5354.63 MFlops/s
BM_colReduction_4T/10     50000000        59   1687.37 MFlops/s
BM_colReduction_4T/80      5000000       737   8681.24 MFlops/s
BM_colReduction_4T/640       50000    108637   3770.34 MFlops/s
BM_colReduction_4T/4K          500   7912954   3159.38 MFlops/s
BM_colReduction_8T/10     50000000        60   1657.21 MFlops/s
BM_colReduction_8T/80      5000000       726   8812.48 MFlops/s
BM_colReduction_8T/640       20000     91451   4478.90 MFlops/s
BM_colReduction_8T/4K          500   5441692   4594.16 MFlops/s
BM_rowReduction_12T/10    20000000        93   1065.28 MFlops/s
BM_rowReduction_12T/80     2000000       950   6730.96 MFlops/s
BM_rowReduction_12T/640      50000     38196  10723.48 MFlops/s
BM_rowReduction_12T/4K         500   3019217   8280.29 MFlops/s
BM_rowReduction_4T/10     20000000        93   1064.30 MFlops/s
BM_rowReduction_4T/80      2000000       959   6667.71 MFlops/s
BM_rowReduction_4T/640       50000     37433  10941.96 MFlops/s
BM_rowReduction_4T/4K          500   3036476   8233.23 MFlops/s
BM_rowReduction_8T/10     20000000        93   1072.47 MFlops/s
BM_rowReduction_8T/80      2000000       959   6670.04 MFlops/s
BM_rowReduction_8T/640       50000     38069  10759.37 MFlops/s
BM_rowReduction_8T/4K         1000   2758988   9061.29 MFlops/s
Diffstat (limited to 'unsupported/Eigen/CXX11/src/Tensor/TensorCostModel.h')
-rw-r--r--  unsupported/Eigen/CXX11/src/Tensor/TensorCostModel.h  5
1 files changed, 2 insertions, 3 deletions
diff --git a/unsupported/Eigen/CXX11/src/Tensor/TensorCostModel.h b/unsupported/Eigen/CXX11/src/Tensor/TensorCostModel.h
index 4cb37a651..cb6fb4626 100644
--- a/unsupported/Eigen/CXX11/src/Tensor/TensorCostModel.h
+++ b/unsupported/Eigen/CXX11/src/Tensor/TensorCostModel.h
@@ -10,9 +10,8 @@
 #ifndef EIGEN_CXX11_TENSOR_TENSOR_COST_MODEL_H
 #define EIGEN_CXX11_TENSOR_TENSOR_COST_MODEL_H
-//#if !defined(EIGEN_USE_GPU)
-//#define EIGEN_USE_COST_MODEL
-//#endif
+// Turn on the cost model by default
+#define EIGEN_USE_COST_MODEL
 namespace Eigen {