diff options
author | Gael Guennebaud <g.gael@free.fr> | 2008-07-09 14:04:48 +0000 |
---|---|---|
committer | Gael Guennebaud <g.gael@free.fr> | 2008-07-09 14:04:48 +0000 |
commit | 28539e7597c643dbc2b8d4f49dd16bd86fb7251f (patch) | |
tree | 3bfb96ae898f0afc943a388edcc83c27d28d5b3a /bench/btl/README | |
parent | 5f55ab524cf0aad34dd6884bcae09eaa6c43c247 (diff) |
imported a reworked version of BTL (Benchmark for Templated Libraries).
the modifications to initial code follow:
* changed build system from plain makefiles to cmake
* added eigen2 (4 versions: vec/novec and fixed/dynamic), GMM++, MTL4 interfaces
* added "transposed matrix * vector" product action
* updated blitz interface to use condensed products instead of hand coded loops
* removed some deprecated interfaces
* changed default storage order to column major for all libraries
* new generic bench timer strategy which is supposed to be more accurate
* various code clean-up
Diffstat (limited to 'bench/btl/README')
-rw-r--r-- | bench/btl/README | 143 |
1 files changed, 143 insertions, 0 deletions
diff --git a/bench/btl/README b/bench/btl/README new file mode 100644 index 000000000..6d5712bb6 --- /dev/null +++ b/bench/btl/README @@ -0,0 +1,143 @@ +Bench Template Library + +**************************************** +Introduction : + +The aim of this project is to compare the performance +of available numerical libraries. The code is designed +as generic and modular as possible. Thus, adding new +numerical libraries or new numerical tests should +require minimal effort. + + +***************************************** + +Installation : + +BTL uses cmake / ctest: + +1 - create a build directory: + + $ mkdir build + $ cd build + +2 - configure: + + $ ccmake .. + +3 - run the bench using ctest: + + $ ctest -V + +You can also run a single bench, e.g.: ctest -V -R eigen + +4 : Analyze the result. different data files (.dat) are produced in each libs directories. + If gnuplot is available, choose a directory name in the data directory to store the results and type + cd data + mkdir my_directory + cp ../libs/*/*.dat my_directory + Build the data utilities in this (data) directory + make + Then you can look the raw data, + go_mean my_directory + or smooth the data first : + smooth_all.sh my_directory + go_mean my_directory_smooth + + +************************************************* + +Files and directories : + + generic_bench : all the bench sources common to all libraries + + actions : sources for different action wrappers (axpy, matrix-matrix product) to be tested. + + libs/* : bench sources specific to each tested libraries. + + machine_dep : directory used to store machine specific Makefile.in + + data : directory used to store gnuplot scripts and data analysis utilities + +************************************************** + +Principles : the code modularity is achieved by defining two concepts : + + ****** Action concept : This is a class defining which kind + of test must be performed (e.g. a matrix_vector_product). + An Action should define the following methods : + + *** Ctor using the size of the problem (matrix or vector size) as an argument + Action action(size); + *** initialize : this method initialize the calculation (e.g. initialize the matrices and vectors arguments) + action.initialize(); + *** calculate : this method actually launch the calculation to be benchmarked + action.calculate; + *** nb_op_base() : this method returns the complexity of the calculate method (allowing the mflops evaluation) + *** name() : this method returns the name of the action (std::string) + + ****** Interface concept : This is a class or namespace defining how to use a given library and + its specific containers (matrix and vector). Up to now an interface should following types + + *** real_type : kind of float to be used (float or double) + *** stl_vector : must correspond to std::vector<real_type> + *** stl_matrix : must correspond to std::vector<stl_vector> + *** gene_vector : the vector type for this interface --> e.g. (real_type *) for the C_interface + *** gene_matrix : the matrix type for this interface --> e.g. (gene_vector *) for the C_interface + + + the following common methods + + *** free_matrix(gene_matrix & A, int N) dealocation of a N sized gene_matrix A + *** free_vector(gene_vector & B) dealocation of a N sized gene_vector B + *** matrix_from_stl(gene_matrix & A, stl_matrix & A_stl) copy the content of an stl_matrix A_stl into a gene_matrix A. + The allocation of A is done in this function. + *** vector_to_stl(gene_vector & B, stl_vector & B_stl) copy the content of an stl_vector B_stl into a gene_vector B. + The allocation of B is done in this function. + *** matrix_to_stl(gene_matrix & A, stl_matrix & A_stl) copy the content of an gene_matrix A into an stl_matrix A_stl. + The size of A_STL must corresponds to the size of A. + *** vector_to_stl(gene_vector & A, stl_vector & A_stl) copy the content of an gene_vector A into an stl_vector A_stl. + The size of B_STL must corresponds to the size of B. + *** copy_matrix(gene_matrix & source, gene_matrix & cible, int N) : copy the content of source in cible. Both source + and cible must be sized NxN. + *** copy_vector(gene_vector & source, gene_vector & cible, int N) : copy the content of source in cible. Both source + and cible must be sized N. + + and the following method corresponding to the action one wants to be benchmarked : + + *** matrix_vector_product(const gene_matrix & A, const gene_vector & B, gene_vector & X, int N) + *** matrix_matrix_product(const gene_matrix & A, const gene_matrix & B, gene_matrix & X, int N) + *** ata_product(const gene_matrix & A, gene_matrix & X, int N) + *** aat_product(const gene_matrix & A, gene_matrix & X, int N) + *** axpy(real coef, const gene_vector & X, gene_vector & Y, int N) + + The bench algorithm (generic_bench/bench.hh) is templated with an action itself templated with + an interface. A typical main.cpp source stored in a given library directory libs/A_LIB + looks like : + + bench< AN_ACTION < AN_INTERFACE > >( 10 , 1000 , 50 ) ; + + this function will produce XY data file containing measured mflops as a function of the size for 50 + sizes between 10 and 10000. + + This algorithm can be adapted by providing a given Perf_Analyzer object which determines how the time + measurements must be done. For example, the X86_Perf_Analyzer use the asm rdtsc function and provides + a very fast and accurate (but less portable) timing method. The default is the Portable_Perf_Analyzer + so + + bench< AN_ACTION < AN_INTERFACE > >( 10 , 1000 , 50 ) ; + + is equivalent to + + bench< Portable_Perf_Analyzer,AN_ACTION < AN_INTERFACE > >( 10 , 1000 , 50 ) ; + + If your system supports it we suggest to use a mixed implementation (X86_perf_Analyzer+Portable_Perf_Analyzer). + replace + bench<Portable_Perf_Analyzer,Action>(size_min,size_max,nb_point); + with + bench<Mixed_Perf_Analyzer,Action>(size_min,size_max,nb_point); + in generic/bench.hh + +. + + + |