path: root/tensorflow/stream_executor/blas.cc
Commit message | Author | Date
* [StreamExecutor] Rename ::perftools::gputools -> ::stream_executor, part 1. (Justin Lebar, 2018-04-17)

  Step 1 of re-namespace'ing StreamExecutor into ::stream_executor. This moves everything inside of stream_executor/... and leaves a namespace alias into ::perftools::gputools. The next steps will clean up users to use the new namespace.

  This is mostly a mechanical change, but it also includes a bunch of non-mechanical changes that would ideally be split out into separate patches. Unfortunately they all need to be shoved in here, for various reasons:

  - Forward declarations need to be in the same namespace as the actual types, so we need to change all forward declarations of StreamExecutor types in this one patch.
  - Uses of these forward declarations need to be changed to the new namespace (or otherwise we need to add a namespace alias to the relevant header, but this is pretty ugly).
  - Various initialization code needs to live in StreamExecutor's "real" namespace, so all of this needs to be changed.

  PiperOrigin-RevId: 193256128
* [XLA] FP16 Dot support for the CPU and GPU backends. (Bixia Zheng, 2018-02-28)

  - Extend the stream interface ThenBlasGemmWithAlgorithm to support F16 matrix multiplication with computation type FP32.
  - Extend the stream executor interface DoBlasGemmWithAlgorithm to support F16 GEMM with computation type FP32.
  - Extend the CPU IR emitter to handle the F16 Dot instruction, and add an F16 matrix multiplication implementation to the CPU runtime.
  - Extend the GPU backend to handle the FP16 GEMM Thunk.
  - Replicate the existing matrix multiplication test cases in matrix_ops_simple_test and dot_operation_test for FP16.

  RELNOTES:
  PiperOrigin-RevId: 187369731
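  The reason F16 GEMM pairs half-precision storage with an FP32 computation type can be sketched without any of the StreamExecutor or BLAS APIs above. The snippet below is a hypothetical standalone demo: the RoundToF16 helper only emulates fp16's 10-bit significand by rounding off the low fp32 mantissa bits (it ignores fp16's narrower exponent range), and the accumulation loop stands in for a GEMM inner product.

  ```cpp
  #include <cstdint>
  #include <cstdio>
  #include <cstring>

  // Emulate rounding an fp32 value to fp16 precision by keeping only the
  // top 10 mantissa bits (round-to-nearest). This models fp16's significand
  // width only, which is all this demo needs.
  static float RoundToF16(float x) {
    uint32_t u;
    std::memcpy(&u, &x, sizeof(u));
    u = (u + 0x00000FFFu + ((u >> 13) & 1u)) & 0xFFFFE000u;
    float y;
    std::memcpy(&y, &u, sizeof(y));
    return y;
  }

  int main() {
    const int kN = 4096;
    // Accumulate 4096 ones, as a GEMM inner product of length 4096 would.
    // With computation type FP32 the running sum stays exact; with fp16
    // accumulation it saturates at 2048, because 2048 + 1 rounds back to
    // 2048 in a 10-bit-significand format.
    float acc32 = 0.0f;
    float acc16 = 0.0f;
    for (int i = 0; i < kN; ++i) {
      acc32 += 1.0f;
      acc16 = RoundToF16(acc16 + 1.0f);
    }
    std::printf("fp32 accumulation: %.0f\n", acc32);
    std::printf("fp16 accumulation: %.0f\n", acc16);
    return 0;
  }
  ```

  The gap between the two sums (4096 vs. 2048) is why keeping F16 inputs but accumulating in FP32 preserves accuracy for long dot products.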
* Let GetBlasGemmAlgorithms() always return true. (Yangzihao Wang, 2017-07-21)

  PiperOrigin-RevId: 162748507
* Automated g4 rollback of changelist 162423171. (A. Unique TensorFlower, 2017-07-18)

  PiperOrigin-RevId: 162437318
* Add autotuning code for the matmul operator. (Yangzihao Wang, 2017-07-18)

  Currently it is turned off by default.

  PiperOrigin-RevId: 162423171
* [XLA] [StreamExecutor] Tune GEMMs when possible. (Justin Lebar, 2017-03-02)

  cuBLAS 8 adds the cublasGemmEx function, which lets you specify an explicit "algorithm" for the computation. This functions as an opaque tuning hint to cuBLAS.

  This patch adds support for cublasGemmEx to StreamExecutor and wires up XLA's GemmThunk to use the new function.

  This patch does not add GEMM autotuning support to TensorFlow proper, only to XLA.

  Change: 149068961
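  The tuning pattern this patch enables can be sketched without a GPU: treat each candidate kernel as an opaque algorithm, run every candidate on the target problem shape, check that the outputs agree, and keep the fastest. The loop-order variants below are hypothetical stand-ins for cuBLAS's cublasGemmAlgo_t IDs; none of the real StreamExecutor or cuBLAS APIs appear here.

  ```cpp
  #include <algorithm>
  #include <chrono>
  #include <cmath>
  #include <cstdio>
  #include <utility>
  #include <vector>

  using Mat = std::vector<float>;
  using Algo = void (*)(const Mat&, const Mat&, Mat&, int);

  // Candidate "algorithm" 1: naive row-major C = A*B, i-j-k loop order.
  static void GemmIjk(const Mat& A, const Mat& B, Mat& C, int n) {
    for (int i = 0; i < n; ++i)
      for (int j = 0; j < n; ++j) {
        float acc = 0.f;
        for (int k = 0; k < n; ++k) acc += A[i * n + k] * B[k * n + j];
        C[i * n + j] = acc;
      }
  }

  // Candidate "algorithm" 2: same product, i-k-j loop order (streams
  // through B and C row-wise, which is usually friendlier to caches).
  static void GemmIkj(const Mat& A, const Mat& B, Mat& C, int n) {
    std::fill(C.begin(), C.end(), 0.f);
    for (int i = 0; i < n; ++i)
      for (int k = 0; k < n; ++k) {
        const float a = A[i * n + k];
        for (int j = 0; j < n; ++j) C[i * n + j] += a * B[k * n + j];
      }
  }

  int main() {
    const int n = 128;
    Mat A(n * n), B(n * n), C(n * n), ref(n * n);
    for (int i = 0; i < n * n; ++i) {
      A[i] = static_cast<float>(i % 7) * 0.5f;
      B[i] = static_cast<float>(i % 5) * 0.25f;
    }
    std::vector<std::pair<const char*, Algo>> algos = {
        {"ijk", GemmIjk}, {"ikj", GemmIkj}};

    // Autotune: time every candidate on this shape, verify it matches the
    // reference result, and remember the fastest one.
    GemmIjk(A, B, ref, n);
    const char* best = nullptr;
    double best_ms = 1e30;
    bool all_match = true;
    for (const auto& [name, algo] : algos) {
      const auto t0 = std::chrono::steady_clock::now();
      algo(A, B, C, n);
      const double ms = std::chrono::duration<double, std::milli>(
                            std::chrono::steady_clock::now() - t0).count();
      for (int i = 0; i < n * n; ++i)
        if (std::fabs(C[i] - ref[i]) > 1e-3f) { all_match = false; break; }
      if (ms < best_ms) { best_ms = ms; best = name; }
    }
    std::printf("results match: %s\n", all_match ? "yes" : "no");
    std::printf("picked an algorithm: %s\n", best ? "yes" : "no");
    return 0;
  }
  ```

  A real tuner would cache the winning algorithm per (shape, dtype) key so the timing loop runs only once, which is essentially what autotuning over cublasGemmEx's algorithm IDs amounts to.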
* Update copyright for 3p/tf. (A. Unique TensorFlower, 2016-06-02)

  Change: 123901292
* TensorFlow: Improve performance of Alexnet. (Manjunath Kudlur, 2015-11-20)

  Changes:
  - error message that refers to removed `DefaultSession` method.
  - -Wnull-conversion warnings.
  - the "_start_time" attr for recvs when the flag "--brain_enable_scheduling_for_recvs" is set.
  - typo in tutorial data download progress message.
  - a typo ("however their installing" => "however installing").
  - typo; rename "TensorFlow Mechanics" to "How To" to be consistent with the website.
  - a typo ("subtact" => "subtract").
  - protobuf examples in comments in tensorflow::Example.proto.
  - formula formatting in MNIST beginner tutorial.
  - negative fraction-of-queue-full stats.
  - protobuf inclusion path so that the Android demo will build under Blaze.
  - small typo ("moderatly" => "moderately").
  - Session.run() to check that tensor arguments come from the session's graph.
  - another six import.
  - seq2seq typo in bazel command.

  Base CL: 108349164
* TensorFlow: Initial commit of TensorFlow library. (Manjunath Kudlur, 2015-11-06)

  TensorFlow is an open source software library for numerical computation using data flow graphs.

  Base CL: 107276108