# Performance
Performance is often a significant issue when training a machine learning
model. This section explains various ways to optimize performance. Start
your investigation with the following guide:
* @{$performance_guide$Performance}, which contains a collection of best
practices for optimizing your TensorFlow code.
XLA (Accelerated Linear Algebra) is an experimental compiler for linear
algebra that optimizes TensorFlow computations. The following guides explore
XLA:
* @{$xla$XLA Overview}, which introduces XLA.
* @{$broadcasting$Broadcasting Semantics}, which describes XLA's
broadcasting semantics.
* @{$developing_new_backend$Developing a new back end for XLA}, which
explains how to re-target TensorFlow in order to optimize the performance
of the computational graph for particular hardware.
* @{$jit$Using JIT Compilation}, which describes the XLA JIT compiler that
compiles and runs parts of TensorFlow graphs via XLA in order to optimize
performance.
* @{$operation_semantics$Operation Semantics}, which is a reference manual
describing the semantics of operations in the `ComputationBuilder`
interface.
* @{$shapes$Shapes and Layout}, which details the `Shape` protocol buffer.
* @{$tfcompile$Using AOT compilation}, which explains `tfcompile`, a
standalone tool that compiles TensorFlow graphs into executable code in
order to optimize performance.
Finally, the following guide covers quantization:
* @{$quantization$How to Quantize Neural Networks with TensorFlow}, which
explains how to use quantization to reduce model size, both in storage
and at runtime. Quantization can improve performance, especially on
mobile hardware.
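
As a rough illustration of why quantization reduces model size, the sketch below maps floating-point weights to 8-bit codes using a simple min/max linear scheme. It is a toy example in plain Python and not the TensorFlow quantization tooling that the guide describes. The helper names `quantize` and `dequantize` and the sample weights are hypothetical.

```python
import struct


def quantize(values):
    """Map floats to 8-bit codes (0..255) over the observed min/max range.

    Hypothetical helper for illustration only.
    """
    lo, hi = min(values), max(values)
    # Guard against a zero-width range when all values are identical.
    scale = (hi - lo) / 255 if hi > lo else 1.0
    codes = bytes(round((v - lo) / scale) for v in values)
    return codes, lo, scale


def dequantize(codes, lo, scale):
    """Recover approximate floats from the 8-bit codes."""
    return [lo + c * scale for c in codes]


# Made-up sample weights, not taken from any real model.
weights = [-1.5, -0.25, 0.0, 0.75, 2.0]
codes, lo, scale = quantize(weights)
restored = dequantize(codes, lo, scale)

# Storage as 32-bit floats versus storage as 8-bit codes.
float_bytes = len(struct.pack(f"{len(weights)}f", *weights))
quant_bytes = len(codes)

# Rounding to the nearest code bounds the error by half a quantization step.
max_error = max(abs(w - r) for w, r in zip(weights, restored))
```

In this sketch each weight takes one byte instead of four, at the cost of a reconstruction error of at most half a quantization step (`scale / 2`). The two range parameters `lo` and `scale` must be stored alongside the codes.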