Commit message | Author | Age
Move the GPU-neutral code to common_runtime.
Change: 117591254
Change: 117590857
Change: 117590840
third_party/eigen3 copy
to being part of TF, add tests."
Change: 117587217
This is a generalization of `tf.{range,slice,string}_input_producer()`
that supports arbitrary types and shapes of input. Fixes #486.
Change: 117583214
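The generalized producer described above cycles arbitrary input elements through a queue that consumers dequeue from. As a language-agnostic sketch of that idea (plain Python with a background thread, not the TensorFlow queue-runner API; the function name `input_producer` and its `epochs` parameter are illustrative only):

```python
import queue
import threading

def input_producer(items, epochs=2):
    """Toy sketch of an input producer: a background thread enqueues the
    input elements repeatedly for `epochs` passes, then a None sentinel."""
    q = queue.Queue(maxsize=8)

    def fill():
        for _ in range(epochs):
            for item in items:
                q.put(item)
        q.put(None)  # sentinel: no more epochs

    threading.Thread(target=fill, daemon=True).start()
    return q

# Any element type and shape works; consumers just dequeue until the sentinel.
q = input_producer([[1.0, 2.0], [3.0, 4.0]], epochs=2)
out = []
while True:
    item = q.get()
    if item is None:
        break
    out.append(item)
print(out)
```

The bounded queue gives the same backpressure property as the real producers: the filler thread blocks once consumers fall behind, instead of buffering the whole input.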
This test is flaky with any reasonable finite timeout, because it
relies on a separate thread to take action. In the unlikely
event that the underlying mechanism breaks without breaking
any of the remaining tests in this file, TF regression tests
involving GPUs should break.
Change: 117571507
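The flakiness described above (a fixed sleep racing a worker thread) is commonly avoided by blocking on an explicit signal instead of sleeping for a guessed interval. A minimal sketch in plain Python, illustrative only and not the test in question:

```python
import threading

done = threading.Event()

def background_action():
    # ... perform the work the test is waiting for ...
    done.set()  # signal completion explicitly

threading.Thread(target=background_action).start()

# Block until the other thread signals, rather than sleeping for a
# fixed interval and hoping the thread has already run. The timeout
# here is only a generous upper bound to avoid hanging CI forever.
ok = done.wait(timeout=30)
print(ok)
```

Because the waiter reacts as soon as the event fires, the test is fast when the thread is fast and only slow when something is genuinely broken.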
Change: 117570343
Change: 117570066
Change: 117564348
Change: 117559940
Change: 117559932
Change: 117557479
runner.
Outputs proto strings in a way similar to reporter.cc in tensorflow/core/util/.
Change: 117556944
independently.
Change: 117555238
Benchmark Time(ns) CPU(ns) Iterations
BM_ConvFloatDepthwiseFwdGPU_conv0 4800416 4937895 141 32.7G items/s 32_112_112_3_8_24_3_3_1_2_gpu
BM_ConvFloatDepthwiseFwdGPU_conv1 13550072 13922813 100 30.9G items/s 32_112_112_64_1_64_3_3_1_2_gpu
BM_ConvFloatDepthwiseFwdGPU_conv2 7032385 7324553 100 29.4G items/s 32_56_56_128_1_128_3_3_1_2_gpu
BM_ConvFloatDepthwiseFwdGPU_conv3 2285033 2425335 228 22.2G items/s 32_56_56_128_1_128_3_3_2_2_gpu
BM_ConvFloatDepthwiseFwdGPU_conv4 1743948 1858093 359 29.0G items/s 32_28_28_128_1_128_3_3_1_2_gpu
BM_ConvFloatDepthwiseFwdGPU_conv5 1784560 1897147 320 28.4G items/s 32_14_14_512_1_512_3_3_1_2_gpu
BM_ConvFloatDepthwiseFwdGPU_conv6 971179 1044185 562 25.8G items/s 32_7_7_1024_1_1024_3_3_1_2_gpu
Change: 117553964
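The Time(ns) and items/s columns above are related by a simple conversion: throughput is the per-iteration item count divided by the per-iteration wall time. A small helper makes the units explicit (a sketch; the item count per iteration is whatever each benchmark defines, and the numbers below are hypothetical, not taken from the table):

```python
def items_per_sec(time_ns, num_items):
    """Convert per-iteration wall time in nanoseconds and the number of
    items processed per iteration into throughput in items/second."""
    return num_items / (time_ns * 1e-9)

# Hypothetical example: 1,000,000 items in 500,000 ns -> 2e9 items/s.
throughput = items_per_sec(500_000, 1_000_000)
print(throughput)
```

Dividing such a figure by 1e9 gives the "G items/s" convention used in the benchmark output.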
`graph` argument.
Keep the now deprecated `graph_def` argument for backward compatibility.
This allows us to add information to the graph (such as tensor shapes and types) before serializing it to the events file, which results in the user automatically getting that information in the graph visualizer.
Change: 117546499
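Keeping the deprecated `graph_def` argument alongside the new `graph` argument follows a common Python backward-compatibility pattern. A hedged sketch of that pattern, not the actual TensorFlow implementation (the function name and return value here are illustrative only):

```python
import warnings

def summary_writer(logdir, graph=None, graph_def=None):
    """Sketch: accept the new `graph` argument, keep the deprecated
    `graph_def` argument working, and prefer `graph` when both are given."""
    if graph_def is not None:
        warnings.warn("`graph_def` is deprecated; pass `graph` instead.",
                      DeprecationWarning)
        if graph is None:
            graph = graph_def  # fall back to the legacy argument
    return logdir, graph

# Old call sites keep working, but now emit a DeprecationWarning.
result = summary_writer("/tmp/logs", graph_def="legacy-graph")
print(result)
```

New call sites pass `graph` directly and see no warning, so migration can happen gradually.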
Change: 117545997
See README.md for detailed descriptions of the usage of the tools and tests in this changeset.
Three modes of testing are supported:
1) Launch a local Kubernetes (k8s) cluster and run the test suites on it
(See local_test.sh)
2) Launch a remote k8s cluster on Google Container Engine (GKE) and run the test suite on it
(See remote_test.sh)
3) Run the test suite on an existing k8s TensorFlow cluster
(Also see remote_test.sh)
Taking the remote test as an example, the following steps are performed:
1) Builds a Docker image with gcloud and Kubernetes tools, and the latest TensorFlow pip package installed (see Dockerfile)
2) Launches a Docker container based on that image (see test_distributed.sh)
3) From within the container, authenticates the gcloud user (with credential files mapped from outside the container), configures the k8s cluster, and launches a new k8s container cluster for TensorFlow workers
4) Generates a k8s (yaml) config file and uses this yaml file to create a TensorFlow worker cluster consisting of a certain number of parameter servers (ps) and workers. The workers are exposed as external services with public IPs (see dist_test.sh)
5) Runs a simple softmax MNIST model on multiple workers, with the model weights and biases located on the ps nodes. Trains the models in parallel and observes the final validation cross entropy (see dist_mnist_test.sh)
Change: 117543657
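The yaml generation in step 4 amounts to templating one spec per ps and worker replica. A toy sketch of that idea in Python; every detail below (pod names, image, the choice of plain Pod objects) is illustrative and not the actual output of dist_test.sh:

```python
def tf_cluster_yaml(num_ps, num_workers, image="tensorflow/tf_grpc_test"):
    """Emit a toy k8s-style multi-document spec with one pod per
    parameter server (ps) and one pod per worker."""
    docs = []
    for job, count in (("ps", num_ps), ("worker", num_workers)):
        for i in range(count):
            docs.append(
                "apiVersion: v1\n"
                "kind: Pod\n"
                "metadata:\n"
                f"  name: tf-{job}-{i}\n"
                "spec:\n"
                "  containers:\n"
                f"  - image: {image}\n"
                f"    name: tf-{job}\n")
    return "---\n".join(docs)

spec = tf_cluster_yaml(num_ps=2, num_workers=3)
print(spec.count("kind: Pod"))
```

Feeding such a file to `kubectl create -f` is what turns the generated spec into a running cluster; the real script additionally exposes the workers as external services.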
Change: 117523151
Change: 117520810
copy
to being part of TF, add tests."
Change: 117519243
Change: 117518926
separate struct that is shared by all of the image resizers.
Normalizes the error checking across all of the resizers.
Also added a max size check to nearest_neighbor: because of
float32 precision, it starts to produce bad results beyond 2^24 px
in either dimension. Not that anyone does that, but it's good
to be precise about it.
Change: 117516271
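The 2^24 limit above comes from float32 having a 24-bit significand: integers larger than 2^24 can no longer all be represented exactly, so pixel coordinates stored as floats start to collide. This is easy to demonstrate with the standard library by round-tripping values through a 32-bit float:

```python
import struct

def to_f32(x):
    """Round-trip a value through an IEEE 754 binary32 float to see
    exactly what a float32 would store."""
    return struct.unpack("f", struct.pack("f", float(x)))[0]

print(to_f32(2**24) == 2**24)      # 16777216 is exactly representable
print(to_f32(2**24 + 1) == 2**24)  # 16777217 rounds back down to 16777216
```

So at image extents past 2^24, two adjacent integer coordinates map to the same float32 value, which is why the resizer rejects such sizes rather than silently producing bad output.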
Android demo, for performance analysis.
Enable by hardcoding kSaveStepStats to true or passing "--copt -DSAVE_STEP_STATS" to bazel build.
Change: 117512949
to being part of TF, add tests.
Change: 117509710
Change: 117506296
Change: 117505457
embedded components.
Change: 117504934
Change: 117504830
made to aid debugging. This instantly answers the question: did I specify it wrong or is the device not found?
Change: 117493711
Also fixed compilation issues with CUDA devices that support compute capability 5.3.
Change: 117493644
Change: 117493386
added some code cleanup).
Change: 117488572
not rely on cell.input_size.
Change: 117484994
snag.
Change: 117484454
real-time, changing data.
Change: 117483893
Change: 117483092
Change: 117482953
Change: 117475266
Change: 117471008
Change: 117456435
registered by experimental devices.
Right now tensorflow/core/kernels explicitly depends on all Eigen devices that might want to implement any of the templated Eigen Ops. This is because the template classes that need to be specialized are defined in .cc files, so the specializations themselves have to appear there too. Moving the classes to .h files allows us to use arbitrary Eigen devices defined outside of tensorflow/core, which fits better with the intent behind core/kernels.
Over time more kernels may need to be refactored this way for the same reason.
Change: 117452814
Change: 117444098
Change: 117420208
minimize method.
Change: 117401811
Change: 117394750
Change: 117392494
Change: 117387358
Change: 117384840
Change: 117384554