path: root/tensorflow/contrib/bigtable
* [tf.data] Deprecate `tf.contrib.data` and introduce `tf.data.experimental` to replace it. (Derek Murray, 2018-10-01)

  This change prepares `tf.data` for TensorFlow 2.0, where `tf.contrib` will no longer exist. It retains the pre-existing endpoints in `tf.contrib.data` with deprecation warnings. Note there are some exceptions to the move:

  * Deprecated symbols in `tf.contrib.data` have not been moved to `tf.data.experimental`, because replacements already exist.
  * `tf.contrib.data.LMDBDataset` has not been moved, because we plan to move it to a SIG-maintained repository.
  * `tf.contrib.data.assert_element_shape()` has not yet been moved, because it depends on functionality in `tf.contrib`, and it will move in a later change.
  * `tf.contrib.data.AUTOTUNE` has not yet been moved, because we have not yet determined how to `tf_export()` a Python integer.
  * The stats-related API endpoints have not yet appeared in a released version of TensorFlow, so these are moved to `tf.data.experimental` without retaining an endpoint in `tf.contrib.data`.

  In addition, this change includes some build rule and ApiDef refactoring:

  * Some of the "//third_party/tensorflow/python:training" dependencies had to be split in order to avoid a circular dependency.
  * The `tf.contrib.stateless` ops now have a private core library for the generated wrappers (and accordingly are hidden in their ApiDef) so that `tf.data.experimental.sample_from_datasets()` can depend on them.

  PiperOrigin-RevId: 215304249
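  The retain-the-old-endpoint-with-a-warning pattern described above can be sketched in plain Python. Everything here is illustrative: `deprecated_alias` and `ignore_errors` are hypothetical stand-ins, not TensorFlow's actual shim machinery.

  ```python
  import warnings

  def deprecated_alias(new_fn, old_name, new_name):
      """Return a wrapper that forwards old_name calls to new_fn with a warning."""
      def wrapper(*args, **kwargs):
          warnings.warn(
              f"{old_name} is deprecated; use {new_name} instead.",
              DeprecationWarning, stacklevel=2)
          return new_fn(*args, **kwargs)
      return wrapper

  # Hypothetical new endpoint standing in for a tf.data.experimental symbol.
  def ignore_errors():
      return "ignore_errors transformation"

  # The retained tf.contrib.data endpoint just forwards with a warning.
  contrib_ignore_errors = deprecated_alias(
      ignore_errors, "tf.contrib.data.ignore_errors",
      "tf.data.experimental.ignore_errors")
  ```

  Callers of the old name keep working, but see a `DeprecationWarning` pointing at the new location.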
* Fix leaks of a BigtableTableResource in various Bigtable ops. (Derek Murray, 2018-09-26)

  PiperOrigin-RevId: 214653279
* [tf.data] Adding a private method for (recursively) tracking dataset inputs. (Jiri Simsa, 2018-09-25)

  PiperOrigin-RevId: 214495925
* Move from deprecated self.test_session() to self.cached_session(). (A. Unique TensorFlower, 2018-09-21)

  self.test_session() was deprecated in 9962eb5e84b15e309410071b06c2ed2d6148ed44 because its name confuses readers of the test. Moving to cached_session() instead, which is more explicit about:

  * the fact that the session may be reused.
  * the fact that the session is not closed even when using a "with self.test_session()" statement.

  PiperOrigin-RevId: 213944932
* [tf.data] Move all C++ code inside the `tensorflow::data` namespace. (Derek Murray, 2018-09-05)

  PiperOrigin-RevId: 211733735
* Provide an alternative method to find the gRPC `roots.pem` file, using an environment variable, to avoid having to copy a particular file to `/usr/share`. (Misha Brukman, 2018-08-28)

  PiperOrigin-RevId: 210550389
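  The lookup order this commit describes (environment-variable override first, then the copied default path) might look roughly like the sketch below. `GRPC_DEFAULT_SSL_ROOTS_FILE_PATH` is gRPC's standard override variable, but the exact variable and default path used here are assumptions, not quotes from the change.

  ```python
  import os

  # Illustrative default; the path gRPC actually falls back to may differ.
  DEFAULT_ROOTS_PEM = "/usr/share/grpc/roots.pem"

  def find_roots_pem(env=os.environ):
      """Prefer an environment-variable override, else the default path."""
      override = env.get("GRPC_DEFAULT_SSL_ROOTS_FILE_PATH")
      if override:
          return override
      return DEFAULT_ROOTS_PEM
  ```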
* [tf.data] Internal refactoring of C++ classes and APIs. (Jiri Simsa, 2018-08-13)

  * Replacing `OpKernelContext` with the newly introduced `DatasetContext` in the `DatasetBase` constructor, to make it possible to instantiate `DatasetBase` in places where an instance of `OpKernelContext` is not available.
  * Replacing the `dataset::MakeIteratorContext(OpKernelContext* ctx)` factory with an `IteratorContext(OpKernelContext* ctx)` constructor.
  * Folding `GraphDatasetBase` into `DatasetBase` and removing the default implementation of `AsGraphDefInternal`, making it the responsibility of the derived class to implement it, to encourage/hint developers to provide serialization logic.

  PiperOrigin-RevId: 208560010
* Use the full product name: Google Cloud Bigtable (Misha Brukman, 2018-08-11)

  PiperOrigin-RevId: 208352025
* Permit TensorFlow server to access Cloud Bigtable. (A. Unique TensorFlower, 2018-08-10)

  PiperOrigin-RevId: 208263100
* Automated rollback of commit 79387568f2860dd25f411a2e3ba764dabb76286d (A. Unique TensorFlower, 2018-08-10)

  PiperOrigin-RevId: 208219138
* Permit TensorFlow server to access Cloud Bigtable. (A. Unique TensorFlower, 2018-08-09)

  PiperOrigin-RevId: 208146858
* Remove usage of magic-api-link syntax from source files. (Mark Daoust, 2018-08-09)

  Back-ticks are now converted to links in the api_docs generator. With the new docs repo we're moving to simplify the docs pipeline and make everything more readable. By doing this we no longer get test failures for symbols that don't exist (`tf.does_not_exist` will not get a link). There is also no longer a way to set custom link text. That's okay.

  This is the result of the following regex replacement (plus a couple of manual edits):

      re:  @\{([^$].*?)(\$.+?)?}
      sub: `\1`

  Which does the following replacements:

      "@{tf.symbol}"           --> "`tf.symbol`"
      "@{tf.symbol$link_text}" --> "`tf.symbol`"

  PiperOrigin-RevId: 208042358
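  The regex replacement quoted in the commit message above can be reproduced directly with Python's `re` module:

  ```python
  import re

  # The pattern and substitution from the commit message: group 1 is the
  # symbol, the optional group 2 captures (and discards) "$link text".
  pattern = re.compile(r"@\{([^$].*?)(\$.+?)?}")

  def strip_api_links(text):
      """Rewrite "@{tf.symbol}" and "@{tf.symbol$link text}" to "`tf.symbol`"."""
      return pattern.sub(r"`\1`", text)
  ```

  For example, `strip_api_links("See @{tf.data.Dataset} for details")` yields ``See `tf.data.Dataset` for details``.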
* Merge pull request #21086 from taehoonlee:fix_typos (TensorFlower Gardener, 2018-08-08)

  PiperOrigin-RevId: 207988541
* [tf.data / Bigtable] Update docs and method docstrings (Misha Brukman, 2018-08-02)

  * Add link to updating scope on a running VM
  * Add code formatting and Python syntax highlighting
  * Clarify kwargs argument formatting
  * Fix method name in docstring

  PiperOrigin-RevId: 207204628
* Fix typos (Taehoon Lee, 2018-07-24)
* [tf.data / Bigtable] Renamed BigTable class to BigtableTable for clarity (Misha Brukman, 2018-07-22)

  This removes the confusion between the BigTable and Bigtable naming. Also cleaned up all other uses of BigTable in error messages.

  PiperOrigin-RevId: 205586899
* [tf.data / Bigtable] Document use of the Cloud Bigtable API (Brennan Saeta, 2018-07-21)

  PiperOrigin-RevId: 205530581
* [tf.data / Bigtable]: Set AlwaysRetryMutationPolicy (Brennan Saeta, 2018-07-12)

  Because the tf.data integration currently doesn't support setting client-side timestamps, using the AlwaysRetryMutationPolicy can make data loading more reliable. (This is safe-ish to do because, when reading, TF always reads Latest(1), so duplicate writes will not be visible to user programs.)

  PiperOrigin-RevId: 204331133
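  A minimal sketch of an always-retry mutation policy, assuming a generic `apply_fn` callable. The real policy lives in the Cloud Bigtable C++ client; this only illustrates the idea, which is safe here because readers request only the latest cell version, so a duplicated write is invisible to them.

  ```python
  import time

  def apply_mutation_with_retry(apply_fn, max_attempts=3, backoff_s=0.0):
      """Resend the mutation on any failure, up to max_attempts times."""
      last_error = None
      for attempt in range(max_attempts):
          try:
              return apply_fn()
          except Exception as e:  # treat every failure as retryable
              last_error = e
              time.sleep(backoff_s * (2 ** attempt))  # exponential backoff
      raise last_error
  ```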
* [tf.data / Bigtable] Use builder pattern for client options (A. Unique TensorFlower, 2018-07-11)

  PiperOrigin-RevId: 204140674
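  The builder pattern referred to above, sketched in Python with hypothetical option names (the real Cloud Bigtable C++ client-options API differs): each setter returns the builder, so options chain, and `build()` produces the finished options object.

  ```python
  class ClientOptionsBuilder:
      """Hypothetical builder; option names here are illustrative only."""

      def __init__(self):
          self._options = {}

      def set_connection_pool_size(self, n):
          self._options["connection_pool_size"] = n
          return self  # returning self enables chaining

      def set_max_receive_message_size(self, nbytes):
          self._options["max_receive_message_size"] = nbytes
          return self

      def build(self):
          return dict(self._options)  # immutable-by-copy result
  ```

  Usage: `ClientOptionsBuilder().set_connection_pool_size(100).build()`.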
* [tf.data / Bigtable] Parallel scan Bigtable tables (Brennan Saeta, 2018-07-10)

  In order to stream data from Cloud Bigtable at high speed, it's important to use multiple connections to stream simultaneously from multiple tablet servers. This change adds two new methods to the BigTable object to set up a dataset based on the SampleKeys method and tf.contrib.data.parallel_interleave. Because the keys returned from SampleKeys are not guaranteed to be deterministic (in fact, they can change over time without any new data being added to the table), the resulting datasets are not deterministic. (In order to further boost performance, we enable sloppy interleaving.)

  When comparing the table.scan_* methods with the table.parallel_scan_* methods on a test workload (based on ImageNet), we see performance gains of over 15x, and over 10x compared to a reasonably tuned GCS input pipeline.

  PiperOrigin-RevId: 203945580
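  The fan-out idea (scan several key ranges on separate connections and merge results as they complete) can be sketched with plain threads. This is a stand-in for `parallel_interleave` with sloppy ordering, not the actual implementation; `scan_range_fn` is a hypothetical per-range scanner.

  ```python
  from concurrent.futures import ThreadPoolExecutor, as_completed

  def parallel_scan(ranges, scan_range_fn, workers=4):
      """Scan each key range on its own worker; merge in completion order.

      Completion order varies run to run, which mirrors the nondeterminism
      of sloppy interleaving described above.
      """
      rows = []
      with ThreadPoolExecutor(max_workers=workers) as pool:
          futures = [pool.submit(scan_range_fn, r) for r in ranges]
          for fut in as_completed(futures):
              rows.extend(fut.result())
      return rows
  ```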
* [tf.data / Bigtable] Add max_receive_message_size connection parameter. (Brennan Saeta, 2018-07-07)

  When storing images in Cloud Bigtable, the resulting gRPC messages are often larger than the default maximum receive message size. This change makes the maximum receive message size configurable and sets a more reasonable default for general TensorFlow use.

  PiperOrigin-RevId: 203569796
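  gRPC exposes this limit as a channel argument (a list of name/value pairs). A small sketch of wiring it up; the default value below is illustrative, not the one this commit chose.

  ```python
  # 256 MiB: an assumed example default, large enough for image payloads.
  DEFAULT_MAX_RECEIVE_BYTES = 256 * 1024 * 1024

  def channel_args(max_receive_bytes=DEFAULT_MAX_RECEIVE_BYTES):
      """Build gRPC channel arguments with a configurable receive limit.

      "grpc.max_receive_message_length" is the standard gRPC option name.
      """
      return [("grpc.max_receive_message_length", max_receive_bytes)]
  ```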
* [tf.data / Bigtable] Clean up BUILD files (Brennan Saeta, 2018-07-05)

  PiperOrigin-RevId: 203405436
* [tf.data / Bigtable] Support sampling row keys. (Brennan Saeta, 2018-07-05)

  When writing high-performance input pipelines, you often need to read from multiple servers in parallel. Being able to sample row keys from a table allows one to easily construct high-performance parallel input pipelines from Cloud Bigtable.

  PiperOrigin-RevId: 203389017
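  One way sampled row keys enable parallel pipelines is by cutting the table into contiguous scan ranges, one per sampled boundary. A sketch, assuming (for illustration only) that an empty string denotes the unbounded start or end of the table:

  ```python
  def ranges_from_sampled_keys(sampled_keys):
      """Turn sampled row keys into contiguous [start, end) scan ranges."""
      boundaries = [""] + list(sampled_keys) + [""]
      return [(boundaries[i], boundaries[i + 1])
              for i in range(len(boundaries) - 1)]
  ```

  Each resulting range can then be scanned by an independent worker, as in the parallel-scan change above.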
* [tf.data / Bigtable]: Augment test client's SampleKeys() implementation. (Brennan Saeta, 2018-07-05)

  In order to test parallel scanning strategies, we rely on SampleKeys returning a series of values. This change augments the test client to return every other row, to facilitate easy testing of the parallel scan code.

  PiperOrigin-RevId: 203383247
* [tf.data / Bigtable] Refactor BUILD file (Brennan Saeta, 2018-07-04)

  PiperOrigin-RevId: 203319567
* [tf.data / Bigtable] Log mutation_status errors. (Brennan Saeta, 2018-07-03)

  Some errors returned from the Bigtable client are not proper UTF-8 strings. As a result, when we construct the errors::Unknown error to return to the client, the resulting error message gets suppressed. In order to ensure that some useful context is always available to users, we log the error message in addition to returning it.

  PiperOrigin-RevId: 203203543
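  The underlying hazard (an invalid byte suppressing a whole error message) can be avoided by decoding with replacement. A sketch in Python rather than the C++ of the actual fix:

  ```python
  def safe_error_message(raw):
      """Decode possibly invalid UTF-8 error bytes without losing the message.

      Invalid bytes become U+FFFD replacement characters instead of making
      the whole string undecodable.
      """
      if isinstance(raw, bytes):
          return raw.decode("utf-8", errors="replace")
      return raw
  ```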
* [tf.data / Bigtable] Fix server-set timestamp mutations. (Brennan Saeta, 2018-07-03)

  Use SetCell with no timestamp argument when using server-set timestamps, instead of using std::chrono::milliseconds timestamp(-1), because the latter results in a timestamp_micros value of -1000 instead of -1, which causes the server to become unhappy.

  PiperOrigin-RevId: 203165785
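  The unit-conversion pitfall behind this fix, shown as plain arithmetic (an illustration, not the client code): the -1 "server-set" sentinel only survives if no unit conversion touches it.

  ```python
  def millis_to_micros(ms):
      """Convert milliseconds to microseconds, as a duration type would."""
      return ms * 1000

  # Passing -1 ms through the conversion yields timestamp_micros == -1000,
  # not the -1 sentinel the server interprets as "set the timestamp for me".
  ```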
* [Bigtable] Allow the connection_pool_size to be configurable (Brennan Saeta, 2018-07-03)

  If left unset, the default should be far larger than the default of 4 that the client library uses.

  PiperOrigin-RevId: 203139678
* Update to latest version of Cloud Bigtable C++ Client. (A. Unique TensorFlower, 2018-07-02)

  PiperOrigin-RevId: 202986386
* [contrib.bigtable] Clean up builds (Brennan Saeta, 2018-06-29)

  PiperOrigin-RevId: 202673820
* [tf.data / Bigtable] Initial tf.data Bigtable integration (Brennan Saeta, 2018-06-29)

  This change allows TensorFlow users to stream data directly from Cloud Bigtable into the TensorFlow runtime using tf.data.

  PiperOrigin-RevId: 202664219