| Commit message | Author | Age |
Delete tf.contrib.kfac. K-FAC in TensorFlow is now its own separate package.
END_PUBLIC
RELNOTES: n/a
Automated rollback of commit 938b9a40787028c58fb548fa6ada8c0dd8180f35
PiperOrigin-RevId: 209813506
--
self.test_session() has been deprecated in 9962eb5e84b15e309410071b06c2ed2d6148ed44, as its name confuses readers of the test. Moving to cached_session() instead, which is more explicit about:
* the fact that the session may be reused;
* the fact that the session is not closed even when doing a "with self.test_session()" statement.
PiperOrigin-RevId: 209700671
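The caching contract described above can be sketched in plain Python (an illustrative pattern only, not the TensorFlow implementation; `FakeSession` and `TestCaseSketch` are hypothetical stand-ins):

```python
class FakeSession:
    """Stand-in for a TF session; purely illustrative."""
    def __init__(self):
        self.closed = False

    def close(self):
        self.closed = True


class TestCaseSketch:
    """Sketch of the cached_session() contract: the same session object is
    returned on every call, and it is not closed between uses."""
    def __init__(self):
        self._cached = None

    def cached_session(self):
        if self._cached is None:
            self._cached = FakeSession()
        return self._cached


tc = TestCaseSketch()
s1 = tc.cached_session()
s2 = tc.cached_session()
assert s1 is s2       # the session is reused across calls
assert not s1.closed  # and is not closed between uses
```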
--
PiperOrigin-RevId: 208973995
--
PiperOrigin-RevId: 208972923
--
PiperOrigin-RevId: 199119904
--
Revert #18413. Too many internal test failures due to the name scope change caused by this change.
Revert #18192. Cannot use re2::StringPiece internally. Need alternative for set call. Will pull and clean this up in a separate change.
PiperOrigin-RevId: 197991247
--
instance of tuple. Otherwise this causes an "unhashable keys" error when we try to hash.
Also fixed a lint error.
PiperOrigin-RevId: 195061425
--
Adam) is used for the cov matrices. Note this requires that cov variables, then inv variables, are all updated before the first training update is made. All examples have been modified to do this. NOTE: you *may* have to increase the damping value you use at the start of optimization after this change (or throughout, if you are using a constant value).
- Changed the initial default approximation used for generic registrations to "diagonal"
- Convenience properties for ops and thunks have all been removed, along with "make_ops_and_vars". Users should only interface with "make_vars_and_create_op_thunks" (or maybe "create_ops_and_vars_thunks").
PiperOrigin-RevId: 194461623
--
- Refactored FisherFactor to use LinearOperator classes that know how to multiply themselves, compute their own trace, etc. This addresses the feature request: b/73356352
- Fixed some problems with FisherEstimator construction
- More careful casting of damping constants before they are used
PiperOrigin-RevId: 194379298
--
PiperOrigin-RevId: 194238853
--
PiperOrigin-RevId: 194031845
--
Add functionality to subsample the extracted image patches based on the number of outer products per entry of the covariance matrix.
PiperOrigin-RevId: 193927804
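The subsampling idea can be sketched in plain Python (the function name, budget rule, and parameter are illustrative assumptions, not the actual kfac code): when the number of extracted patches would imply too many outer products per covariance entry, keep only a random subset.

```python
import random

def subsample_patches(patches, cov_dim, max_outer_products_per_entry=4):
    """Keep at most max_outer_products_per_entry * cov_dim patches, bounding
    the number of outer products per covariance-matrix entry. Illustrative only."""
    budget = max_outer_products_per_entry * cov_dim
    if len(patches) <= budget:
        return patches
    return random.sample(patches, budget)

patches = list(range(1000))  # stand-ins for extracted image patches
kept = subsample_patches(patches, cov_dim=16)
assert len(kept) == 64       # 4 outer products per entry * cov_dim of 16
```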
--
PiperOrigin-RevId: 192850372
--
PiperOrigin-RevId: 192756152
--
As LayerCollections are required to instantiate KfacOptimizer and FisherEstimator, a deprecation warning is printed upon instantiating LayerCollection.
PiperOrigin-RevId: 192671370
--
PiperOrigin-RevId: 191897098
--
so the
self._variables will be an empty list. So pass a function that returns the list of trainable variables to the estimator.
PiperOrigin-RevId: 191893084
--
1) Interleave covariance and inverse update ops with training op.
2) Run the inverse and covariance ops on separate dedicated workers.
PiperOrigin-RevId: 191579634
--
reshaped to
[batch_size*num_uses, input_size]. `num_uses` should be incremented by one in this case.
PiperOrigin-RevId: 191456184
--
This reverts commit 4e108ef30d7cd7ae5e1c550ec5ae27e79b8c6e39.
PiperOrigin-RevId: 191391075
--
PiperOrigin-RevId: 190878279
--
PiperOrigin-RevId: 190699635
--
PiperOrigin-RevId: 190695737
--
scenario. In this strategy we do the cov computations locally on each tower and then sum the results, as opposed to concatenating everything onto a single device. This other strategy can be enabled by setting the global variable TOWER_STRATEGY to "separate" (the default value is "concat", which implements the old strategy). We might change the default to "separate" if it turns out to be the best choice.
- The code and documentation no longer refer to the towers as computing different "mini-batches", since this was a confusing use of terminology. The best way to think about things is that the combined data over all the towers forms the mini-batch. Note however that when factors process multiple towers using the "separate" strategy, their batch_size variable will still refer to the amount of data in a single tower.
- Fixed a bug in how the "option 1" and "option 2" RNN Fisher approximations were computed in the multi-tower scenario.
- The "time-folded-into-batch" feature recently added has now changed in terms of what format it uses. Time is now the first dimension before the reshape, not the second, which is consistent with the convention used in other codebases.
PiperOrigin-RevId: 190615398
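The difference between the two tower strategies can be sketched numerically in plain Python (an illustrative scalar second-moment computation; the real code operates on tensors and device placements): "concat" gathers all tower data in one place before computing the statistic, while "separate" computes per-tower statistics locally and sums them, and for a sum of outer products the two agree.

```python
def second_moment_sum(xs):
    """Sum of x*x over a batch of scalars; a stand-in for a cov computation."""
    return sum(x * x for x in xs)

towers = [[1.0, 2.0], [3.0, 4.0], [5.0]]  # data held by each tower

# "concat" strategy: gather everything onto one device, compute once.
concat = second_moment_sum([x for tower in towers for x in tower])

# "separate" strategy: compute locally on each tower, then sum the results.
separate = sum(second_moment_sum(tower) for tower in towers)

assert concat == separate == 55.0  # both strategies yield the same statistic
```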
--
190228434 by A. Unique TensorFlower:
internal change only
--
190215102 by A. Unique TensorFlower:
Remove placement strategy from Fisher estimator and add it in as a Mixin.
A new KfacOptimizer subclass which runs covariance and inverse ops periodically. This should deprecate lazykfac.
--
190211603 by A. Unique TensorFlower:
Internal only
--
PiperOrigin-RevId: 190228434
--
PiperOrigin-RevId: 189945839
--
with time folded into the batch dimension instead of lists of tensors
- Significant refactoring of RNN classes
- Fixed a bunch of issues in the LayerCollection docstrings, especially around the 'reuse' argument.
PiperOrigin-RevId: 189716331
--
layers.
- Some minor refactoring of internal structure in fisher_blocks and layer_collection
PiperOrigin-RevId: 189338874
--
PiperOrigin-RevId: 189258641
--
PiperOrigin-RevId: 189231636
--
FisherEstimator's constructor. This allows layers and losses to be registered after the FisherEstimator (or KFACOptimizer) has been constructed.
PiperOrigin-RevId: 188889252
--
PiperOrigin-RevId: 188778072
--
powers of the approximate Fisher
- Added multi-tower support to multi/RNN fully connected layers
- All op creation is now done inside functions that explicitly create ops, thus allowing fine control of their placement. One result of this is that we no longer need any colocation statements (and these have been removed)
- Multi-tower computations are now handled using the PartitionedTensor class, which appears to be a single tensor to the FisherFactors but actually contains a list of tensors.
- To achieve the above, damping values are passed around as special functions that are packaged along with "ids" that can be used to uniquely identify the computation they perform. Topohash might provide a better solution for this in the future.
- Variable creation in the factors is now done via special methods so we can have fine control over where these are placed
- FisherEstimator now has special functions to create ops and variables using different placement strategies (currently: no strategy, round-robin, and as thunks). By default this will use the round-robin strategy and manufacture the usual convenience properties ("inv_update_ops", etc). This default behavior is to preserve backwards compatibility but in the future we should deprecate this and require the user to ask for an explicit strategy.
- LossFunctions no longer make any ops in their constructors. They only make ops when evaluated. LayerCollection maintains a list of tensors/ops with which we can colocate LossFunction computations (typically their inputs)
- LossFunctions no longer support multi-tower/mini-batches directly. Instead LayerCollection maintains a list of these objects, one for each tower. This solution is better since now the loss function related computations can take place exclusively on the corresponding tower.
- All loss functions now support multiple towers/minibatches (via LayerCollection).
- tf.gradients is passed list of loss function values instead of their sum, which will prevent extraneous gradient ops being placed on arbitrary devices. Hopefully with this change and the above one for loss functions all ops associated with gradient computations (for computing stats) will occur completely on the device that defines that part of the graph. e.g. this will do the right thing for multiple towers
- I've also made sure that sensible colocation occurs for the extra ops needed by the curvature_propagation and exact estimation modes.
- Variables and ops made by FisherEstimator are now placed inside of name scopes (based on the name given to FisherEstimator)
- Restored old variable use count tracker implementation, thus fixing the issue with how generic registrations were handled by check_registration().
- Restored interface to FisherEstimator (which was changed in the previous CL).
- Fixed bug in LazyKFacOptimizer: optional/named arguments weren't being passed in properly
- Lots of other minor refactors/improvements
PiperOrigin-RevId: 188310846
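The round-robin placement strategy mentioned above can be sketched, independent of TensorFlow, as pairing each op-creating thunk with the next device in a cycle (the device names and thunk representation here are illustrative assumptions):

```python
import itertools

def round_robin_place(thunks, devices):
    """Pair each op-creating thunk with a device, cycling through devices."""
    return [(device, thunk) for device, thunk in zip(itertools.cycle(devices), thunks)]

thunks = ["cov_update_0", "cov_update_1", "inv_update_0", "inv_update_1", "inv_update_2"]
placement = round_robin_place(thunks, ["/gpu:0", "/gpu:1"])
assert placement == [
    ("/gpu:0", "cov_update_0"),
    ("/gpu:1", "cov_update_1"),
    ("/gpu:0", "inv_update_0"),
    ("/gpu:1", "inv_update_1"),
    ("/gpu:0", "inv_update_2"),
]
```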
--
model using variable-size training data and update the damping parameter, add KFACOptimizer.{update_damping}.
PiperOrigin-RevId: 186509305
--
multiply_inverse}.
PiperOrigin-RevId: 185920837
--
This change removes the following class:
* `tf.contrib.data.Dataset`.
IF THIS BREAKS YOU: Replace `tf.contrib.data.Dataset` with `tf.data.Dataset`
when constructing a dataset. Note that you may have to modify downstream
transformations to use the core API. See "tensorflow/contrib/data/README.md" for
details of how to update your code to use the core API.
PiperOrigin-RevId: 185197005
--
PiperOrigin-RevId: 183923073
--
PiperOrigin-RevId: 183867014
--
ConvDiagonalFB blocks.
PiperOrigin-RevId: 183472440
--
CategoricalLogitsNegativeLogProbLoss.
PiperOrigin-RevId: 182972901
--
would be computed during the call to instantiate_factors (via the "registration" functions) instead of the call to make_inverse_update_ops. This messed up the device placement of these ops and interacted badly with other parts of the code.
Also changed how covariance op creation is done in ConvDiagonalFactor in anticipation of similar problems in the future.
As of this CL, none of the op creation methods will modify the state of the class, and no ops will be created outside of the op creation methods. We should try to follow this convention going forward.
PiperOrigin-RevId: 182265266
--
constant strings from layer_collection in their respective library modules. This allows consistent development of blocks and factors outside tensorflow.contrib.kfac.
PiperOrigin-RevId: 182197356
--
- Removes FisherEstimator.inv_updates_dict. Users should create it directly from FisherEstimator.inv_update_ops.
- Adds (cov|inv)_update_(thunks|ops) to KfacOptimizer.
PiperOrigin-RevId: 182135826
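The ops-vs-thunks distinction can be sketched in plain Python (illustrative only; here a thunk is just a zero-argument callable that performs the creation when invoked, giving the caller control over when, and in the real code where, creation happens):

```python
def make_update_thunks(names):
    """Return zero-arg callables; nothing is 'created' until a thunk is called."""
    created = []

    def make_thunk(name):
        def thunk():
            created.append(name)   # stands in for op creation
            return "op/" + name
        return thunk

    return created, [make_thunk(n) for n in names]

created, thunks = make_update_thunks(["cov_update", "inv_update"])
assert created == []               # creation is deferred...
ops = [t() for t in thunks]        # ...until the caller invokes the thunks
assert ops == ["op/cov_update", "op/inv_update"]
assert created == ["cov_update", "inv_update"]
```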
--
PiperOrigin-RevId: 181993953
--
PiperOrigin-RevId: 181689879
--
PiperOrigin-RevId: 181672525
--
Changing the default value of colocate_gradients_with_ops to True.
PiperOrigin-RevId: 181624864
--
kfac.loss_functions_lib._allowed_symbols.
PiperOrigin-RevId: 180672608
--
PiperOrigin-RevId: 180536416