## Profile Model Architecture * [Profile Model Parameters](#profile-model-parameters) * [Profile Model Float Operations](#profile-model-float-operations) ### Profile Model Parameters Notes: `VariableV2` operation type might contain variables created by TensorFlow implicitly. User normally don't want to count them as "model capacity". We can use customized operation type to select a subset of variables. For example `_trainable_variables` is created automatically by tfprof Python API. User can also define customized operation type. ``` # parameters are created by operation type 'VariableV2' (For older model, # it's 'Variable'). scope view is usually suitable in this case. tfprof> scope -account_type_regexes VariableV2 -max_depth 4 -select params _TFProfRoot (--/930.58k params) global_step (1/1 params) init/init_conv/DW (3x3x3x16, 432/864 params) pool_logit/DW (64x10, 640/1.28k params) pool_logit/DW/Momentum (64x10, 640/640 params) pool_logit/biases (10, 10/20 params) pool_logit/biases/Momentum (10, 10/10 params) unit_last/final_bn/beta (64, 64/128 params) unit_last/final_bn/gamma (64, 64/128 params) unit_last/final_bn/moving_mean (64, 64/64 params) unit_last/final_bn/moving_variance (64, 64/64 params) # The Python API profiles tf.trainable_variables() instead of VariableV2. # # By default, it's printed to stdout. User can update options['output'] # to write to file. The result is always returned as a proto buffer. param_stats = tf.profiler.profile( tf.get_default_graph(), options=tf.profiler.ProfileOptionBuilder .trainable_variables_parameter()) sys.stdout.write('total_params: %d\n' % param_stats.total_parameters) ``` ### Profile Model Float Operations #### Caveats For an operation to have float operation statistics: * It must have `RegisterStatistics('flops')` defined in TensorFlow. tfprof uses the definition to calculate float operations. Contributions are welcomed. * It must have known "shape" information for RegisterStatistics('flops') to calculate the statistics. It is suggested to pass in `-run_meta_path` if shape is only known during runtime. tfprof can fill in the missing shape with the runtime shape information from RunMetadata. Hence, it is suggested to use `-account_displayed_op_only` option so that you know the statistics are only for the operations printed out. * If no RunMetadata is provided, tfprof counts float_ops of each graph node once, even if it is defined in a tf.while_loop. This is because tfprof doesn't know statically how many times each graph node is run. If RunMetadata is provided, tfprof calculates float_ops as float_ops * run_count. ```python # To profile float opertions in commandline, you need to pass --graph_path # and --op_log_path. tfprof> scope -min_float_ops 1 -select float_ops -account_displayed_op_only node name | # float_ops _TFProfRoot (--/17.63b flops) gradients/pool_logit/xw_plus_b/MatMul_grad/MatMul (163.84k/163.84k flops) gradients/pool_logit/xw_plus_b/MatMul_grad/MatMul_1 (163.84k/163.84k flops) init/init_conv/Conv2D (113.25m/113.25m flops) pool_logit/xw_plus_b (1.28k/165.12k flops) pool_logit/xw_plus_b/MatMul (163.84k/163.84k flops) unit_1_0/sub1/conv1/Conv2D (603.98m/603.98m flops) unit_1_0/sub2/conv2/Conv2D (603.98m/603.98m flops) unit_1_1/sub1/conv1/Conv2D (603.98m/603.98m flops) unit_1_1/sub2/conv2/Conv2D (603.98m/603.98m flops) # Some might prefer op view that aggregate by operation type. tfprof> op -min_float_ops 1 -select float_ops -account_displayed_op_only -order_by float_ops node name | # float_ops Conv2D 17.63b float_ops (100.00%, 100.00%) MatMul 491.52k float_ops (0.00%, 0.00%) BiasAdd 1.28k float_ops (0.00%, 0.00%) # You can also do that in Python API. tf.profiler.profile( tf.get_default_graph(), options=tf.profiler.ProfileOptionBuilder.float_operation()) ```