Models will always be initialized without dropout layers in self-tuning ruleset #753

georgedahl · 2024-04-04T01:44:29Z

In submission_runner.py, when running under the self-tuning ruleset, the hyperparameters argument to train_once is always None.

Then, in this code snippet:

    dropout_rate = None
    aux_dropout_rate = None
    if hasattr(hyperparameters, 'dropout_rate'):
      dropout_rate = hyperparameters.dropout_rate
    if hasattr(hyperparameters, 'aux_dropout_rate'):
      aux_dropout_rate = hyperparameters.aux_dropout_rate
    model_params, model_state = workload.init_model_fn(
        model_init_rng, dropout_rate, aux_dropout_rate)

Since hasattr(None, ...) is False for both attribute names, neither branch runs. workload.init_model_fn therefore always gets None for dropout_rate and aux_dropout_rate, so Dropout layers are never added to the model.
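A minimal sketch of the behavior, assuming nothing beyond the hasattr logic in the snippet above. The helper name `resolve_dropout_rates` and the `SimpleNamespace` stand-in for an externally tuned hyperparameters object are hypothetical and are not part of submission_runner.py:

```python
from types import SimpleNamespace


def resolve_dropout_rates(hyperparameters):
  """Hypothetical helper mirroring the hasattr-based lookup in the snippet."""
  dropout_rate = None
  aux_dropout_rate = None
  if hasattr(hyperparameters, 'dropout_rate'):
    dropout_rate = hyperparameters.dropout_rate
  if hasattr(hyperparameters, 'aux_dropout_rate'):
    aux_dropout_rate = hyperparameters.aux_dropout_rate
  return dropout_rate, aux_dropout_rate


# Self-tuning ruleset: hyperparameters is None, so both lookups are skipped.
self_tuning_rates = resolve_dropout_rates(None)

# External tuning (illustrative stand-in object): set attributes are picked up.
external_rates = resolve_dropout_rates(SimpleNamespace(dropout_rate=0.1))

print(self_tuning_rates)  # (None, None)
print(external_rates)     # (0.1, None)
```

With `None` as input, both values stay at their `None` defaults, which is what init_model_fn then receives.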

Submissions could call workload.init_model_fn again themselves to make use of its side effect of setting workload._model, but this is awkward. It is also challenging for workloads near the memory limit, since it involves superfluously reconstructing model_params on device a second time.
