bughunt_log.md

QUESTIONS: IS THIS NORMAL/EXPECTED?

  • In training/scalar_loss, the prediction and result values are sometimes plain Python int (float for the result), sometimes tensorflow.python.framework.ops.EagerTensor; is that expected? (See the sketch after this list.)
  • With training/train_network we only ever have one network saved: each save erases the previous one because every network reports the same number of training steps, so we re-create a new network each time instead of improving on the previous one.
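
The type inconsistency in the first point can at least be smoothed over before logging. A minimal sketch, assuming the value arrives either as a plain Python number or as a scalar EagerTensor (the helper name to_python_scalar is made up for illustration):

```python
import tensorflow as tf


def to_python_scalar(value):
    """Normalize a loss/prediction that may arrive as a Python number
    or as a scalar EagerTensor into a plain Python float."""
    if isinstance(value, tf.Tensor):
        # Scalar EagerTensor -> NumPy scalar -> Python float.
        return float(value.numpy())
    return float(value)


# Usage: log(to_python_scalar(scalar_loss)) regardless of which type came back.
```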

MAJOR BUGS/ISSUES IDENTIFIED

  • Is training_steps ever actually updated during training?
  • Confusion between config.training_steps and network.training_steps: config.training_steps should stay fixed (the total number of steps we want), while network.training_steps should increase as we train (one per training step). See the sketch after this list.
  • helpers/recurrent_inference: why are we returning 0 for the network's reward output? We shouldn't be: the predicted reward is needed there, otherwise the agent can never see a reward. (See the sketch at the end of this log.)
  • Where should network.training_steps be incremented, in update_weights or in train_network? Either way, as things stand there will only ever be one network in the storage...
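
One way the step bookkeeping could be arranged so that saved networks stop overwriting each other: a minimal sketch loosely following MuZero-style pseudocode names (SharedStorage, save_network, network.training_steps). The class and function bodies here are illustrative stand-ins under that assumption, not the repo's actual code.

```python
class Network:
    """Stand-in for the model; only the per-network step counter matters here."""

    def __init__(self):
        self.training_steps = 0


class SharedStorage:
    """Stores networks keyed by how many training steps each has seen."""

    def __init__(self):
        self._networks = {}

    def save_network(self, step: int, network: Network):
        # If `step` never changes between saves, every save lands on the
        # same key and overwrites the previous network: the "only one
        # network ever saved" symptom described above.
        self._networks[step] = network

    def latest_network(self) -> Network:
        return self._networks[max(self._networks)]


def train_network(storage: SharedStorage, network: Network, num_steps: int):
    for _ in range(num_steps):
        # update_weights(optimizer, network, batch) would run here.
        # Increment the per-network counter (not config.training_steps,
        # which stays fixed as the total training budget), so successive
        # saves land under different keys instead of erasing each other.
        network.training_steps += 1
        storage.save_network(network.training_steps, network)
```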

MINOR BUGS/LIMITATIONS IDENTIFIED

  • The learning rate in training/train_network is fixed; it should decay over training (see the sketch below).
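
A minimal sketch of one way to get the decay with TensorFlow's built-in schedule; the initial rate, decay_steps, and decay_rate values below are illustrative, not taken from the repo.

```python
import tensorflow as tf

# Learning-rate schedule instead of a fixed value: the rate starts at
# initial_learning_rate and is multiplied by decay_rate every decay_steps
# optimizer steps (values here are placeholders).
lr_schedule = tf.keras.optimizers.schedules.ExponentialDecay(
    initial_learning_rate=1e-2,
    decay_steps=1000,
    decay_rate=0.9,
)

# Passing the schedule (rather than a constant) to the optimizer makes the
# decay happen automatically as training steps accumulate.
optimizer = tf.keras.optimizers.SGD(learning_rate=lr_schedule, momentum=0.9)
```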

HYPOTHESES / ANSWERS

  • Is train/training called? -> OK, it is called in the driver program.
  • Does training save the network? -> OK.
  • Exploding weights? -> OK, they seem normal (printed over 10 epochs).
  • Is config.training_steps correctly updated, or are networks erasing each other? -> Problem!
  • Is there confusion between network.training_steps and config.training_steps? -> Problem!
  • Does the agent ever get a reward other than 0? -> Problem! (See the sketch below.)
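
On the last point (and the helpers/recurrent_inference issue in the major-bugs list): a minimal sketch of what returning the real predicted reward could look like, assuming the model exposes MuZero-style dynamics and prediction functions. The attribute names are assumptions about the repo's structure, not its actual API.

```python
def recurrent_inference(network, hidden_state, action):
    """One step of the learned model: next hidden state plus predictions."""
    # The dynamics function should produce both the next hidden state and
    # the predicted reward for taking `action`; returning a hard-coded 0
    # here is what keeps the observed reward at 0 forever.
    next_hidden_state, reward = network.dynamics(hidden_state, action)
    policy_logits, value = network.prediction(next_hidden_state)
    return value, reward, policy_logits, next_hidden_state
```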