bughunt_log.md

QUESTIONS: IS THIS NORMAL/EXPECTED?

  • In training/scalar_loss, the prediction and result values are sometimes plain Python int (float for the result), sometimes tensorflow.python.framework.ops.EagerTensor; is that expected? (See the sketch after this list.)
  • With training/train_network we only ever have one network saved: each save erases the previous one because every network reports the same number of training steps, so we re-create a new network each time instead of improving on the previous one.
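
The type inconsistency in the first point can at least be smoothed over before logging. A minimal sketch, assuming the value arrives either as a plain Python number or as a scalar EagerTensor (the helper name to_python_scalar is made up for illustration):

```python
import tensorflow as tf


def to_python_scalar(value):
    """Normalize a loss/prediction that may arrive as a Python number
    or as a scalar EagerTensor into a plain Python float."""
    if isinstance(value, tf.Tensor):
        # Scalar EagerTensor -> NumPy scalar -> Python float.
        return float(value.numpy())
    return float(value)


# Usage: log(to_python_scalar(scalar_loss)) regardless of which type came back.
```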

MAJOR BUGS/ISSUES IDENTIFIED

  • Is training_steps ever actually updated during training?
  • Confusion between config.training_steps and network.training_steps: config.training_steps should stay fixed (the total number of steps we want), while network.training_steps should increase as we train (one per training step). See the sketch after this list.
  • helpers/recurrent_inference: why are we returning 0 for the network's reward output? We shouldn't be: the predicted reward is needed there, otherwise the agent can never see a reward. (See the sketch at the end of this log.)
  • Where should network.training_steps be incremented, in update_weights or in train_network? Either way, as things stand there will only ever be one network in the storage...
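
One way the step bookkeeping could be arranged so that saved networks stop overwriting each other: a minimal sketch loosely following MuZero-style pseudocode names (SharedStorage, save_network, network.training_steps). The class and function bodies here are illustrative stand-ins under that assumption, not the repo's actual code.

```python
class Network:
    """Stand-in for the model; only the per-network step counter matters here."""

    def __init__(self):
        self.training_steps = 0


class SharedStorage:
    """Stores networks keyed by how many training steps each has seen."""

    def __init__(self):
        self._networks = {}

    def save_network(self, step: int, network: Network):
        # If `step` never changes between saves, every save lands on the
        # same key and overwrites the previous network: the "only one
        # network ever saved" symptom described above.
        self._networks[step] = network

    def latest_network(self) -> Network:
        return self._networks[max(self._networks)]


def train_network(storage: SharedStorage, network: Network, num_steps: int):
    for _ in range(num_steps):
        # update_weights(optimizer, network, batch) would run here.
        # Increment the per-network counter (not config.training_steps,
        # which stays fixed as the total training budget), so successive
        # saves land under different keys instead of erasing each other.
        network.training_steps += 1
        storage.save_network(network.training_steps, network)
```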

MINOR BUGS/LIMITATIONS IDENTIFIED

  • The learning rate in training/train_network is fixed; it should decay over training (see the sketch below).
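
A minimal sketch of one way to get the decay with TensorFlow's built-in schedule; the initial rate, decay_steps, and decay_rate values below are illustrative, not taken from the repo.

```python
import tensorflow as tf

# Learning-rate schedule instead of a fixed value: the rate starts at
# initial_learning_rate and is multiplied by decay_rate every decay_steps
# optimizer steps (values here are placeholders).
lr_schedule = tf.keras.optimizers.schedules.ExponentialDecay(
    initial_learning_rate=1e-2,
    decay_steps=1000,
    decay_rate=0.9,
)

# Passing the schedule (rather than a constant) to the optimizer makes the
# decay happen automatically as training steps accumulate.
optimizer = tf.keras.optimizers.SGD(learning_rate=lr_schedule, momentum=0.9)
```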

HYPOTHESES / ANSWERS

  • Is train/training called? -> OK, it is called in the driver program.
  • Does training save the network? -> OK.
  • Exploding weights? -> OK, they seem normal (printed over 10 epochs).
  • Is config.training_steps correctly updated, or are networks erasing each other? -> Problem!
  • Is there confusion between network.training_steps and config.training_steps? -> Problem!
  • Does the agent ever get a reward other than 0? -> Problem! (See the sketch below.)
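
On the last point (and the helpers/recurrent_inference issue in the major-bugs list): a minimal sketch of what returning the real predicted reward could look like, assuming the model exposes MuZero-style dynamics and prediction functions. The attribute names are assumptions about the repo's structure, not its actual API.

```python
def recurrent_inference(network, hidden_state, action):
    """One step of the learned model: next hidden state plus predictions."""
    # The dynamics function should produce both the next hidden state and
    # the predicted reward for taking `action`; returning a hard-coded 0
    # here is what keeps the observed reward at 0 forever.
    next_hidden_state, reward = network.dynamics(hidden_state, action)
    policy_logits, value = network.prediction(next_hidden_state)
    return value, reward, policy_logits, next_hidden_state
```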