
28.01.19


Attending students

Bastian König, Milan Proell, Kolya Opahle, Jona Otholt, Federico Malerba

Attending instructors

Dr. Haoyin Yang, Christian Bartz

Open questions

  • How long should we train the agents? There still seems to be some progress after a large number of steps (e.g. 2,000,000); when should we stop?
    • Rule of thumb: stop training once the learning curve plateaus (nothing improves anymore); a simple plateau check is sketched below
    • Idea: Decrease the learning rate of the Adam optimizer (see Plans for current week)
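
A minimal sketch of what such a plateau check could look like; the class name and the patience/min_delta values are our own assumptions, not existing code:

```python
class PlateauDetector:
    """Reports a plateau when the best value of a tracked metric has
    not improved by `min_delta` for `patience` consecutive checks."""

    def __init__(self, patience=10, min_delta=1e-3):
        self.patience = patience
        self.min_delta = min_delta
        self.best = float("-inf")
        self.stale = 0

    def update(self, value):
        if value > self.best + self.min_delta:
            self.best = value
            self.stale = 0
        else:
            self.stale += 1
        return self.stale >= self.patience  # True -> stop training
```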

Recap of last week

  • Tried negative rewards scaled by the number of steps the agent has already taken, with the following base values (a sketch of the shaping follows this list):
    • -0.01
    • -0.02
    • -0.03
  • Tried the trick of enlarging the target bounding boxes (also sketched after this list)
    • helped for -0.02
    • didn't seem to make any difference for -0.03
    • Result: the bounding boxes the agent produced when it pulled the trigger seemed to become larger
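
A minimal sketch of the step-scaled penalty; the function names and the exact shaping are assumptions on our part, not the environment's actual code:

```python
def step_penalty(step, base=-0.02):
    """Penalty at `step` (0-indexed): base=-0.02 yields
    0.0, -0.02, -0.04, ... over successive steps."""
    return base * step

def shaped_reward(raw_reward, step, base=-0.02):
    # Total reward = task reward plus a penalty that grows with time.
    return raw_reward + step_penalty(step, base)
```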
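And a sketch of the box-enlargement trick, assuming (x_min, y_min, x_max, y_max) boxes; the scale value is a placeholder:

```python
def enlarge_box(box, scale=1.2):
    """Scale a ground-truth box around its center, keeping the aspect ratio."""
    x_min, y_min, x_max, y_max = box
    cx, cy = (x_min + x_max) / 2.0, (y_min + y_max) / 2.0
    w, h = (x_max - x_min) * scale, (y_max - y_min) * scale
    return (cx - w / 2.0, cy - h / 2.0, cx + w / 2.0, cy + h / 2.0)
```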

Plans for current week

  • Try setting the negative reward to -0.04(?)
  • Analyze the performance of the already trained agents, and of the agents still to be trained, more thoroughly
    • Take one or more sample image(s) that each agent is run on
    • Produce video files/animations of those runs in order to evaluate the agents' performance (a rendering sketch follows this list)
  • 👍 Change the IoU threshold that the environment uses to consider an IoU as "matched" (target threshold); see the IoU sketch after this list
    • Try 0.75
    • Try 0.9
    • Combine with a lower alpha value(?)
  • Decrease the factor that controls how far the agent moves the bounding box, based on the number of steps it has taken (make its actions "smaller" with increasing time); a decay sketch follows this list
    • (Additional idea: let the network predict a factor (starting at 1.0) that is multiplied with the distance the agent moves the bounding box on each action)
  • Decrease the learning rate of the Adam optimizer; this would make the graphs in TensorBoard less noisy (though there will always be some noise)
    • Create a training extension that allows changing the optimizer's learning rate from outside via a CLI, so we can decrease it manually (sketched after this list)
  • Possible starting points for performance improvements (an iterator sketch follows this list)
    • Iterator that iterates over the images
    • Image resizing
    • Image cropping
    • NVIDIA offers a library for this, but it has some problems (it is mostly meant for data preparation and only has experimental support in Chainer)
  • Decrease the number of steps the agent can take per image during evaluation from 200 to 50
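
A minimal sketch of how the evaluation videos could be produced with imageio; the frame-capture step and file names are assumptions about our setup:

```python
import imageio

def save_run_as_gif(frames, path="agent_run.gif", fps=10):
    """`frames` is a list of HxWx3 uint8 arrays captured during one run."""
    imageio.mimsave(path, frames, fps=fps)

# Usage (hypothetical): collect one frame per agent step, then
# save_run_as_gif(frames, "agent_3_image_7.gif")
```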
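A sketch of the IoU matching with a configurable target threshold; the box format (x_min, y_min, x_max, y_max) and function names are our assumptions:

```python
def iou(a, b):
    """Intersection over union of two (x_min, y_min, x_max, y_max) boxes."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def is_match(pred, target, threshold=0.75):
    # Target thresholds to experiment with: 0.75, then 0.9.
    return iou(pred, target) >= threshold
```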
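A sketch of the decaying move factor, including the optional network-predicted multiplier; the exponential decay and all names are assumptions:

```python
def move_distance(base_distance, step, decay=0.95, predicted_factor=1.0):
    """Distance the agent moves the box at `step`: shrinks over time,
    optionally scaled by a factor the network predicts (starting at 1.0)."""
    return base_distance * (decay ** step) * predicted_factor
```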
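A sketch of the proposed learning-rate extension, assuming our Chainer trainer setup; the file-based "CLI" (e.g. `echo 0.0001 > lr.txt`), the file name, and the trigger are assumptions. In Chainer, Adam's learning rate hyperparameter is called `alpha`:

```python
import os
from chainer.training import make_extension

@make_extension(trigger=(1, 'epoch'))
def update_lr_from_file(trainer, path='lr.txt'):
    # Read a new learning rate from a small text file that can be
    # edited from the command line while training is running.
    if not os.path.exists(path):
        return
    with open(path) as f:
        try:
            new_alpha = float(f.read().strip())
        except ValueError:
            return  # ignore malformed input
    trainer.updater.get_optimizer('main').alpha = new_alpha

# trainer.extend(update_lr_from_file)
```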
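A sketch of one possible iterator speed-up: move the expensive image loading/resizing into worker processes with Chainer's MultiprocessIterator; the dummy dataset and parameter values are placeholders:

```python
import numpy as np
from chainer.iterators import MultiprocessIterator

# Placeholder dataset; in practice, do the resizing/cropping inside the
# dataset's __getitem__ so that it runs in the worker processes.
dataset = [np.zeros((3, 224, 224), dtype=np.float32) for _ in range(100)]

iterator = MultiprocessIterator(dataset, batch_size=32, n_processes=4)
batch = next(iterator)  # batches are prefetched in the background
```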