This repository has been archived by the owner on Oct 27, 2020. It is now read-only.

How to finetune? #1

Open
fanbbbb opened this issue Sep 4, 2019 · 3 comments

Comments


fanbbbb commented Sep 4, 2019

I have trained a model for the soccer P-DQN task, and I want to fine-tune a new model based on the trained one. What should I do?

cycraig (Owner) commented Sep 4, 2019

> finetune a new work based on the trained model

Hi, are you trying to do transfer learning to apply the trained model to a similar task in the HFO (soccer) environment, or just optimising the hyperparameters for (M)P-DQN on HFO?

fanbbbb (Author) commented Sep 9, 2019

Yes. I had trained a model with 1 offense agent and 0 defense NPCs, and I transferred it to 1 offense agent and 1 defense NPC by modifying some layers in PyTorch with random initialization. It worked! However, there is a new issue I would like to ask you about: I am now trying to use this model in a task with 2 offense agents and 2 defense agents (a multi-agent setting), and I have no idea how to approach it. Could you please leave me an email address if it is convenient? There are some details I would like to consult you on. Thanks a lot.
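
For anyone reading later, here is a minimal PyTorch sketch of the kind of layer transfer described above: load the parameters of the network trained on the smaller task, copy every tensor whose shape still matches, and leave the mismatched layers (e.g. the input layer, whose size grows with the state space) randomly initialised. The QNetwork class, layer names, and feature sizes below are placeholders, not taken from this repository.

```python
import torch
import torch.nn as nn

class QNetwork(nn.Module):
    """Toy stand-in for a P-DQN network; only the layer shapes matter here."""
    def __init__(self, state_dim, hidden_dim=128, num_actions=3):
        super().__init__()
        self.input_layer = nn.Linear(state_dim, hidden_dim)
        self.hidden_layer = nn.Linear(hidden_dim, hidden_dim)
        self.output_layer = nn.Linear(hidden_dim, num_actions)

    def forward(self, state):
        x = torch.relu(self.input_layer(state))
        x = torch.relu(self.hidden_layer(x))
        return self.output_layer(x)

# Placeholder state sizes: the 1v1 task observes more features than the 1v0 task.
old_net = QNetwork(state_dim=59)   # trained on 1 offense agent, 0 defense NPCs
new_net = QNetwork(state_dim=68)   # target task: 1 offense agent, 1 defense NPC

# Copy every parameter whose shape still matches; the rest stay randomly initialised.
old_params = old_net.state_dict()
new_params = new_net.state_dict()
compatible = {k: v for k, v in old_params.items()
              if k in new_params and v.shape == new_params[k].shape}
new_params.update(compatible)
new_net.load_state_dict(new_params)
```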

cycraig (Owner) commented Sep 10, 2019

Nice work 👍

P-DQN in its original form really only considers single, independent agents. There has been some work in multi-agent reinforcement learning with parameterised actions, such as https://arxiv.org/abs/1903.04959. Your best bet would be to use their algorithm, which is designed with multiple agents in mind, or to extend P-DQN in a similar fashion.

It's also a bit difficult to transfer models trained using fewer agents on HFO since the state space increases with every agent added to the environment. If you want to discuss this more you can email me at: mpdqn at pm.me
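
One hedged way to soften that state-space mismatch, assuming the original features keep their positions at the front of the larger state vector, is to copy the old input-layer weights into the matching columns of the new, wider input layer so that only the columns for the added agents start from random initialisation. This is a sketch, not code from this repository, and the sizes are placeholders.

```python
import torch
import torch.nn as nn

old_state_dim, new_state_dim, hidden_dim = 59, 68, 128  # placeholder sizes

old_input = nn.Linear(old_state_dim, hidden_dim)  # trained on the smaller state space
new_input = nn.Linear(new_state_dim, hidden_dim)  # for the environment with more agents

with torch.no_grad():
    # Assumes the first old_state_dim features of the new state keep their old meaning.
    new_input.weight[:, :old_state_dim] = old_input.weight
    new_input.bias.copy_(old_input.bias)
```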
