
fix inference_step #38

Open
wants to merge 1 commit into master

Conversation

Fireblossom

inference_step passes inference=True to model_engine.
However, the forward method of the Magma model does not accept this parameter, which causes an error during training.
I fixed it by simply copying the inference code from example_inference.py.

) -> ModelOutput:
if inference is True:
input_embeddings = self.image_prefix(images)
asks = [self.tokenizer.encode('Describe the painting:')] * len(images)
CoEich (Collaborator) commented Nov 28, 2022

I don't like the hardcoded instruction here, since not all images are paintings. For the purpose of this codepath I would prefer not to use any instruction.

Fireblossom (Author)

I wrote it this way just because that's what the example does :)

)
return self.generate(
embeddings = input_embeddings,
max_steps = 6,
CoEich (Collaborator)

I think 6 sampling steps might be a bit short in general.

Fireblossom (Author)

I would suggest putting the hardcoded asks, max_steps, temperature, and top_k into config.py as a dictionary.
Do you think that would be better?
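One way to sketch that suggestion (the key names mirror the values discussed in this thread, but the defaults here are illustrative, not the repo's actual choices):

```python
# Sketch for config.py: gather the generation settings that are
# currently hardcoded in the patch into one dictionary.
INFERENCE_DEFAULTS = {
    "ask": "",          # no hardcoded instruction, per the review comment above
    "max_steps": 32,    # longer than the 6 steps in the patch
    "temperature": 0.7,
    "top_k": 0,
}
```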

CoEich (Collaborator) commented Nov 28, 2022

Hi,

thanks for your efforts to contribute to the open source MAGMA code. The codepath for inference during training should indeed be fixed and your help is appreciated :-)

In addition to the inline comments I added, I would in general prefer not to overload the model's forward function too much. Maybe you could try changing just the inference_step method to invoke model.generate, instead of changing the forward pass; see:

def inference_step(config, eval_loader, model_engine):

Thanks again and let me know what you think.

Best,

Constantin
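A rough sketch of that restructuring, using toy stand-ins for the model and the DeepSpeed engine (the real MAGMA and DeepSpeed signatures may well differ; image_prefix, generate, and the config keys are assumptions based on this thread):

```python
# Toy stand-in for the MAGMA model: image_prefix maps images to input
# embeddings, generate produces one dummy token id per sampling step.
class ToyModel:
    def image_prefix(self, images):
        return [[float(px) for px in img] for img in images]

    def generate(self, embeddings, max_steps=32):
        return [[0] * max_steps for _ in embeddings]

# Toy stand-in for DeepSpeed's engine, which keeps the wrapped model
# in its .module attribute.
class ToyEngine:
    def __init__(self, model):
        self.module = model

def inference_step(config, images, model_engine):
    # Call generate on the underlying model instead of routing an
    # inference=True flag through the forward pass.
    model = getattr(model_engine, "module", model_engine)
    embeddings = model.image_prefix(images)
    return model.generate(embeddings, max_steps=config.get("max_steps", 32))

captions = inference_step({"max_steps": 6}, [[1, 2], [3, 4]], ToyEngine(ToyModel()))
```

The point of the sketch is only the control flow: forward stays untouched, and the inference-only path lives entirely in inference_step.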

Fireblossom (Author) commented Nov 28, 2022

Hi Constantin,

thank you for your advice. I was going to do the same.

In practice, however, I found that DeepSpeed's model_engine cannot call methods other than forward.
(I'm a beginner with DeepSpeed, as this is my first time using it, so please correct me if I'm wrong.)
And if I call the model directly instead, it can lead to unexpected errors, such as a device mismatch.

For these reasons, I had to add the code to forward.

Best,

Changxu

CoEich (Collaborator) commented Dec 6, 2022

OK, I think you can just access the model via model_engine.model. I'm not 100% sure that always works, but maybe give it a try.

Best,

Constantin
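For what it's worth, DeepSpeed's DeepSpeedEngine stores the wrapped network in its module attribute, so a defensive unwrap could look like the sketch below (toy classes; the exact attribute exposed by the repo's DeepSpeed version should be double-checked):

```python
class FakeModel:
    def generate(self, max_steps=6):
        # dummy generation: one placeholder token per step
        return ["<tok>"] * max_steps

class FakeEngine:
    """Toy stand-in: DeepSpeedEngine keeps the wrapped model as .module."""
    def __init__(self, model):
        self.module = model

def unwrap(engine):
    # Fall back to the object itself if it is not a DeepSpeed engine,
    # so the same call site works with and without DeepSpeed.
    return getattr(engine, "module", engine)

tokens = unwrap(FakeEngine(FakeModel())).generate(max_steps=3)
```

Calling unwrap on a bare model is a no-op, which keeps the inference path usable outside DeepSpeed as well.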
