Greedy search and beam search #2557

Closed
wants to merge 3 commits into from

@KexinFeng KexinFeng commented Apr 20, 2023

Description

This PR follows up on PR #2547 and #2509, which show the model tracing.

The output is benchmarked against Hugging Face transformers' output.

Ref. https://huggingface.co/blog/how-to-generate

Demo output

In the demo TestLMSearch.java, we feed in batch sequence input, using right padding with the space token ' ' (id = 220).

["DeepMind Company is",  
 "Memories follow me left and right. I can"]

Output of beam search (numBeam=3, maxLength=50):

'DeepMind Company is      \xa0 a company that has been around for a long time and has been around for a long time. They have been around for a long time and have been around for a long time. They have been'
'DeepMind Company is      \xa0 a company that has been around for a long time. It has been around for a long time and has been around for a long time. It has been around for a long time and has been'
'DeepMind Company is      \xa0 a company that has been around for a long time and has been around for a long time. They have been around for a long time and have been around for a long time. They are a'

"Memories follow me left and right. I can't tell you how many times I've been told that I'm not a good person. I'm not a good person. I'm not a good person. I'm not a good person. I"
"Memories follow me left and right. I can't tell you how many times I've been told that I'm not a good person. I'm not a good person. I'm not a good person. I'm not a good person.\n"
'Memories follow me left and right. I can\'t tell you how many times I\'ve been told that I\'m not a good person. I\'m not a good person. I\'m not a good person. I\'m not a good person."\n'
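
For readers unfamiliar with the technique, the beam search behind the output above can be sketched roughly as follows. This is only an illustration, not the code in this PR: `TOY_MODEL` is a hypothetical lookup table with made-up probabilities standing in for GPT2, and `beam_search` is a name invented for this sketch. At each step every beam is extended by every possible next token, and only the `num_beams` highest-scoring candidates (by summed log-probability) are kept.

```python
import math

# Hypothetical toy "language model": next-token probabilities keyed by the
# last token only. The numbers are illustrative, not taken from GPT2.
TOY_MODEL = {
    "The": {"nice": 0.5, "dog": 0.4, "car": 0.1},
    "nice": {"woman": 0.4, "house": 0.3, "guy": 0.3},
    "dog": {"has": 0.9, "runs": 0.05, "and": 0.05},
    "car": {"is": 0.5, "drives": 0.3, "turns": 0.2},
}


def beam_search(prompt, num_beams, max_length):
    """Return up to num_beams (log_prob, tokens) pairs, best first."""
    beams = [(0.0, list(prompt))]
    # All beams grow by one token per step, so they share the same length.
    while len(beams[0][1]) < max_length:
        candidates = []
        for score, tokens in beams:
            dist = TOY_MODEL.get(tokens[-1])
            if dist is None:
                continue  # the toy table has no continuation; drop this beam
            for token, prob in dist.items():
                candidates.append((score + math.log(prob), tokens + [token]))
        if not candidates:
            break
        # Keep only the num_beams most likely partial sequences.
        candidates.sort(key=lambda c: c[0], reverse=True)
        beams = candidates[:num_beams]
    return beams


beams = beam_search(["The"], num_beams=2, max_length=3)
for score, tokens in beams:
    print(" ".join(tokens), round(math.exp(score), 2))
```

With two beams, the sketch keeps "The dog" alive after the first step even though "nice" is the more likely first token, and so it ends up ranking "The dog has" (0.4 × 0.9) above "The nice woman" (0.5 × 0.4). With `num_beams=1` it reduces to greedy search.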

Output of greedy search (maxLength=50):

'DeepMind Company is      \xa0 a company that has been around for over 20 years. We have been around for over 20 years and have been around for over 20 years. We have been around for over 20 years and have been'

"Memories follow me left and right. I can't remember the last time I saw a girl in a dress. I can't remember the last time I saw a girl in a dress. I can't remember the last time I saw a girl in"
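
As a point of comparison, greedy search can be sketched as below. Again this is a minimal illustration under stated assumptions rather than the implementation in this PR: `TOY_MODEL` is a hypothetical table of made-up next-token probabilities, and `greedy_search` is a name chosen for this sketch. The loop simply appends the single most probable next token until `max_length` is reached.

```python
import math

# Hypothetical toy "language model": next-token probabilities keyed by the
# last token only. The numbers are illustrative, not taken from GPT2.
TOY_MODEL = {
    "The": {"nice": 0.5, "dog": 0.4, "car": 0.1},
    "nice": {"woman": 0.4, "house": 0.3, "guy": 0.3},
    "dog": {"has": 0.9, "runs": 0.05, "and": 0.05},
    "car": {"is": 0.5, "drives": 0.3, "turns": 0.2},
}


def greedy_search(prompt, max_length):
    """Return (tokens, log_prob) obtained by always taking the argmax token."""
    tokens = list(prompt)
    log_prob = 0.0
    while len(tokens) < max_length:
        dist = TOY_MODEL.get(tokens[-1])
        if dist is None:
            break  # the toy table has no continuation for this token
        # Pick the single most probable next token.
        token, prob = max(dist.items(), key=lambda kv: kv[1])
        tokens.append(token)
        log_prob += math.log(prob)
    return tokens, log_prob


tokens, log_prob = greedy_search(["The"], max_length=3)
print(" ".join(tokens), round(math.exp(log_prob), 2))
```

Because each choice is final, the sketch commits to "nice" at the first step and never considers continuations of "dog", which is the kind of short-sightedness beam search is meant to reduce.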

Notes about GPT2's behaviour with padding and attention mask:

These notes show that for GPT2, right padding and left padding can behave very differently, both with and without an attention_mask. (Using the attention_mask is the most intuitive approach.) The reason is not fully understood yet, but this result will guide the upcoming batching solution.

  1. With an attention mask that is set to 0 on the padded tokens and 1 everywhere else, for the input_ids that correspond to the above input (the id of the space token ' ' is 220).

A. Right padding:

input_ids = torch.tensor([[29744, 28478, 5834, 318, 220, 220, 220, 220, 220, 220],
                   [13579, 1749, 1061, 502, 1364, 290, 826, 13, 314, 460]])

Output:

"DeepMind Company is      \xa0 a company that has been around for a long time and has been around for a long time. They have been around for a long time and have been around for a long time. They have been"

"Memories follow me left and right. I can't tell you how many times I've been told that I'm not a good person. I'm not a good person. I'm not a good person. I'm not a good person. I"

B. Left padding

input_ids = torch.tensor([[220, 220, 220, 220, 220, 220, 29744, 28478, 5834, 318],
                   [13579, 1749, 1061, 502, 1364, 290, 826, 13, 314, 460]])

Output:

"      DeepMind Company is                                        "

"Memories follow me left and right. I can't tell you how many times I've been told that I'm not a good person. I'm not a good person. I'm not a good person. I'm not a good person. I"

  2. Without any attention mask (i.e. the mask is all 1s), for the input_ids that correspond to the above input (the id of the space token ' ' is 220).

A. Right padding:

Output:

"DeepMind Company is                                                                                  "

"Memories follow me left and right. I can't tell you how many times I've been told that I'm not a good person. I'm not a good person. I'm not a good person. I'm not a good person. I"

B. Left padding:

Output:

'      DeepMind Company is a subsidiary of DeepMind Technologies, Inc. DeepMind Technologies, Inc. is a subsidiary of DeepMind Technologies, Inc. is a subsidiary of DeepMind Technologies, Inc. is a subsidiary of Deep'

Memories follow me left and right. I can't tell you how many times I've been told that I'm not a good person. I'm not a good person. I'm not a good person. I'm not a good person. I.
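
The four configurations compared in the notes above (right or left padding, with or without a mask) can be built with a small helper like the one below. This is a sketch for reproducing the inputs only; `pad_batch` and its parameters are names made up for illustration and are not part of this PR or of any library. The token ids are the ones listed above.

```python
PAD_ID = 220  # id of the space token ' ' used as padding in the notes above


def pad_batch(sequences, side="right", pad_id=PAD_ID, use_mask=True):
    """Pad token-id lists to a common length.

    Returns (input_ids, attention_mask) as nested lists. With use_mask=False
    the mask is all 1s, i.e. the padded positions are attended to as well.
    """
    max_len = max(len(seq) for seq in sequences)
    input_ids, attention_mask = [], []
    for seq in sequences:
        n_pad = max_len - len(seq)
        if side == "right":
            ids = list(seq) + [pad_id] * n_pad
            mask = [1] * len(seq) + [0] * n_pad
        elif side == "left":
            ids = [pad_id] * n_pad + list(seq)
            mask = [0] * n_pad + [1] * len(seq)
        else:
            raise ValueError("side must be 'right' or 'left'")
        if not use_mask:
            mask = [1] * max_len
        input_ids.append(ids)
        attention_mask.append(mask)
    return input_ids, attention_mask


batch = [
    [29744, 28478, 5834, 318],  # "DeepMind Company is"
    [13579, 1749, 1061, 502, 1364, 290, 826, 13, 314, 460],
]

right_ids, right_mask = pad_batch(batch, side="right")
left_ids, left_mask = pad_batch(batch, side="left")
_, no_mask = pad_batch(batch, side="right", use_mask=False)
print(right_ids[0], right_mask[0])
print(left_ids[0], left_mask[0])
```

The nested lists can be wrapped in `torch.tensor(...)` and passed to the model as `input_ids` and `attention_mask` to rerun each case.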

@KexinFeng KexinFeng requested review from zachgk, frankfliu and a team as code owners April 20, 2023 04:55
@KexinFeng KexinFeng closed this Apr 25, 2023
@KexinFeng KexinFeng deleted the greedy_and_beam branch April 25, 2023 06:38
@KexinFeng KexinFeng restored the greedy_and_beam branch April 25, 2023 06:39
@KexinFeng KexinFeng reopened this Apr 25, 2023
@KexinFeng

Merged in #2637

@KexinFeng KexinFeng closed this Jun 21, 2023
@xyang16 xyang16 deleted the greedy_and_beam branch October 4, 2023 16:34