
Pull request for the checkpoint conversion of BERT(486) #761

Open

vulkomilev wants to merge 5 commits into master

Conversation

vulkomilev

No description provided.

@abheesht17 self-requested a review on February 21, 2023.
@abheesht17 (Collaborator) left a comment:

@vulkomilev, I've submitted a quick review. Please take a look. I had a few questions/suggestions:

[7 inline review comments on tools/checkpoint_conversion/convert_bert.py, all outdated and resolved]
return keras_nlp_output


def main(_):
Collaborator: We should call check_outputs() in main.
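For illustration, a minimal sketch of wiring check_outputs() into main, assuming the script exposes a --preset flag and helpers roughly like the ones below (apart from check_outputs() and main(_), the names and signatures are guesses, not the PR's actual code):

```python
# Hypothetical sketch: only check_outputs() and main(_) appear in the PR;
# the flag name and helper names/signatures here are assumptions.
def main(_):
    preset = FLAGS.preset
    vocab_path, checkpoint_path, config_path = extract_checkpoint_files(preset)
    keras_nlp_model = convert_checkpoints(preset, checkpoint_path, config_path)
    # Call check_outputs() so every conversion run compares the converted
    # model's outputs against the original checkpoint before anything is saved.
    keras_nlp_output = check_outputs(keras_nlp_model, vocab_path)
    print("Converted model output:", keras_nlp_output)
```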

[1 inline review comment on tools/checkpoint_conversion/convert_bert.py, outdated and resolved]
return vocab_path, checkpoint_path, config_path, weights, model


def convert_checkpoints(preset, weights, model):
Collaborator: The if...elif...else logic still looks a bit complicated. Instead of using two sources (TF MG and the original BERT repo), we're planning to just use the original BERT repo. That might make this whole block simpler. Let me do that for you!

@mattdangerw (Member): @vulkomilev, it looks like a lot of the comments above were resolved without being addressed. Did you mean to push a change to this branch?

@vulkomilev (Author): I still haven't pushed the changes; I will do so in about an hour.

return vocab_path, checkpoint_path, config_path


def convert_checkpoints(preset,checkpoint_path,config_dict):
Member: Make sure to run our format script; this does not look like it has been black-formatted in many places (the contributor guide has instructions).

model = keras_nlp.models.BertBackbone.from_preset("bert_tiny_en_uncased",
    load_weights=True)  # keras_nlp.models.BertBase(vocabulary_size=VOCAB_SIZE)
model.summary()
if preset in ['bert_base_en_uncased', 'bert_base_en']:
Member: I don't think we should need all these if cases based on preset. Why do we need to add all these?

It's probably easiest to write this for checkpoints from a single source. The official BERT repo has all the checkpoints we need, so you could model this on https://github.com/keras-team/keras-nlp/blob/master/tools/checkpoint_conversion/bert_tiny_uncased_en.ipynb, for example.
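For context, a sketch of the config-driven construction the linked notebook uses, which avoids per-preset if cases entirely. The bert_config.json keys are the ones shipped with the official BERT checkpoints; the BertBackbone keyword arguments assume the current keras_nlp API:

```python
import json

import keras_nlp


def build_backbone_from_bert_config(config_path):
    # Every official BERT checkpoint ships a bert_config.json, so one code
    # path can construct the backbone for any preset.
    with open(config_path) as f:
        cfg = json.load(f)
    return keras_nlp.models.BertBackbone(
        vocabulary_size=cfg["vocab_size"],
        num_layers=cfg["num_hidden_layers"],
        num_heads=cfg["num_attention_heads"],
        hidden_dim=cfg["hidden_size"],
        intermediate_dim=cfg["intermediate_size"],
        dropout=cfg["hidden_dropout_prob"],
        max_sequence_length=cfg["max_position_embeddings"],
    )
```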


Collaborator: @mattdangerw, I will push some changes later today and will use checkpoints from a single source.

Author: @abheesht17, any update on this?

FLAGS = flags.FLAGS

PRESET_MAP = {
"bert_base_cased": {'base':"roberta.base",
Member: Are these big dicts actually used anywhere? They don't seem to be.

What you probably need is a mapping from our preset names to checkpoint files, and that's it!

Author: They are used in the configuration of the model. Where can I find the list of checkpoint files to map them?
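As a sketch of the mapping suggested above: the checkpoint directory names below are the ones published in the official google-research/bert README, while the particular presets included here are illustrative, not a statement of what this PR must cover:

```python
# Illustrative preset -> official checkpoint directory mapping; the zip for
# each directory is linked from the google-research/bert README.
PRESET_MAP = {
    "bert_tiny_en_uncased": "uncased_L-2_H-128_A-2",
    "bert_base_en_uncased": "uncased_L-12_H-768_A-12",
    "bert_base_en": "cased_L-12_H-768_A-12",
}

checkpoint_dir = PRESET_MAP[FLAGS.preset]  # then download and extract <checkpoint_dir>.zip
```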

@abheesht17 (Collaborator): @vulkomilev, do you mind if I push changes on top of yours? There are some modifications to make here, especially w.r.t. the single-source BERT checkpoints.

@vulkomilev (Author): Yeah, no problem.

@vulkomilev (Author): Hi @abheesht17, I am encountering an error with the conversion. What is the dimension of the last layer for bert_base_uncased? For Hugging Face it is 768.
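A quick way to answer the dimension question is to compare both backbones directly. The sketch below assumes keras_nlp, tensorflow, and transformers are installed and that the relevant KerasNLP preset is bert_base_en_uncased; for BERT-Base the hidden size is 768 in both libraries:

```python
import keras_nlp
import numpy as np
from transformers import TFBertModel

# Load both backbones (names assume the published KerasNLP and HF presets).
keras_nlp_model = keras_nlp.models.BertBackbone.from_preset("bert_base_en_uncased")
hf_model = TFBertModel.from_pretrained("bert-base-uncased")

token_ids = np.ones((1, 8), dtype="int32")
keras_nlp_out = keras_nlp_model(
    {
        "token_ids": token_ids,
        "segment_ids": np.zeros((1, 8), dtype="int32"),
        "padding_mask": np.ones((1, 8), dtype="int32"),
    }
)["sequence_output"]
hf_out = hf_model(token_ids).last_hidden_state

print(keras_nlp_out.shape)  # expected (1, 8, 768): the last dim is the hidden size
print(hf_out.shape)         # expected (1, 8, 768)
```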

@vulkomilev (Author): Bump.

@vulkomilev (Author): I made a new release, but I have a problem with the shapes. Can someone check it?

@vulkomilev (Author): @mattdangerw @abheesht17 bump
