Add Upload Guide #1847

Merged
merged 23 commits into from
May 2, 2024

SamanehSaadat
Member

This PR adds a guide to show how to upload to Kaggle and Hugging Face.

Member

@mattdangerw mattdangerw left a comment

Thanks! Left some initial comments.

Member Author

@SamanehSaadat SamanehSaadat left a comment

Thanks for the review, Matt!

@SamanehSaadat SamanehSaadat marked this pull request as ready for review April 30, 2024 20:32
Member

@mattdangerw mattdangerw left a comment

Thanks!

"""

"""shell
pip install -q --upgrade keras-nlp
Member

we might want to add huggingface-hub here

Member Author

Right! Done!

Member

nit, could do this on one line for brevity

Member Author

Done!

# Load a user uploaded Classifier from Kaggle Models.
classifier = keras_nlp.models.Classifier.from_preset(
    f"kaggle://{kaggle_username}/bert/keras/finetuned_bert"
Member

will this work when running a colab? don't we need a delay before load?

Member Author

Right! I think I should have created the guide starting from a notebook rather than the .py file, to catch these kinds of bugs.

Member Author

I can add a comment asking the user to make sure the model is uploaded before attempting to load the model.
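
As an alternative to a comment alone, the load could be retried a few times until the upload is visible. The following is only a sketch: `load_with_retry`, its parameters, and the default delay are hypothetical and not part of KerasNLP, and the right number of attempts for Kaggle's actual propagation delay is an assumption left to the reader.

```python
import time


def load_with_retry(loader, attempts=5, delay_seconds=10.0, sleep=time.sleep):
    """Call `loader()` until it succeeds or `attempts` is exhausted.

    Hypothetical helper (not a KerasNLP API). `loader` is any zero-argument
    callable, for example a lambda that wraps a `from_preset(...)` call.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for attempt in range(attempts):
        try:
            return loader()
        except Exception as error:  # Broad on purpose: the hub's error type is not assumed.
            last_error = error
            # Wait before the next try, but not after the final failure.
            if attempt < attempts - 1:
                sleep(delay_seconds)
    raise last_error
```

In the guide this might be used as `load_with_retry(lambda: keras_nlp.models.Classifier.from_preset(uri))`, where `uri` is the `kaggle://` URI of the uploaded model.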

Member

@mattdangerw mattdangerw left a comment

LGTM!

causal_lm = keras_nlp.models.CausalLM.from_preset(preset_dir)

"""
You can also load the `Backbone` and `Tokenizer` objects from this preset directory.
Member

`keras_nlp.models.Backbone` and `keras_nlp.models.Tokenizer`, with backticks. This will trigger auto-linking to the docs pages for these classes.

Member Author

Done!

"""

To upload a model we can use `keras_nlp.upload_preset(uri, preset_dir)` API where `uri` has the format of
`kaggle://<KAGGLE_USERNAME>/<MODEL>/<FRAMEWORK>/<VARIATION>` for uploading to Kaggle and `preset_dir` is the directory that the model is saved in.
Member

Note that for Keras models, the <FRAMEWORK> should always be keras.

Member Author

Right! Replaced it with Keras!
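
To make the URI format from this thread concrete, here is a small sketch that assembles a Kaggle URI with the framework segment fixed to `keras`. The helper name `kaggle_preset_uri` and its validation rules are hypothetical and only illustrate the `kaggle://<KAGGLE_USERNAME>/<MODEL>/<FRAMEWORK>/<VARIATION>` layout quoted above; KerasNLP does not provide this function.

```python
def kaggle_preset_uri(username, model, variation, framework="keras"):
    """Build a `kaggle://` URI in the four-segment layout quoted in the guide.

    Hypothetical helper for illustration. `framework` defaults to "keras"
    because, per the review comment, Keras models always use that value.
    """
    parts = [username, model, framework, variation]
    for part in parts:
        # Each segment must be non-empty and must not add extra path levels.
        if not part or "/" in part:
            raise ValueError(f"Invalid URI segment: {part!r}")
    return "kaggle://" + "/".join(parts)


# Mirrors the URI used in the guide's classifier example.
uri = kaggle_preset_uri("my_username", "bert", "finetuned_bert")
```

The resulting string is what would be passed as the first argument to `keras_nlp.upload_preset(uri, preset_dir)`.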

classifier = keras_nlp.models.Classifier.from_preset(
    f"kaggle://{kaggle_username}/bert/keras/bert_tiny_imdb"
)
Member

This should be a follow up PR, and probably is not too urgent, but we might want to add an "advanced" section here on saving a low-level Backbone and Tokenizer. I'm not sure what the best training setup to show there is.

Member Author

Sounds good! We'll add this later!

)

# Upload to Hugging Face.
keras_nlp.upload_preset(f"hf://{hf_username}/gpt2_imdb", preset_dir)
Member Author

@mattdangerw I added back the HF upload because it can create the delay that we need for the model to be uploaded on Kaggle :D


Running the following uploads the model that is saved in `preset_dir` to Kaggle:
"""
kaggle_username = os.getenv("KAGGLE_USERNAME") # TODO: Assign username.
Member Author

I changed it to this again to make the autogen run. Kaggle team will have a new release tomorrow with whoami. I'll update this tomorrow.

Collaborator

@pcoet pcoet left a comment

Good stuff!

# Introduction

Fine-tuning a machine learning model can yield impressive results for specific tasks.
Uploading your fine-tuned model to a model hub allow you to share it with the broader community.
Collaborator

"hub allow" -> "hub allows"

Member Author

Done! Thanks!

causal_lm.save_to_preset(preset_dir)

"""
Let's see what are the files what are the saved files.
Collaborator

Suggestion: "Let's see the saved files."

Member Author

Changed it to your suggestion! Thanks!

"""
### Load a Locally Saved Model

A model that is saved to a local preset, can be loaded using `from_preset`.
Collaborator

"preset, can" -> "preset can"

Member Author

Done!

## Upload the Model to a Model Hub

After saving a preset to a directory, this directory can be uploaded to a model hub such as Kaggle or Hugging Face directly from the KerasNLP library.
To upload the model to Kaggle, the URI should start with `kaggle://` and to upload to Hugging Face, it should start with `hf://`.
Collaborator

Nit: Is the URI format a requirement? If so, say "... the URI must start..." instead of "should".

Member Author

It is! Replaced "should" with "must"!
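
Since the prefix is a hard requirement, a reader could check a URI before uploading. The sketch below is an illustration only: `hub_for_uri` and `HUB_PREFIXES` are hypothetical names, and this is not a description of how `upload_preset` validates its input internally.

```python
# The two prefixes named in the guide text, mapped to the hub they select.
HUB_PREFIXES = {
    "kaggle://": "Kaggle",
    "hf://": "Hugging Face",
}


def hub_for_uri(uri):
    """Return the hub name for `uri`, or raise if the prefix is unsupported.

    Hypothetical helper for illustration; not a KerasNLP API.
    """
    for prefix, hub in HUB_PREFIXES.items():
        if uri.startswith(prefix):
            return hub
    raise ValueError(
        f"URI must start with one of {sorted(HUB_PREFIXES)}; got {uri!r}"
    )
```

For example, `hub_for_uri("hf://my_username/gpt2_imdb")` selects Hugging Face, while a URI with any other prefix raises a `ValueError` before any upload is attempted.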


"""
To upload a model to Kaggle, first, we need to authenticate with Kaggle.
This can by one of the followings:
Collaborator

"by one of the followings:" -> "in one of the following ways:"

Member Author

Done!

2. Provide a local `~/.kaggle/kaggle.json`.
3. Call `kagglehub.login()`.

Let's make sure we are logged in before coninuing.
Collaborator

"coninuing" -> "continuing"

Member Author

Done!


"""
To upload a model to Hugging Face, first, we need to authenticate with Hugging Face.
This can by one of the followings:
Collaborator

See previous suggestion.

1. Set environment variables `HF_USERNAME` and `HF_TOKEN`.
2. Call `huggingface_hub.notebook_login()`.

Let's make sure we are logged in before coninuing.
Collaborator

"coninuing" -> "continuing"

Member Author

Done!

Member Author

@SamanehSaadat SamanehSaadat left a comment

Thanks for the review, David!

@pcoet pcoet merged commit ac32100 into keras-team:master May 2, 2024
3 checks passed
@SamanehSaadat SamanehSaadat deleted the upload-guide branch May 3, 2024 02:50
sitamgithub-MSIT pushed a commit to sitamgithub-MSIT/keras-io that referenced this pull request May 30, 2024
* Upload guide.

* KerasNLP upload guide.

* Address reviews.

* Add classifier example.

* Kaggle Hub --> Kaggle Models.

* Add model loading.

* Replace the toy dataset with IMDB dataset.

* Adress reviews.

* Some final fixes to make autogen run successful.

* Fix classifier name in HF upload.

* Reduce batch size.

* Convert the code for loading to markdown code block.

* Get kaggle username from kagglehub.whoami().

* Run black.

* Add notebook and markdown.

* Add the guide path.

* Address reivews.

* Update notebook and markdown files.

* Remove upload progress bars from the markdown file.

* Remove fine tuning progress bars from the markdown file.
4 participants