Finetuning updates to the QnA RAG Demo #154

Open: tybritten wants to merge 23 commits into base: develop

tybritten
Collaborator

@tybritten tybritten commented Mar 8, 2024

Provide a clear and concise description of the content changes you're proposing. List all the changes you are making
to the content.

  • Updated QnA RAG Demo Section with the finetuning code for the embedding model as well as a notebook to run through.

Checklist:

  • I have checked that my enhancements are not duplicates of existing content changes or additions.
  • I have tested the changes in a working environment to ensure they function as intended.
  • I have followed the style guide outlined in the contribution guidelines.

Reviewer's Tasks (for maintainers reviewing this PR):

  • Verify that the tutorial functions correctly in a live environment.
  • Verify that the updated content aligns with the style guide in the contribution guidelines.
  • Check for consistency, grammar, and clarity throughout the updated content.
  • Check that the related GitHub issue is up-to-date.

Dimitris Poulopoulos and others added 23 commits February 6, 2024 17:02
Point to the Superset tutorial in the README file.

Signed-off-by: Dimitris Poulopoulos <[email protected]>
Make minor fixes in the README files of the Bike Sharing and Superset
demos and tutorials:

* Bike Sharing: Fix the path pointing to the location of the demo.
* Superset: Fix the last instruction in the procedure section.

Signed-off-by: Dimitris Poulopoulos <[email protected]>
Fix various grammar and syntax errors in the README file.

Signed-off-by: Dimitris Poulopoulos <[email protected]>
Fix grammar and syntax errors in the first Notebook of the Superset
tutorial.

Signed-off-by: Dimitris Poulopoulos <[email protected]>
Fix grammar and syntax errors in the second Notebook.

Signed-off-by: Dimitris Poulopoulos <[email protected]>
Fix grammar and syntax errors in the third Notebook of the Superset
tutorial.

Signed-off-by: Dimitris Poulopoulos <[email protected]>
Signed-off-by: Dimitris Poulopoulos <[email protected]>
Remove the instructions that create a separate conda Python environment
for this tutorial. The Wind Turbine tutorial runs as is in the default
data-science Jupyter image.

Signed-off-by: Dimitris Poulopoulos <[email protected]>
Add a troubleshooting guide that adds instructions on how to use a proxy
within a terminal environment, a Notebook, and KServe. This is needed if
the cluster is deployed behind a proxy.

Signed-off-by: Dimitris Poulopoulos <[email protected]>
Add a troubleshooting section in the README file to account for
the case where EzUA is deployed behind a proxy.

Signed-off-by: Dimitris Poulopoulos <[email protected]>
Document the branching model of the repository on the main README
file.

Signed-off-by: Dimitris Poulopoulos <[email protected]>
Fix the section that defines the environment variables of the
vector store ISVC in the second Notebook.

Closes #123

Signed-off-by: Dimitris Poulopoulos <[email protected]>
Remove the instruction that mentions the `wind-turbine` conda
kernel from the README file, as it is no longer valid.

Instead, use the default Python 3 environment.

Signed-off-by: Dimitris Poulopoulos <[email protected]>
Change the application image version to `0.2.1` and increase the
memory resource to 1 Gi.

Signed-off-by: Dimitris Poulopoulos <[email protected]>
Connect the repository with the internal CI/CD pipeline to automate the
process of building and pushing the docker images to the public registry.

Signed-off-by: Dimitris Poulopoulos <[email protected]>
Add two base images for building a Jupyter Server Docker image:

* A base image that installs any basic utilities and packages.
* A Jupyter image that provides a basic JupyterLab installation.

Signed-off-by: Dimitris Poulopoulos <[email protected]>
Extend the Makefile to build and push the base Jupyter images for the
demos and tutorials.

Signed-off-by: Dimitris Poulopoulos <[email protected]>
Move the Question-Answering demo to a new `rag-demos` directory. This
change marks the beginning of a new demo section based on LLMs and
vector databases.

Signed-off-by: Dimitris Poulopoulos <[email protected]>
Introduce a new GPU variant of the Question-Answering demo, mirroring
its CPU counterpart but leveraging GPUs for both the Embeddings model
and the LLM model Inference Services (ISVCs).

Using KServe with the Triton Inference Server backend significantly
boosts performance. Triton supports different backends itself, from
simple Python scripts to TensorRT engines.

The LLM ISVC runs the Llama 2 7B variant on the TensorRT-LLM backend.
The Embeddings model ISVC runs BGE-M3, fine-tuned with data from the
EzUA, EzDF, MLDE, and MLDM docs, ensuring optimized response accuracy
and speed.

Signed-off-by: Dimitris Poulopoulos <[email protected]>
Extend the Makefile to build and push the Docker images for the
Question-Answering GPU demo. The images for this demo are:

* An image for the Vector Store component.
* An image for the LLM transformer component.
* An image for the front-end (i.e., Gradio).
* An image for the customized Triton Inference Server.

Signed-off-by: Dimitris Poulopoulos <[email protected]>
@tybritten
Collaborator Author

@AlexanderOllman any changes to the notebook for the finetuning?

@dpoulopoulos
Collaborator

dpoulopoulos commented Mar 11, 2024

Hey @tybritten,

Thank you for submitting this PR; it's concise and to the point—well done. Here's my feedback. Please make the necessary amendments, so we can proceed with merging:

  • Ensure each commit is a release candidate, meaning the demo should run as expected when any commit is checked out. In your initial commit, you included a few "experiment files" which appear to be non-functional, as they are corrected in your subsequent commit. I recommend merging these two commits into one for coherence and to ensure each commit presents a functional state of the project.

  • Consider the relevance of committing the Notebook separately. Since the Notebook is essential for running the experiment, it might be more practical to include it within the same commit as the related code. If you prefer organizing your contributions into multiple commits, ensure they are logically divided, such as keeping the Dockerfile separate. This approach facilitates targeted amendments, for example, exclusively updating the Dockerfile if necessary and referencing this commit.

  • Before submitting, clear all outputs from the Notebook in accordance with our contribution guidelines, available at: https://github.com/HPEEzmeral/ezua-tutorials/blob/develop/CONTRIBUTING.md. This step is crucial because Notebook outputs can greatly inflate the size of an .ipynb file.

  • Aim for more informative commit messages. Your initial commit introduces six files and a few hundred lines of code without a clear description of their purpose. As it may become challenging to recall the specifics of these files in the future, it is important to provide a summary of the commit's content using an imperative tone. For instance, a descriptive commit message could start with:

    demos/rag: Introduce embeddings fine-tuning
    
    Include the Embeddings fine-tuning experiment files ...
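
For reference, the output-clearing step mentioned above can be sketched in a few lines of Python. This is only a minimal sketch, assuming the standard nbformat v4 JSON layout (a top-level `cells` list whose code cells carry `outputs` and `execution_count`). The function names `clear_notebook_outputs` and `clear_notebook_file` are hypothetical and are not part of this repository or of the contribution guidelines.

```python
import json
from pathlib import Path


def clear_notebook_outputs(notebook: dict) -> dict:
    """Strip outputs and execution counts from every code cell.

    Assumes the nbformat v4 layout; the dict is modified in place
    and also returned for convenience.
    """
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") == "code":
            cell["outputs"] = []
            cell["execution_count"] = None
    return notebook


def clear_notebook_file(path: str) -> None:
    """Rewrite the .ipynb file at `path` with all code-cell outputs removed."""
    nb_path = Path(path)
    notebook = json.loads(nb_path.read_text(encoding="utf-8"))
    clear_notebook_outputs(notebook)
    nb_path.write_text(
        json.dumps(notebook, indent=1, ensure_ascii=False) + "\n",
        encoding="utf-8",
    )
```

Calling `clear_notebook_file("my-notebook.ipynb")` (a placeholder path) before committing should leave only the cell sources and metadata in the file. Where nbconvert is installed, `jupyter nbconvert --clear-output --inplace` is an alternative that does the same job from the terminal.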
    

@tybritten
Collaborator Author

I'm not sure I'm following on the commit content/messages approach. Normally the PR is squashed and merged resulting in all the changes in a single commit.

@dpoulopoulos
Collaborator

A PR may consist of varying numbers of commits, depending on the scope of the feature it addresses. For minor bug fixes, a single commit might suffice. However, the introduction of a significant new feature could involve dozens of commits.

That said, the number of commits is orthogonal to the commit message. Regardless of whether your PR comprises a single commit or multiple, the commit message(s) should always be descriptive, clearly outlining the changes made. Thus, it's entirely feasible to consolidate your changes into a single, well-documented commit this time.

My suggestion is to squash everything into a single commit, document in the commit message what this demo is about, and force push it. Then, we can merge it.

2 participants