
some issues adding packages/models for the b2ai:voice project #2

Open · satra opened this issue Apr 13, 2024 · 6 comments

Comments

@satra (Contributor) commented Apr 13, 2024

i believe the permissions of /opt/conda are off in the python installation in the core cpu-root package, but i couldn't determine exactly what, as the Dockerfile for the root package is not directly available. (i could see the steps in quay; it likely has to do with some of the chowning, and probably requires chowning /opt/conda to belong to $NB_USER as well.)
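for example, something along these lines in the root Dockerfile might address it (just a sketch; assumes $NB_UID/$NB_GID are set as in the jupyter docker-stacks base images):

# hypothetical fix: hand the conda tree over to the notebook user at build time
USER root
RUN chown -R ${NB_UID}:${NB_GID} /opt/conda
USER ${NB_UID}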

how to replicate:

docker run -it --rm -p 8888:8888 quay.io/ohsu-comp-bio/bridge2ai-jupyter:cpu-root

in a jupyterlab terminal do: pip install opensmile (this will error out):

ERROR: Could not install packages due to an OSError: [Errno 13] Permission denied: '/opt/conda/lib/python3.10/site-packages/tests/test_remix.py'
Consider using the --user option or check the permissions.
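to see the ownership mismatch, one can run in the same terminal:

# site-packages should show a non-jovyan owner if permissions are off
ls -ld /opt/conda /opt/conda/lib/python3.10/site-packages
id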

indeed it seems like a permissions issue, as the following two options both succeed:

  1. as root:
docker run -it --rm -p 8888:8888 --user root --entrypoint bash quay.io/ohsu-comp-bio/bridge2ai-jupyter:cpu-root

and then pip install opensmile

  2. as jovyan but in an alternate conda environment:
mamba create -n py310 python=3.10 pip
mamba init
source ~/.bashrc
mamba activate py310
pip install opensmile

if the issue can be fixed in the core image, that would be great; otherwise i'll send in a PR to create a separate environment. in this environment, pip install b2aiprep ipympl huggingface_hub[cli] would be sufficient to install most packages. in addition, we will want to download some models from huggingface. this assumes the home directory is not remounted when running; otherwise, replace the model dir with a different location.

export MODEL_DIR=${HOME}/models
huggingface-cli download --local-dir ${MODEL_DIR}/speechbrain/spkrec-ecapa-voxceleb speechbrain/spkrec-ecapa-voxceleb --cache-dir ${MODEL_DIR}/cache
huggingface-cli download --local-dir ${MODEL_DIR}/openai/whisper-base openai/whisper-base --cache-dir ${MODEL_DIR}/cache

downloading models can quickly add size to the image, so it may be useful to provide a shared space from which people can use models if downloads are not allowed.
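for instance, a shared read-only mount could serve the models without baking them into the image (a sketch; /shared/models is a hypothetical host path):

# mount a pre-populated model directory read-only and point the huggingface cache at it
docker run -it --rm -p 8888:8888 \
  -v /shared/models:/models:ro \
  -e HF_HOME=/models/cache \
  quay.io/ohsu-comp-bio/bridge2ai-jupyter:cpu-root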

cuda installs

pytorch and tensorflow both come with pip-installable cuda packages, so the cuda install in gpu-slim may not be required unless other non-python packages depend on it.
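to illustrate, the pip wheels pull the cuda runtime in as regular dependencies (versions are illustrative):

# the default pypi index ships pytorch wheels with bundled cuda 12 runtime libraries
pip install torch torchvision torchaudio
# tensorflow can pull its cuda runtime the same way (tensorflow >= 2.14)
pip install 'tensorflow[and-cuda]'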

@quinnwai (Member) commented Apr 15, 2024

Hi Satra, thanks for pointing this out! Just to confirm, you will be using one of these images as a base image and pushing your own to Docker Hub?

If so, let me try and tackle each point...

  1. Pip download as non-root user (conda issue)

To my understanding of our use case, all the package installation should occur during setup while running as root. I briefly tested that you can create a conda environment with opensmile installed by adding it to the Dockerfile for any of the restricted-gpu images in the repo:

FROM quay.io/ohsu-comp-bio/bridge2ai-jupyter:cpu-root

# misc setup specific to image
...

# create environment with tensorflow and pytorch
RUN conda create --name env
SHELL ["conda", "run", "-n", "env", "/bin/bash", "-c"]
RUN pip install opensmile
# RUN pip install tensorflow==2.14.0
# RUN pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118
# make the environment available in interactive shells
RUN conda init bash

# return to non-root user
USER $NB_UID
WORKDIR /home/$NB_USER

Would creating a conda environment within the Docker image work for your requirements? If not, I can dig deeper into the permissions issues. The cpu-root Dockerfile is also up in the repo to give you more context if needed.
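One caveat I haven't fully tested: for the new environment to show up as a selectable kernel in JupyterLab, it may also need to be registered, e.g. (continuing the Dockerfile above; the name and display name are placeholders):

# register the env as a jupyter kernel; SHELL above makes these run inside "env"
RUN pip install ipykernel && \
    python -m ipykernel install --name env --display-name "Python (env)"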

  2. Downloading HuggingFace models

Since these models seem relatively small (<1 GB), feel free to add them to your image as well; once the container is created, the JupyterLab environment for the open house user will be in a secure enclave setting and should not be able to access HuggingFace.

  3. CUDA installs

If that is the case, feel free to just build on top of cpu-root instead of one of the restricted-gpu images! Other than that, just make sure to add the user change to the end of your Dockerfile so that the open house user only has non-root access:

# return to non-root user
USER $NB_UID
WORKDIR /home/$NB_USER

Hope that works for your requirements and feel free to reach out again if I missed anything.

@quinnwai (Member) commented Apr 16, 2024

Hi, following up on how to download opensmile, let me know if this works for you:

conda create --name test-env
conda activate test-env
conda install python=3.10
pip install opensmile

In my initial testing, this pointed python to /opt/conda/envs/test-env instead of /opt/conda/lib/, so I didn't run into the error.
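You can confirm which interpreter is active with:

# expect /opt/conda/envs/test-env/bin/python rather than the base /opt/conda/bin/python
which python
python -m pip --version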

@satra (Contributor, Author) commented Apr 16, 2024

@quinnwai thank you. most of the time we don't care if people overwrite the base conda environment within the instance, hence i wasn't expecting it to be "write" protected.

regarding the container, we don't have to push our own image; we will do whatever is most useful to you. if you are planning to have a single environment, then just the following commands would be sufficient, i believe, for our needs:

RUN pip install --no-cache-dir b2aiprep ipympl huggingface_hub[cli]
RUN MODEL_DIR=${HOME}/models && \
   huggingface-cli download --local-dir ${MODEL_DIR}/speechbrain/spkrec-ecapa-voxceleb speechbrain/spkrec-ecapa-voxceleb --cache-dir ${MODEL_DIR}/cache && \
   huggingface-cli download --local-dir ${MODEL_DIR}/openai/whisper-base openai/whisper-base --cache-dir ${MODEL_DIR}/cache

if the image is run on a node/instance with nvidia drivers installed, cuda should be usable. just a note that by default the above will install cuda 12, so if the node drivers do not support cuda 12, we should force the cuda 11 packages.
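e.g. for pytorch, the cuda 11.8 wheels would come from the index already mentioned above:

# force pytorch wheels built against cuda 11.8 instead of the default cuda 12
RUN pip install --no-cache-dir torch torchvision torchaudio \
    --index-url https://download.pytorch.org/whl/cu118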

@satra (Contributor, Author) commented Apr 16, 2024

also, i'm just seeing your latest message. if you would prefer creating a new environment for the voice project, i would suggest using python 3.11 (it's significantly better than 3.10 on a bunch of multiprocessing-related issues).
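e.g., mirroring the earlier workaround but on 3.11 (the env name is just a placeholder):

# dedicated python 3.11 environment for the voice project
mamba create -n voice python=3.11 pip
mamba activate voice
pip install b2aiprep ipympl huggingface_hub[cli]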

@quinnwai (Member) commented Apr 16, 2024

@satra Thanks for the response! That is true, it makes sense that we shouldn't need permission restrictions on the user in their own personal VM.

If you need no environments other than the base environment, then I will add those commands and push that image up to Quay so it's accessible to you in the Bridge2AI platform! Will keep you posted.

quinnwai mentioned this issue Apr 16, 2024
@quinnwai (Member) commented Apr 16, 2024

Hi @satra, just created an image and tested that pytorch can locate the GPU on an AWS node! It's on the PR above; let me know if this works for your requirements and we'll go about including it in the actual site. I ended up creating a conda environment instead of installing to base as a quick workaround.
