
some issues adding packages/models for the b2ai:voice project #2

Open · satra opened this issue Apr 13, 2024 · 6 comments

Comments

@satra (Contributor) commented Apr 13, 2024

i believe the permissions of /opt/conda are off in the python installation in the core cpu-root package, but i couldn't determine exactly what, as the Dockerfile for the root package is not directly available. (i could see the steps in quay; it likely has to do with some of the chowning, and probably requires chowning /opt/conda to belong to $NB_USER as well.)
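for example, something along these lines in the root Dockerfile might address it (just a sketch; assumes $NB_UID/$NB_GID are set as in the jupyter docker-stacks base images):

# hypothetical fix: hand the conda tree over to the notebook user at build time
USER root
RUN chown -R ${NB_UID}:${NB_GID} /opt/conda
USER ${NB_UID}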

how to replicate:

docker run -it --rm -p 8888:8888 quay.io/ohsu-comp-bio/bridge2ai-jupyter:cpu-root

in a jupyterlab terminal do: pip install opensmile (this will error out):

ERROR: Could not install packages due to an OSError: [Errno 13] Permission denied: '/opt/conda/lib/python3.10/site-packages/tests/test_remix.py'
Consider using the --user option or check the permissions.
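to see the ownership mismatch, one can run in the same terminal:

# site-packages should show a non-jovyan owner if permissions are off
ls -ld /opt/conda /opt/conda/lib/python3.10/site-packages
id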

indeed it seems like a permissions issue, as the following two options both succeed:

  1. as root:
docker run -it --rm -p 8888:8888 --user root --entrypoint bash quay.io/ohsu-comp-bio/bridge2ai-jupyter:cpu-root

and then pip install opensmile

  2. as jovyan but in an alternate conda environment:
mamba create -n py310 python=3.10 pip
mamba init
source ~/.bashrc
mamba activate py310
pip install opensmile

if the issue can be fixed in the core image, that would be great; otherwise i'll send in a PR to create a separate environment. in this environment, pip install b2aiprep ipympl huggingface_hub[cli] would be sufficient to install most packages. in addition, we will want to download some models from huggingface. this assumes the home directory is not remounted when running; otherwise, replace the model dir with a different location.

export MODEL_DIR=${HOME}/models
huggingface-cli download --local-dir ${MODEL_DIR}/speechbrain/spkrec-ecapa-voxceleb speechbrain/spkrec-ecapa-voxceleb --cache-dir ${MODEL_DIR}/cache
huggingface-cli download --local-dir ${MODEL_DIR}/openai/whisper-base openai/whisper-base --cache-dir ${MODEL_DIR}/cache

downloading models can quickly add size to the image, so it may be useful to provide a shared space from which people can use models if downloads are not allowed.
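for instance, a shared read-only mount could serve the models without baking them into the image (a sketch; /shared/models is a hypothetical host path):

# mount a pre-populated model directory read-only and point the huggingface cache at it
docker run -it --rm -p 8888:8888 \
  -v /shared/models:/models:ro \
  -e HF_HOME=/models/cache \
  quay.io/ohsu-comp-bio/bridge2ai-jupyter:cpu-root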

cuda installs

pytorch and tensorflow both come with pip-installable cuda packages, so the cuda install in gpu-slim may not be required unless other non-python packages depend on it.
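to illustrate, the pip wheels pull the cuda runtime in as regular dependencies (versions are illustrative):

# the default pypi index ships pytorch wheels with bundled cuda 12 runtime libraries
pip install torch torchvision torchaudio
# tensorflow can pull its cuda runtime the same way (tensorflow >= 2.14)
pip install 'tensorflow[and-cuda]'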

@quinnwai (Member) commented Apr 15, 2024

Hi Satra, thanks for pointing this out! Just to confirm, you will be using one of these images as a base image and pushing your own to Docker Hub?

If so, let me try and tackle each point...

  1. Pip download as non-root user (conda issue)

To my understanding of our use case, all the package installation should occur during setup while running as root. I briefly tested that you can create a conda environment with opensmile installed by adding it to the Dockerfile for any of the restricted-gpu images in the repo:

FROM quay.io/ohsu-comp-bio/bridge2ai-jupyter:cpu-root

# misc setup specific to image
...

# create environment with tensorflow and pytorch
RUN conda create --name env
SHELL ["conda", "run", "-n", "env", "/bin/bash", "-c"]
RUN pip install opensmile
# RUN pip install tensorflow==2.14.0
# RUN pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118
# make the environment available in interactive shells
RUN conda init bash

# return to non-root user
USER $NB_UID
WORKDIR /home/$NB_USER

Would creating a conda environment within the Docker image work for your requirements? If not, I can dig deeper into the permissions issues. The cpu-root Dockerfile is also up in the repo to give you more context if needed.
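One caveat I haven't fully tested: for the new environment to show up as a selectable kernel in JupyterLab, it may also need to be registered, e.g. (continuing the Dockerfile above; the name and display name are placeholders):

# register the env as a jupyter kernel; SHELL above makes these run inside "env"
RUN pip install ipykernel && \
    python -m ipykernel install --name env --display-name "Python (env)"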

  2. Downloading HuggingFace models

Since these models seem relatively small (<1 GB), feel free to add them to your image as well; once the container is created, the JupyterLab environment for the open house user will be in a secure enclave setting and should not be able to access HuggingFace.

  3. CUDA installs

If that is the case, feel free to just build on top of cpu-root instead of one of the restricted-gpu images! Other than that, just make sure to add the user change to the end of your Dockerfile so that the open house user only has non-root access:

# return to non-root user
USER $NB_UID
WORKDIR /home/$NB_USER

Hope that works for your requirements and feel free to reach out again if I missed anything.

@quinnwai (Member) commented Apr 16, 2024

Hi, following up on how to download opensmile, let me know if this works for you:

conda create --name test-env
conda activate test-env
conda install python=3.10
pip install opensmile

In my initial testing, this pointed python to /opt/conda/envs/test-env instead of /opt/conda/lib/, so I didn't run into the error.
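You can confirm which interpreter is active with:

# expect /opt/conda/envs/test-env/bin/python rather than the base /opt/conda/bin/python
which python
python -m pip --version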

@satra (Contributor, Author) commented Apr 16, 2024

@quinnwai thank you. most of the time we don't care if people overwrite the base conda environment within the instance, hence i wasn't expecting it to be "write" protected.

regarding the container, we don't have to push our own image; we will do whatever is most useful to you. if you are planning to have a single environment, then just the following commands would be sufficient, i believe, for our needs:

RUN pip install --no-cache-dir b2aiprep ipympl huggingface_hub[cli]
RUN MODEL_DIR=${HOME}/models && \
   huggingface-cli download --local-dir ${MODEL_DIR}/speechbrain/spkrec-ecapa-voxceleb speechbrain/spkrec-ecapa-voxceleb --cache-dir ${MODEL_DIR}/cache && \
   huggingface-cli download --local-dir ${MODEL_DIR}/openai/whisper-base openai/whisper-base --cache-dir ${MODEL_DIR}/cache

if the image is run on a node/instance with nvidia drivers installed, cuda should be usable. just a note that by default the above will install cuda 12, so if the node drivers do not support cuda 12, we should force the cuda 11 packages.
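e.g. for pytorch, the cuda 11.8 wheels would come from the index already mentioned above:

# force pytorch wheels built against cuda 11.8 instead of the default cuda 12
RUN pip install --no-cache-dir torch torchvision torchaudio \
    --index-url https://download.pytorch.org/whl/cu118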

@satra (Contributor, Author) commented Apr 16, 2024

also, i'm just seeing your latest message. if you would prefer creating a new environment for the voice project, i would suggest using python 3.11 (it's significantly better than 3.10 on a bunch of multiprocessing-related issues).
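e.g., mirroring the earlier workaround but on 3.11 (the env name is just a placeholder):

# dedicated python 3.11 environment for the voice project
mamba create -n voice python=3.11 pip
mamba activate voice
pip install b2aiprep ipympl huggingface_hub[cli]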

@quinnwai (Member) commented Apr 16, 2024

@satra Thanks for the response! That is true, it makes sense that we shouldn't need permission restrictions on the user in their own personal VM.

If you need no environments other than the base environment, then I will add those commands and push that image up to Quay so it's accessible to you in the Bridge2AI platform! Will keep you posted.

quinnwai mentioned this issue Apr 16, 2024
@quinnwai (Member) commented Apr 16, 2024

Hi @satra, just created an image and tested that pytorch can locate the GPU on an AWS node! It's on the PR above; let me know if this works for your requirements and we'll go about including it in the actual site. I ended up creating a conda environment instead of installing to base as a quick workaround.
