This is a template repository for Python-based Machine Learning projects.

- Integrated CI pipeline for Python 3.10 and Poetry-managed projects (uncomment `make test` when you're ready to have tests in dev or your pipeline)
- `Makefile`
- `Dockerfile`
- `src` package
- `tests` package
- `main.py`
- `pyproject.toml`
- `setup.sh`
- `user-story.md` Issue Template (for story-driven development)

├── .github
│   ├── ISSUE_TEMPLATE
│   │   └── user-story.md
│   └── workflows
│       └── cicd.yaml
├── src
│   └── __init__.py
├── tests
│   └── __init__.py
├── .gitignore
├── Dockerfile
├── Makefile
├── README.md
├── main.py
├── pyproject.toml
└── setup.sh

This repository is a GitHub Template that you can use to create a new repository for Python-based machine learning projects. It comes pre-configured to use Python 3.10 with Poetry 1.5.1 as a package manager.
To get started you can:

- Click the `Use this template` button at the top right of the page. This will let you create a new repository set up with all of the resources in this template.
- You can also directly clone this repository:

git clone https://github.com/christopherkeim/python-template.git
Note: for the moment I've targeted Ubuntu 20.04/22.04 development environments for automated setup.
- Once you have a local copy of this repository in your development environment, navigate into its directory and run the `setup.sh` script:

cd python-template
bash setup.sh

This will install Poetry 1.5.1 and Python 3.10 into your development environment.

- You can configure any dependencies you'd like using the `pyproject.toml` file:

[tool.poetry.dependencies]
python = ">=3.10, <3.11"
# DevOps
black = "^22.3.0"
click = "^8.1.3"
pytest = "^7.4.0"
pytest-cov = "^4.1.0"
ruff = "^0.0.285"
# Web
requests = "^2.31.0"
fastapi = "^0.103.1"
pydantic = "^2.3.0"
uvicorn = "^0.23.2"
# Data Science
jupyter = "^1.0.0"
pandas = "^1.5.0"
numpy = "^1.23.3"
scikit-learn = "^1.1.2"
matplotlib = "^3.6.0"
seaborn = "^0.12.0"
# MLOps
wandb = "^0.15.10"

- Once you're happy with your defined dependencies, you can run `make install` (or `poetry install` directly) to install the Python dependencies for your project into a virtual environment (pre-configured to be placed in your project's directory):

make install

- This will create a `poetry.lock` file defining exactly what dependencies you're using in development and testing. It's recommended that you check this file into version control so others can recreate the same environment on their machines 💻 and in production 🚀.
- You're all set to start developing 🐍 🚀 ✨.
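
If you'd like to double-check the result of the steps above, here is a minimal sketch of a sanity check. It assumes the virtual environment lives in a `.venv` folder inside the project (the name Poetry uses for in-project environments) and that Python 3.10 is the target; the function name `check_environment` and the script itself are illustrative and not part of this template.

```python
import sys
from pathlib import Path

# Assumptions: the template targets Python 3.10, and Poetry places the
# virtual environment in-project, in a directory named ".venv".
EXPECTED_PYTHON = (3, 10)
VENV_DIR_NAME = ".venv"


def check_environment(version_info, prefix, project_dir):
    """Return a list of human-readable problems (empty if none were found)."""
    problems = []

    # Compare only the major and minor version numbers.
    if tuple(version_info[:2]) != EXPECTED_PYTHON:
        found = ".".join(str(part) for part in version_info[:2])
        problems.append(f"expected Python 3.10, found {found}")

    # Inside a virtual environment, sys.prefix points at the environment folder.
    expected_prefix = Path(project_dir).resolve() / VENV_DIR_NAME
    if Path(prefix).resolve() != expected_prefix:
        problems.append(f"interpreter is not running from {expected_prefix}")

    return problems


if __name__ == "__main__":
    issues = check_environment(sys.version_info, sys.prefix, Path.cwd())
    for issue in issues:
        print(f"warning: {issue}")
    if not issues:
        print("environment looks good")
```

Saved under a name of your choosing (say, a hypothetical `check_env.py` in the project root), it could be run with `poetry run python check_env.py` so that it executes inside the project's virtual environment.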

You'll want to edit the `README.md` and replace the CI badge with one that points at your own repository's GitHub Actions CI workflow.

You can also add a deploy target by editing your `Makefile` or the `cicd.yaml` GitHub Actions workflow file.
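
For the CI badge mentioned above, here is a small sketch that builds the badge Markdown for your own repository. It relies on GitHub's standard workflow badge URL (`.../actions/workflows/<file>/badge.svg`) and assumes the workflow file keeps the name `cicd.yaml`; the helper `ci_badge_markdown` and the owner and repository names are placeholders.

```python
def ci_badge_markdown(owner: str, repo: str, workflow_file: str = "cicd.yaml") -> str:
    """Build the Markdown for a GitHub Actions status badge that links to the workflow."""
    workflow_url = f"https://github.com/{owner}/{repo}/actions/workflows/{workflow_file}"
    return f"[![CI]({workflow_url}/badge.svg)]({workflow_url})"


# "your-user" and "your-repo" are placeholders; substitute your own values.
print(ci_badge_markdown("your-user", "your-repo"))
```

Paste the printed line at the top of your `README.md` in place of the existing badge.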

As I'm learning more about DevOps and the joys of dependency management in Python projects, I've noticed that Software Engineering and MLOps-minded folks tend to like Poetry. There are a few reasons I think Poetry is a solid choice for setting your code up to survive across different environments at the level of dependency management:

- It allows you to express which primary dependencies you believe your application will work with using the `pyproject.toml` file, and allows for upgrade paths down the road.
- Unlike a plain `pip` install, Poetry's `poetry.lock` file lets you define exactly what dependencies you're using in development and testing. This means your Python dependency structure can be exactly replicated on other machines, every time.
- Poetry has very convenient virtual environment management (which we've configured here to be placed within your project directory).
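
To make the first two points concrete, here is a simplified sketch of how a caret constraint such as `^1.5.0` in `pyproject.toml` describes a range of acceptable versions, while the lock file pins one exact version inside that range. This is an illustration and not Poetry's own implementation: it only handles plain numeric versions, and the helper names `caret_bounds` and `satisfies` are made up for this example.

```python
def caret_bounds(constraint: str):
    """Return (lower, upper) version tuples for a caret constraint like "^1.5.0".

    The lower bound is inclusive and the upper bound is exclusive.
    Simplified: only plain numeric versions (no pre-release tags) are handled.
    """
    parts = [int(p) for p in constraint.lstrip("^").split(".")]

    # A caret allows changes that keep the leftmost non-zero component fixed.
    # If every component is zero, the last one given is used instead.
    index = next((i for i, p in enumerate(parts) if p != 0), len(parts) - 1)
    upper = parts[: index + 1]
    upper[index] += 1

    # Pad both bounds to three components so they compare cleanly.
    lower = tuple(parts + [0] * (3 - len(parts)))
    upper = tuple(upper + [0] * (3 - len(upper)))
    return lower, upper


def satisfies(version: str, constraint: str) -> bool:
    """Check whether an exact version (as pinned in a lock file) fits the constraint."""
    lower, upper = caret_bounds(constraint)
    candidate = tuple(int(p) for p in version.split("."))
    candidate = candidate + (0,) * (3 - len(candidate))
    return lower <= candidate < upper


print(caret_bounds("^1.5.0"))             # ((1, 5, 0), (2, 0, 0))
print(satisfies("1.5.3", "^1.5.0"))       # True
print(satisfies("2.0.0", "^1.5.0"))       # False
print(satisfies("0.0.286", "^0.0.285"))   # False
```

So `pandas = "^1.5.0"` leaves room to upgrade to any 1.x release at or above 1.5.0, whereas `ruff = "^0.0.285"` effectively allows only 0.0.285, because the leftmost non-zero component is the patch number.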