Our MLOps Zoomcamp course
- Sign up here: https://airtable.com/shrCb8y6eTbPKwSTL (it's not automated, so you will not receive an email immediately after filling in the form)
- Register in DataTalks.Club's Slack
- Join the #course-mlops-zoomcamp channel
- Tweet about the course!
Teach practical aspects of productionizing ML services — from collecting requirements to model deployment and monitoring.
Data scientists and ML engineers, as well as software engineers and data engineers interested in putting ML in production.
- Python
- Docker
- Being comfortable with the command line
- Prior exposure to machine learning (at work or from other courses, e.g. from ML Zoomcamp)
- Prior programming experience (1+ years of professional experience)
Course start: May 16
There are five modules in the course and one project at the end. Each module is 1-2 lessons and homework. One lesson is 60-90 minutes long.
This is a draft and will change.
- What is MLOps
- MLOps maturity model
- Running example: NY Taxi trips dataset
- Why do we need MLOps
- Course overview
- Environment preparation
- Homework
Instructors: Alexey Grigorev
- CRISP-DM, CRISP-ML
- ML Canvas
- Data Landscape canvas
- (optional) MLOps Stack Canvas
- Documentation practices in ML projects (Model Cards Toolkit)
Instructors: Larysa Visengeriyeva
- Experiment tracking intro
- Getting started with MLflow
- Experiment tracking with MLflow
- Saving and loading models with MLflow
- Model registry
- MLflow in practice
- Homework
Instructors: Cristian Martinez
- ML Pipelines: introduction
- Kubeflow Pipelines
- Turning a notebook into a pipeline
Instructors: Theofilos Papapanagiotou
- Batch vs online
- For online: web services vs streaming
- Serving models in batch mode
- Web services
- Streaming (Kinesis/SQS + AWS Lambda)
- Homework
Instructors: Alexey Grigorev
- ML monitoring vs software monitoring
- Data quality monitoring
- Data drift / concept drift
- Batch vs real-time monitoring
- Tools: Evidently, Prometheus and Grafana
- Homework
Instructors: Emeli Dral
- DevOps
- Virtual environments and Docker
- Python: logging, linting
- Testing: unit, integration, regression
- CI/CD (GitHub Actions)
- Infrastructure as code (Terraform, CloudFormation)
- Cookiecutter
- Makefiles
- Homework
Instructors: Alexey Grigorev, Sejal Vaidya
- End-to-end project with all the things above
To make it easier to connect different modules together, we’d like to use the same running example throughout the course.
Possible candidates:
- https://www1.nyc.gov/site/tlc/about/tlc-trip-record-data.page - predict the ride duration or whether the driver will be tipped
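
To make the two candidate targets concrete, here is a minimal sketch of how they could be derived from a single trip record. The function name, the argument names (`pickup`, `dropoff`, `tip_amount`), and the timestamp format are assumptions for illustration only; the actual dataset columns and the preprocessing used in the course may differ.

```python
from datetime import datetime


def ride_targets(pickup: str, dropoff: str, tip_amount: float) -> dict:
    """Derive both candidate prediction targets from one trip record.

    The argument names and the timestamp format are hypothetical and
    would need to be adapted to the real trip record schema.
    """
    fmt = "%Y-%m-%d %H:%M:%S"
    duration = datetime.strptime(dropoff, fmt) - datetime.strptime(pickup, fmt)
    return {
        # Regression target: ride duration in minutes
        "duration_minutes": duration.total_seconds() / 60,
        # Classification target: was the driver tipped at all?
        "tipped": tip_amount > 0,
    }


# Made-up example record, not real data
trip = ride_targets("2021-01-01 00:15:00", "2021-01-01 00:27:30", 2.5)
print(trip)
```

The first target would lead to a regression problem and the second to a binary classification problem, so the choice affects which metrics the later modules would track.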
- Larysa Visengeriyeva
- Cristian Martinez
- Theofilos Papapanagiotou
- Alexey Grigorev
- Emeli Dral
- Sejal Vaidya
- Machine Learning Zoomcamp - free 4-month course about ML Engineering
- Data Engineering Zoomcamp - free 9-week course about Data Engineering
I want to start preparing for the course. What can I do?
If you haven't used Flask or Docker
- Check Module 5 from ML Zoomcamp
- The section about Docker from Data Engineering Zoomcamp could also be useful
If you have no previous experience with ML