12-05, 16:00–17:30 (UTC), AI/ML Track
This tutorial guides participants through building the essential components of a Machine Learning Platform (MLP) from scratch. We'll focus on implementing five core elements: a feature store, a model registry, an orchestrator, an inference engine, and a basic monitoring system. The session emphasizes practical, hands-on coding using Test-Driven Development (TDD), Domain-Driven Design, and hexagonal architecture principles, providing attendees with a functional foundation for a robust ML infrastructure.
In this intensive 90-minute tutorial, participants will build a streamlined Machine Learning Platform (MLP) focusing on core functionalities. This session is designed for data scientists, machine learning engineers, and software developers who want to gain hands-on experience in constructing the fundamental components of a machine learning infrastructure.
Outline
Introduction and Setup (10 minutes)
- Overview of the ML platform architecture
- Brief review of Domain-Driven Design
Inference Engine (30 minutes)
- Building a straightforward model serving component
- Implementing prediction functionality
Model Registry & Feature Store (15 minutes)
- Creating a simple model versioning system
- Storing and retrieving model metadata
Event-Driven Design & Message Bus (30 minutes)
- Overview of Event-Driven Design
- Developing a basic workflow management system
Model Trainer (Time Permitting)
- Simple model trainer
Challenge Questions (5 minutes)
- Additional considerations for taking the project forward
- Brief discussion on scaling and additional components (monolith, microservice, cloud native)
Resources for further learning
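To make the "Event-Driven Design & Message Bus" item in the outline more concrete, here is a minimal sketch of an in-process message bus. It is an illustration only, not the code from the tutorial repository: the `MessageBus` class, the `ModelRegistered` event, and the handler are all hypothetical names chosen for this example.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class ModelRegistered:
    """Hypothetical domain event: a new model version was stored."""
    name: str
    version: int


class MessageBus:
    """Routes each published event to the handlers subscribed to its type."""

    def __init__(self) -> None:
        self._handlers: dict[type, list[Callable]] = defaultdict(list)

    def subscribe(self, event_type: type, handler: Callable) -> None:
        self._handlers[event_type].append(handler)

    def publish(self, event: object) -> None:
        # Handlers run synchronously, in subscription order.
        for handler in self._handlers[type(event)]:
            handler(event)


bus = MessageBus()
deployed: list[str] = []

# Assumed workflow step: react to a registration by recording a deployment.
bus.subscribe(ModelRegistered, lambda e: deployed.append(f"{e.name}:v{e.version}"))
bus.publish(ModelRegistered(name="churn", version=1))
```

Because publishers only know about event types, a component such as a trainer could announce results without importing the components that react to them.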
Throughout the tutorial, we will borrow principles from Domain-Driven Design to formalize the bounded contexts within an ML platform. We will also emphasize the importance of writing tests first, demonstrating how TDD can lead to more robust and reliable ML infrastructure components.
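As a rough illustration of how test-first development and hexagonal architecture fit together, the sketch below defines a port (an interface the domain depends on), an in-memory adapter, and a pytest-style test. All names here (`ModelRegistry`, `InMemoryModelRegistry`, the test function) are assumptions made for this example and are not taken from the tutorial materials. In a TDD workflow the test would be written first and would fail until the adapter exists.

```python
from typing import Protocol


class ModelRegistry(Protocol):
    """Port: what the domain needs from any registry backend."""

    def register(self, name: str, artifact: bytes) -> int: ...

    def latest_version(self, name: str) -> int: ...


class InMemoryModelRegistry:
    """Adapter: a dict-backed implementation, handy for fast tests."""

    def __init__(self) -> None:
        self._versions: dict[str, list[bytes]] = {}

    def register(self, name: str, artifact: bytes) -> int:
        versions = self._versions.setdefault(name, [])
        versions.append(artifact)
        return len(versions)  # version numbers start at 1

    def latest_version(self, name: str) -> int:
        if name not in self._versions:
            raise KeyError(f"no model named {name!r}")
        return len(self._versions[name])


def test_register_increments_version() -> None:
    registry: ModelRegistry = InMemoryModelRegistry()
    assert registry.register("churn", b"weights-a") == 1
    assert registry.register("churn", b"weights-b") == 2
    assert registry.latest_version("churn") == 2
```

Swapping the in-memory adapter for a database- or object-store-backed one would not change the test, since it only exercises the port.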
Participants will follow along, building and testing each component in real-time. By the end of the session, attendees will have a functional, well-tested, albeit basic, machine learning platform that they can further expand and customize.
Requirements:
- Intermediate understanding of Python programming
- Basic familiarity with machine learning concepts
- Basic knowledge of unit testing (beneficial but not required; familiarity with pytest is also helpful)
- Laptop with Python 3.12+ installed
Materials:
All necessary code, tests, and documentation will be provided through a GitHub repository. The link to the repository will be shared with participants before the session. Attendees are encouraged to clone the repository and install the required dependencies before the tutorial begins.
By the end of this tutorial, participants will have gained practical insights into building the core components of a machine learning platform using TDD. They'll understand how these essential elements interact to form a basic ML infrastructure, providing a solid foundation for further exploration and implementation in their own projects.
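One way to picture how the components interact is an inference engine that asks a registry for a model the first time it is needed and then serves predictions from a cache. The sketch below is a simplified assumption, not the tutorial's implementation: `InferenceEngine`, the `load_model` callable, and the toy "mean" model are hypothetical, and a plain dict stands in for the model registry.

```python
from typing import Callable

# A "model" here is any callable from a feature vector to a score.
Model = Callable[[list[float]], float]


class InferenceEngine:
    """Loads models lazily through an injected loader and caches them."""

    def __init__(self, load_model: Callable[[str], Model]) -> None:
        self._load_model = load_model
        self._cache: dict[str, Model] = {}

    def predict(self, model_name: str, features: list[float]) -> float:
        if model_name not in self._cache:
            self._cache[model_name] = self._load_model(model_name)
        return self._cache[model_name](features)


# Stand-in registry: maps a model name to a toy model (assumption).
registry: dict[str, Model] = {"mean": lambda xs: sum(xs) / len(xs)}

engine = InferenceEngine(load_model=registry.__getitem__)
prediction = engine.predict("mean", [1.0, 2.0, 3.0])
```

Injecting the loader keeps the engine unaware of where models are stored, which is the same port-and-adapter idea applied to serving.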
Nathan Colbert is an ML professional with 5 years of experience building, deploying, and owning end-to-end ML systems. Nathan works at Peacock as a Senior Manager of ML Architecture, where he focuses on accelerating ML delivery across the organization.