12-03, 20:00–20:30 (UTC), AI/ML Track
This tutorial introduces Pixeltable, which provides data-centric AI infrastructure with a declarative, incremental approach for multimodal workloads. Participants will learn to manage multimodal data (text, images, video) using Pixeltable's declarative interface. We'll cover data versioning, indexing, and orchestration through computed columns and iterators. Attendees will gain practical experience with Pixeltable's integration capabilities and custom UDFs.
Requirements: Python knowledge, basic ML concepts. Materials will be available via a GitHub repository and Google Colab notebooks.
Pixeltable offers a declarative interface for multimodal data in which data transformations, model inference, and custom logic are expressed as computed columns and updated incrementally. This hands-on session will guide participants through Pixeltable's key features and their application in AI workflows. The tutorial will cover:
- Pixeltable basics
- Data ingestion and preprocessing
- Importing diverse data types
- Using computed columns for transformations
- Data Transformation/Feature Engineering with computed columns and UDFs
- Implementing custom transformations
- Integrating with existing ML libraries (Computer Vision Models and LLMs)
- Working with complex data types and workflows
- Video frame extraction using built-in iterators
- Image processing pipelines
- Vector indexing and similarity search
- Creating and querying vector indexes
- Incremental updates
Q&A
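To make the idea of computed columns with incremental updates concrete ahead of the session, here is a minimal plain-Python sketch. It is a toy model, not Pixeltable's API: the `ToyTable` class, its methods, and the `word_count` column are hypothetical names chosen for illustration. It assumes only that a computed column is a function of a row, that declaring one backfills existing rows, and that inserting data evaluates the function for the new rows only.

```python
from typing import Any, Callable

Row = dict[str, Any]


class ToyTable:
    """Hypothetical stand-in for a table with computed columns (not Pixeltable's API)."""

    def __init__(self) -> None:
        self.rows: list[Row] = []
        self.computed: dict[str, Callable[[Row], Any]] = {}
        self.calls = 0  # how many times any computed-column function has run

    def _evaluate(self, name: str, row: Row) -> None:
        self.calls += 1
        row[name] = self.computed[name](row)

    def add_computed_column(self, name: str, fn: Callable[[Row], Any]) -> None:
        # Declaring a column backfills it for rows that already exist.
        self.computed[name] = fn
        for row in self.rows:
            self._evaluate(name, row)

    def insert(self, new_rows: list[Row]) -> None:
        # Only the newly inserted rows are evaluated; stored rows are not recomputed.
        for values in new_rows:
            row = dict(values)
            for name in self.computed:
                self._evaluate(name, row)
            self.rows.append(row)


table = ToyTable()
table.insert([{"text": "a cat on a mat"}, {"text": "multimodal data"}])

# Declare the transformation once; it is applied to the two existing rows.
table.add_computed_column("word_count", lambda row: len(row["text"].split()))
calls_after_backfill = table.calls

# A later insert triggers exactly one more evaluation, for the new row.
table.insert([{"text": "incremental updates only touch new rows"}])
calls_after_insert = table.calls

word_counts = [row["word_count"] for row in table.rows]
print(calls_after_backfill, calls_after_insert, word_counts)
```

The call counter shows the point of the incremental approach: the transformation is declared once, and later inserts cost one evaluation per new row rather than a full recomputation. In a real workload the function would typically be a model inference call or a custom UDF rather than a word count.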
No previous knowledge expected
Before Pixeltable, Pierre worked at Confluent after his company (Noteable) was acquired. Pierre previously led Amazon’s notebook initiatives (internally and for AWS SageMaker), worked at Amazon Core AI/ML, helped launch Amazon’s online car leasing store in the EU, and worked on diverse ML projects such as Amazon’s Data Quality Framework (Deequ).