PyData Global 2024

Boosting AI Reliability: Uncertainty Quantification with MAPIE
12-04, 10:00–11:30 (UTC), AI/ML Track

MAPIE (Model Agnostic Prediction Interval Estimator) is an open-source Python library for quantifying uncertainty and controlling risk in machine learning models. Part of scikit-learn-contrib, it computes prediction sets with controlled coverage rates for regression and classification tasks.

MAPIE can also handle more complex tasks such as time series analysis, multi-label classification, computer vision and natural language processing, with probabilistic guarantees on key metrics.

Join us for an introduction to conformal prediction and to quickly managing uncertainty with MAPIE.


This talk introduces MAPIE, an open-source Python library designed to quantify uncertainties and control risks in machine learning models. Learn how to compute conformal prediction intervals and control risks in various tasks such as regression and classification, and even complex tasks like time series regression, multi-label classification, computer vision and natural language processing.

We will begin by discussing the importance of uncertainty quantification and risk control in machine learning models. Then, we will dive into the key features of MAPIE, including:

  1. Computing conformal prediction intervals for regression and classification tasks with guaranteed marginal coverage rates.
  2. Controlling risks for complex tasks such as time series regression, multi-label classification, computer vision or natural language processing.
  3. Wrapping any machine learning model (scikit-learn, TensorFlow, PyTorch, etc.) with a scikit-learn-compatible wrapper for uncertainty quantification and risk control.
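
As a rough illustration of the first item, the sketch below implements split conformal prediction for regression from scratch, using only the Python standard library. It is not MAPIE's API: the helper names (`fit_line`, `conformal_quantile`, `sample`), the synthetic data and the choice of absolute residuals as the conformity score are assumptions made for this example. A least-squares line stands in for any fitted model, which is the sense in which the approach is model-agnostic. The marginal coverage guarantee assumes that calibration and test points are exchangeable.

```python
import math
import random


def fit_line(xs, ys):
    """Ordinary least squares for y = a * x + b (stand-in for any model)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = sxy / sxx
    return a, mean_y - a * mean_x


def conformal_quantile(scores, alpha):
    """Finite-sample-corrected (1 - alpha) quantile of calibration scores."""
    n = len(scores)
    rank = math.ceil((n + 1) * (1 - alpha))  # 1-based rank in sorted scores
    if rank > n:
        return math.inf  # too few calibration points for this alpha
    return sorted(scores)[rank - 1]


def sample(n):
    """Hypothetical synthetic data: y = 2x + 1 plus unit Gaussian noise."""
    xs = [random.uniform(0, 10) for _ in range(n)]
    ys = [2.0 * x + 1.0 + random.gauss(0, 1) for x in xs]
    return xs, ys


random.seed(0)
x_train, y_train = sample(200)   # used only to fit the model
x_calib, y_calib = sample(500)   # held out to calibrate interval width
x_test, y_test = sample(2000)    # used to check empirical coverage

alpha = 0.1  # target miscoverage, i.e. 90% target coverage
a, b = fit_line(x_train, y_train)

# Conformity score: absolute residual on the calibration set.
scores = [abs(y - (a * x + b)) for x, y in zip(x_calib, y_calib)]
q_hat = conformal_quantile(scores, alpha)

# Prediction interval: point prediction plus or minus the calibrated quantile.
intervals = [(a * x + b - q_hat, a * x + b + q_hat) for x in x_test]
coverage = sum(lo <= y <= hi for (lo, hi), y in zip(intervals, y_test)) / len(y_test)
print(f"half-width: {q_hat:.2f}, empirical coverage: {coverage:.3f}")
```

With `alpha = 0.1`, the empirical coverage on the test draw should land close to 90%. Only the calibration step depends on the conformity scores, so the fitted line could be swapped for any other predictor without changing the rest of the procedure.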

Throughout the talk, we will demonstrate MAPIE's capabilities with practical examples and code snippets. Attendees will learn how to apply MAPIE to their own models, ensuring more reliable and robust predictions.

This talk targets data scientists, machine learning engineers, and researchers with a basic understanding of machine learning concepts. Familiarity with scikit-learn and other popular machine learning libraries is helpful but not required.

By the end of the talk, attendees will have gained valuable insights into uncertainty quantification and risk control, as well as hands-on experience using MAPIE to bring uncertainty quantification to their machine learning models.


Prior Knowledge Expected

Previous knowledge expected

Thibault Cordier is a Data and Research Scientist at Capgemini Invent, where he is a member of the Lab Invent team in France and serves as the technical leader of the MAPIE project.

Prior to joining the research team at Capgemini Invent, he earned his PhD in Computer Science in 2023 at Avignon University.

Up to now, his research has focused on distribution-free inference and conformal prediction, with applications in computer vision, natural language processing, and time series analysis.

Senior Data Scientist @ Capgemini Invent

Leading the team behind MAPIE, an open-source library within the scikit-learn-contrib ecosystem, focused on conformal prediction.

After earning an MSc in Computer Science from École Centrale, I spent a few years in product management before returning to more technical roles.

Let’s connect!

Hussein Jawad is a Senior Data Scientist specializing in NLP, holding degrees from École Polytechnique and Télécom Paris. Based in Paris, he possesses a foundation in programming, statistical modeling, and MLOps.

Currently, he works on the development team of MAPIE while delivering innovative solutions at Capgemini Invent. With publications on LLM security and achievements in global competitions, he combines technical expertise with cross-functional collaboration.