PyData Global 2024

Improve LLMs Alignment with Complete and Robust Preference Data
12-04, 11:30–12:00 (UTC), LLM Track

This talk explores how to align large language models (LLMs) with human values via preference learning (PL) when preference datasets are incomplete or corrupted. We propose a novel method for recalibrating preference values to tackle these issues, improving the robustness of existing models and, in turn, the resilience of the aligned LLM. The session highlights real-world experiments showing how the method handles adversarial noise and unobserved comparisons, a key step toward building more reliable, ethically aligned AI systems.


This session tackles the challenge of aligning large language models (LLMs) with human values through preference learning (PL), specifically addressing the issues of incomplete and corrupted data in preference datasets. We propose a novel method for recalibrating values in these datasets, which enhances the robustness of existing models, such as the Bradley-Terry-Luce (BTL) model. This method is crucial for improving the reliability and ethical alignment of LLMs, especially in real-world data settings.
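The recalibration method itself is the talk's contribution; as background, the sketch below shows the standard Bradley-Terry-Luce (BTL) model mentioned above, fit by gradient ascent on the pairwise log-likelihood, and illustrates how adversarially flipped comparison labels distort the fitted scores. This is a minimal illustration of the baseline model, not the speaker's proposed method; the simulation setup (scores, noise rate) is invented for the example.

```python
import numpy as np

def fit_btl(comparisons, n_items, lr=1.0, epochs=1000):
    """Fit BTL utility scores by gradient ascent on the pairwise
    log-likelihood. `comparisons` is a list of (winner, loser) pairs."""
    theta = np.zeros(n_items)
    for _ in range(epochs):
        grad = np.zeros(n_items)
        for w, l in comparisons:
            p_w = 1.0 / (1.0 + np.exp(theta[l] - theta[w]))  # P(w preferred over l)
            grad[w] += 1.0 - p_w
            grad[l] -= 1.0 - p_w
        theta += lr * grad / len(comparisons)
        theta -= theta.mean()  # scores are identified only up to a constant shift
    return theta

rng = np.random.default_rng(0)
true_theta = np.array([2.0, 0.0, -2.0])  # hypothetical ground truth: item 0 is best

# Simulate pairwise preference data under the BTL model.
comparisons = []
for i, j in [(0, 1), (0, 2), (1, 2)]:
    for _ in range(200):
        p_i = 1.0 / (1.0 + np.exp(true_theta[j] - true_theta[i]))
        comparisons.append((i, j) if rng.random() < p_i else (j, i))

clean = fit_btl(comparisons, 3)

# Adversarially flip 10% of the labels and refit: the estimates degrade,
# which is the kind of corruption robust recalibration aims to withstand.
corrupted = [(l, w) if rng.random() < 0.10 else (w, l) for w, l in comparisons]
noisy = fit_btl(corrupted, 3)

print("clean fit:", clean)
print("noisy fit:", noisy)
```

At a 10% flip rate the empirical win probabilities stay on the correct side of 0.5, so the ranking survives even though the score gaps shrink; heavier or more targeted corruption breaks the maximum-likelihood fit, which motivates the robust recalibration discussed in the talk.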

This talk is ideal for AI researchers, engineers, and data scientists who have a basic understanding of machine learning and ranking models and are interested in building more reliable AI systems. Attendees will see practical examples and experiments demonstrating how this method strengthens LLMs and ensures resilience against noisy or manipulated data.

By the end of the session, participants will understand how to handle incomplete preference datasets, enhance model robustness against adversarial inputs, and apply these strategies to develop more ethically aligned AI systems.


Prior Knowledge Expected

Previous knowledge expected

Son The Nguyen is a Ph.D. student in Management Information Systems (MIS) at the University of Illinois Chicago, advised by Professor Theja Tulabandhula. Son specializes in human-AI collaboration, with a focus on enhancing both human and model performance. His research bridges AI theory and practical applications, emphasizing AI safety, alignment, and the optimization of human-AI interaction.