PyData Global 2024

Solving Forecasting Problems in R and Python
12-05, 17:00–17:30 (UTC), General Track

This talk explains how to solve business forecasting problems using time series methods. Time series forecasting remains a specialty topic, so you want a package that is tuned to your use case and built to handle the difficulties inherent in forecasting. I will share a simplified problem notation that helps you choose between time series packages in R and Python.


Time series forecasting differs from typical predictive analytics in a number of important details. The specialist literature treats it as a general system identification problem, a framing that obscures important business considerations. In this talk we start with Rob Hyndman's observation that ARIMAX models (the name for a large family of time series solvers that allow external variables) "are a muddle." It has become hard to understand what a time series package even claims to do.

I will show how to translate time series terminology back into business requirements and how to select the right R or Python package for your problem. The talk includes R because each package makes different trade-offs and some R packages offer important capabilities. I will also touch on using Stan for these problems from both Python and R.

Participants will come away knowing how to apply time series methods to their specific problems, along with tools for evaluating and selecting time series packages.


Prior Knowledge Expected

No previous knowledge expected

Win-Vector Principal Consultant and Trainer John Mount has a Ph.D. in computer science from Carnegie Mellon and over 15 years of applied experience in biotech research, online advertising, price optimization and finance. He is one of the authors of the popular book "Practical Data Science with R", Manning, 2020 (now in its second edition).