PyData Global 2024

Time Series Analysis with StatsModels
12-05, 13:00–14:30 (UTC), Data/ Data Science Track

Time series analysis provides essential tools for modeling and predicting time-dependent data, especially data exhibiting seasonal patterns or serial correlation. This tutorial covers tools in the StatsModels library, including seasonal decomposition and ARIMA. We'll develop the ARIMA model from the bottom up, implementing it one piece at a time before switching to the StatsModels implementation. As examples, we'll look at weather data and electricity generation from renewable sources in the United States since 2004 -- but the methods we'll cover apply to many kinds of real-world time series data.
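
To give a flavor of the bottom-up approach, the autoregressive part of ARIMA can be sketched as an ordinary least-squares regression of a series on its own lagged values. This is a minimal illustration on simulated data, not code from the tutorial notebooks; the helper name `fit_ar` and the simulated coefficients (0.6 and -0.2) are assumptions chosen for the example.

```python
import numpy as np


def fit_ar(series, p):
    """Estimate AR(p) coefficients by least squares (hypothetical helper)."""
    y = series[p:]
    # Column k holds the series lagged by k + 1 steps, aligned with y.
    lags = np.column_stack(
        [series[p - k - 1 : len(series) - k - 1] for k in range(p)]
    )
    # Prepend a column of ones so the model includes an intercept.
    X = np.column_stack([np.ones(len(y)), lags])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef  # [intercept, phi_1, ..., phi_p]


# Simulate an AR(2) process with known coefficients (assumed values).
rng = np.random.default_rng(42)
n = 5000
x = np.zeros(n)
for t in range(2, n):
    x[t] = 0.6 * x[t - 1] - 0.2 * x[t - 2] + rng.normal()

coef = fit_ar(x, p=2)
print(coef)
```

With a series this long, the estimated lag coefficients should land close to the values used in the simulation, which is a quick way to check that the lag alignment is right.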


Outline

  • Introduction to time series
  • Overview of the data
  • Seasonal decomposition, additive model
  • Seasonal decomposition, multiplicative model
  • Serial correlation and autoregression
  • ARIMA
  • Seasonal ARIMA

Prerequisites

I assume that you are familiar with Python at an intermediate level. I use NumPy, SciPy, and Pandas, but I explain what you need to know as we go. You don't need to know anything about time series analysis.

I'll provide Jupyter notebooks that run on Colab, so you don't have to install anything or prepare ahead of time, but you should be familiar with Jupyter notebooks.


Prior Knowledge Expected

Previous knowledge expected

Allen Downey is a professor emeritus at Olin College and Principal Data Scientist at PMC Labs. He is the author of several books -- including Think Python, Think Stats, and Probably Overthinking It -- and a blog about programming and data science. He is a consultant and instructor specializing in Bayesian statistics. He received a Ph.D. in computer science from the University of California, Berkeley, and bachelor's and master's degrees from MIT.