PyData Global 2024

Understanding Polars data types
12-03, 19:30–20:00 (UTC), Data/ Data Science Track

Polars boasts 18 different data types, not including variants of numerical types.

Do we really need such a vast collection of data types?

What is the use case for each type?

What is the difference between List and Array? Or between Categorical and Enum? And why on Earth would I ever need a Struct?

This talk will clear up all of these questions and more, as we go through the data types that Polars provides and understand why we need each one of them.


According to the documentation, Polars has 18 data types (excluding the varying precision of numerical data types).

The use cases for some data types are intuitively very clear.
For example, we all know when to use Booleans, integers, floating-point numbers, or strings.

Some data types are easy to understand on their own, but the distinction between closely related pairs can be fuzzy.
For example, when do you use List or Array?
When is Categorical better than Enum and vice-versa?

And some less common data types are poorly understood, like Decimal, Object, or Struct.


Prior Knowledge Expected

No previous knowledge expected

Rodrigo has always been fascinated by problem solving, which is why he picked up programming: so that he could solve more problems. He also loves sharing knowledge, which is why he spends so much time writing articles on his blog at mathspp.com/blog, posting on Twitter as @mathsppblog, and giving workshops and courses.
Now, Rodrigo also channels this passion into his role at Polars.

His main areas of scientific interest are mathematics (numerical analysis in particular) and programming in general (with a preference for the Python and APL languages), but Rodrigo also enjoys reading fantasy books, watching silly comedy movies and eating chocolate.