12-04, 11:30–12:00 (UTC), Data/ Data Science Track
To apply or not to apply, that is the question.
Causal reasoning elevates predictive outcomes by shifting from “what happened” to “what would happen if”. Yet, implementing causality can be challenging or even infeasible in some contexts. This talk explores how the very act of assessing its applicability can add value to your projects. Through a gentle introduction to causal inference tools and practical use cases, you will learn how to bring greater scientific rigour to real-world problems.
Target audience: Practicing and aspiring data scientists, machine learning engineers, and analysts looking to improve their decision-making with causal inference.
No prior knowledge is assumed.
For seasoned practitioners, I hope to shed light on aspects they may not have considered. 💡
Can't make the talk? Read all about it in my new TDS article: 🧠🧹 Causality — Mental Hygiene for Data Science
This talk aims to provide practitioners with a practical understanding of causal reasoning and its applications in real-world projects. It moves beyond the hype surrounding causal inference (CI) as the "next big thing" and encourages a critical assessment of its strengths and limitations.
Outline
- Introduction: The Allure and the Reality of Causality
- Case Study: A Paradox in Machine Learning
- Causal Graphs
  - as Blueprints for Causal Thinking
  - Building by Causal Discovery
- Assessing Applicability with Identifiability
- Summary and Resources
Central Thesis
Although causal inference techniques have immense potential, applying them in practice requires careful consideration of context and limitations. The act of engaging with causality is valuable in its own right, promoting a more rigorous and insightful approach to a project. In particular, "causal thinking", through Causal Graph construction and critical assessment of assumptions, is a mental exercise that enhances project rigour and may lead to a deeper understanding of what is possible with the data.
In Detail
A key focus will be on Directed Acyclic Graphs (DAGs) as powerful tools for causal thinking. They provide a visual representation of the data generation process, mapping out the relationships between different parameters and their dependencies. We'll examine how DAGs can articulate your understanding of a system. A second focus will be on Identifiability, the process of identifying the minimum sets of parameters required to answer specific questions about cause and effect.
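To make the idea of a DAG as a map of the data generation process concrete, here is a minimal sketch using only the Python standard library. The system it describes (season, ad spend, site visits, sales) and the helper names `causal_order` and `ancestors` are hypothetical and chosen for illustration; they are not taken from the talk.

```python
from graphlib import CycleError, TopologicalSorter

# Hypothetical data generation process: each key maps to its direct causes.
# All variable names are illustrative assumptions.
PARENTS = {
    "season": set(),
    "ad_spend": {"season"},
    "site_visits": {"ad_spend"},
    "sales": {"season", "site_visits"},
}


def causal_order(parents):
    """Return the nodes ordered causes-first, or raise if there is a cycle."""
    try:
        return list(TopologicalSorter(parents).static_order())
    except CycleError as err:
        # The detected cycle is the second element of the exception args.
        raise ValueError(f"not a DAG, cycle found: {err.args[1]}") from err


def ancestors(parents, node):
    """Return every variable upstream of `node` (its direct and indirect causes)."""
    seen, stack = set(), [node]
    while stack:
        for parent in parents[stack.pop()]:
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


order = causal_order(PARENTS)
print(order)  # ['season', 'ad_spend', 'site_visits', 'sales']
print(ancestors(PARENTS, "sales"))
```

Writing the graph down as explicit parent sets forces each assumed dependency into the open, and the acyclicity check catches a feedback loop that would need to be modelled differently.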
Key Takeaways
This talk will equip you with a practical framework for:
- Building a DAG as a visualisation tool to communicate your understanding of the data generation process and assess the level of control you have over the system being studied.
- Using Identifiability on a DAG to assess the applicability of CI to your own projects, considering factors such as data availability and project goals.
By doing so you will embrace "causal thinking" as a mental hygiene practice that can benefit any data science project, even if formal CI techniques cannot be employed.
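As a rough illustration of an identifiability check on a DAG, the sketch below searches for the smallest adjustment sets that satisfy the backdoor criterion, which is one common identification strategy and is assumed here for illustration rather than confirmed as the method used in the talk. The toy graph and all function names are hypothetical, and the brute-force path enumeration is only suitable for small graphs.

```python
from itertools import combinations

# Hypothetical DAG: each key maps to its direct causes (illustrative names).
PARENTS = {
    "season": set(),
    "ad_spend": {"season"},
    "site_visits": {"ad_spend"},
    "sales": {"season", "site_visits"},
}


def children_of(parents):
    children = {node: set() for node in parents}
    for node, pars in parents.items():
        for p in pars:
            children[p].add(node)
    return children


def descendants(children, node):
    seen, stack = set(), [node]
    while stack:
        for child in children[stack.pop()]:
            if child not in seen:
                seen.add(child)
                stack.append(child)
    return seen


def simple_paths(parents, children, start, end):
    """Yield every simple path from start to end, ignoring edge direction."""
    def walk(path):
        if path[-1] == end:
            yield path
            return
        for nxt in parents[path[-1]] | children[path[-1]]:
            if nxt not in path:
                yield from walk(path + [nxt])
    yield from walk([start])


def is_blocked(path, z, parents, children):
    """A path is blocked by z if any consecutive triple on it is blocked."""
    for a, m, b in zip(path, path[1:], path[2:]):
        if a in parents[m] and b in parents[m]:  # collider: a -> m <- b
            if m not in z and not (descendants(children, m) & z):
                return True
        elif m in z:  # chain or fork with a conditioned middle node
            return True
    return False


def satisfies_backdoor(parents, x, y, z):
    children = children_of(parents)
    z = set(z)
    if z & descendants(children, x):  # z must not contain descendants of x
        return False
    for path in simple_paths(parents, children, x, y):
        starts_into_x = path[1] in parents[x]  # path begins with an arrow into x
        if starts_into_x and not is_blocked(path, z, parents, children):
            return False
    return True


def minimal_backdoor_sets(parents, x, y, observed):
    """Return the smallest-size valid adjustment sets drawn from `observed`."""
    candidates = sorted(set(observed) - {x, y})
    for size in range(len(candidates) + 1):
        found = [set(c) for c in combinations(candidates, size)
                 if satisfies_backdoor(parents, x, y, c)]
        if found:
            return found
    return []


print(minimal_backdoor_sets(PARENTS, "ad_spend", "sales", PARENTS))  # [{'season'}]
```

If `season` is not recorded, the same call with `observed=["ad_spend", "site_visits", "sales"]` returns an empty list. That does not prove the effect is unidentifiable, since other criteria may still apply, but it signals that simple adjustment on the available data is not enough, which is exactly the kind of applicability question worth asking before modelling.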
By the end of this talk, you'll have a nuanced understanding of CI, going beyond the hype to appreciate its true potential and limitations. You'll be better equipped to assess when it's the right tool for the job, and you'll gain valuable insights into how "causal thinking" can enhance your work as a data scientist, especially with real-world data.
For related Towards Data Science publications see Eyal's Medium profile:
👋 Hi, I'm Eyal. My superpower is simplifying the complex and turning data into ta-da!
I'm an ex-cosmologist turned data scientist with over 15 years' experience in solving challenging problems. I am motivated by intellectual challenges, highly detail-oriented, and love visualising data results to communicate insights for better decisions within organisations.
My main drive as a data scientist is applying scientific approaches that result in practical and clear solutions. To accomplish this, I use whatever works, be it statistical/causal inference, machine/deep learning or optimisation algorithms. Being results-driven, I have a passion for helping stakeholders make data-driven decisions by quantifying the impact of interventions and communicating it to non-specialist audiences in an accessible manner.
My claim to fame is that between 2004 and 2014 I lived on four different continents, including in three tennis Grand Slam cities (NYC, Melbourne, London).