PyData Global 2024

The LEGO Approach to designing PyData Workflows
12-05, 10:00–10:30 (UTC), Data/Data Science Track

What if designing data workflows felt like snapping together LEGO blocks? In this talk, we’ll explore how open-source tools enable flexible, modular PyData workflows. We’ll discuss why open source is essential for avoiding vendor lock-in and how to integrate libraries and frameworks within the Python ecosystem, alongside tools like GitHub Actions. Plus, I’ll introduce DataJourney, an open-source toolkit I developed that makes designing workflows as fun and creative as building with LEGO.

Let’s dive in!


Overview:

This session focuses on building a mental model for iterative design, moving from users' needs to a finished design.

Breakdown (30 minutes):

  1. Saya's intro (<1 min)

  2. Introduction to the Concept of a Mental Model (2 mins)
    - Importance of a mental model in design thinking

  3. Building a Mental Model Requires (5 mins)
    - A Reason: Defining the purpose behind design decisions.
    - Toolkit: Tools that support workflow design.
    - Revision: The persistence needed to refine designs and decisions.
    - Feedback: The role feedback plays in refining a design.

  4. The LEGO Way of Designing (6 mins)
    - The LEGO analogy and its relevance to creating modular designs.
    - Why this approach suits data workflows, and the benefits it brings.

  5. What Does a Good Design Feel Like? (4 mins)
    - Examples and anecdotes illustrating successful designs.

  6. Open Source: Unlocking the Big Picture (5 mins)
    - How open-source tools facilitate flexible and scalable solutions.
    - Community & collaboration benefits.

  7. DataJourney Walkthrough (5 mins)
    - Story Behind DJ: My motivation for creating DataJourney.
    - DJ Components: A brief outline of the toolkit's key components.
    - Workflows DJ Supports: The workflows that can be built with DJ.
    - How It All Comes Together: How components integrate to form a cohesive system.

  8. Closing Notes & Parting Thoughts (<2 mins)
    - Summarize the key takeaways from the session.
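
To make the LEGO analogy from item 4 concrete, here is a minimal sketch of a modular workflow in plain Python. It is not DataJourney's API; the step names (`drop_missing`, `add_double`) and the `build_pipeline` helper are hypothetical, chosen only to illustrate how small steps with the same input and output shape can be snapped together.

```python
from functools import reduce
from typing import Callable

# Assumption for this sketch: every "block" takes a list of dict records
# and returns a new list of dict records, so any two blocks can be chained.
Rows = list[dict]
Step = Callable[[Rows], Rows]


def drop_missing(rows: Rows) -> Rows:
    """Hypothetical block: keep only records that have a non-None 'value'."""
    return [r for r in rows if r.get("value") is not None]


def add_double(rows: Rows) -> Rows:
    """Hypothetical block: add a 'double' field derived from 'value'."""
    return [{**r, "double": r["value"] * 2} for r in rows]


def build_pipeline(*steps: Step) -> Step:
    """Snap blocks together: the output of each step feeds the next one."""
    def run(rows: Rows) -> Rows:
        return reduce(lambda acc, step: step(acc), steps, rows)
    return run


pipeline = build_pipeline(drop_missing, add_double)
result = pipeline([{"value": 1}, {"value": None}, {"value": 3}])
print(result)
```

Because each block shares one interface, a block can be replaced or a new one added without touching the others, which is the property the LEGO analogy points at.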


Prior Knowledge Expected

Previous knowledge expected

Independent Consultant, Data Scientist & Open Science Advocate.

I lead with a clear focus on the big picture, turning data into powerful tools for decision-making and discovery.

👩🏽‍💻 More about my work