12-04, 16:30–17:00 (UTC), General Track
This talk walks through an application scenario that combines the benefits of vector search with graph traversal. Knowledge graphs (or, more generally, graphs) have long been used to model structured data that captures the connections between real-world entities. Recently, there has been considerable interest in Graph RAG, which uses graphs as part of the retrieval process in RAG to improve the results. The talk will cover a practical example that shows how Python developers can use the PyData ecosystem alongside two open source, embedded databases: Kùzu for the graph component and LanceDB for the vector component of retrieval.
Although vector and hybrid search have proven immensely useful for RAG (retrieval-augmented generation), there is growing evidence that adding structured data, organized as a graph, to the retrieval stage can improve the contextual relevance of the retrieved results to the generated response. A graph can be used alongside semantic search in several ways in these kinds of applications, so the talk goes over some of these methodologies and demonstrates a practical example so that developers can begin experimenting with their own datasets.
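One common way to combine the two retrieval styles is to use vector search to find seed items and then expand them through graph neighbors. The following is a minimal sketch of that pattern using only the Python standard library. It is not the talk's implementation and does not use the Kùzu or LanceDB APIs: the `EMBEDDINGS` and `EDGES` dictionaries, the document names, and the functions `vector_search`, `graph_expand`, and `retrieve` are all hypothetical stand-ins for a vector store and a graph store.

```python
import math

# Hypothetical toy embeddings (stand-in for a vector store).
EMBEDDINGS = {
    "doc_a": [1.0, 0.0],
    "doc_b": [0.8, 0.6],
    "doc_c": [0.0, 1.0],
}

# Hypothetical toy edges between documents (stand-in for a graph store).
EDGES = {
    "doc_a": ["doc_c"],
    "doc_b": [],
    "doc_c": ["doc_a"],
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def vector_search(query, k):
    """Return the k document ids most similar to the query vector."""
    ranked = sorted(
        EMBEDDINGS,
        key=lambda doc: cosine(query, EMBEDDINGS[doc]),
        reverse=True,
    )
    return ranked[:k]


def graph_expand(seeds, hops=1):
    """Add graph neighbors of the seed documents, up to `hops` steps away."""
    seen = list(seeds)
    frontier = list(seeds)
    for _ in range(hops):
        next_frontier = []
        for node in frontier:
            for neighbor in EDGES.get(node, []):
                if neighbor not in seen:
                    seen.append(neighbor)
                    next_frontier.append(neighbor)
        frontier = next_frontier
    return seen


def retrieve(query, k=1, hops=1):
    """Vector search for seeds, then graph traversal for extra context."""
    return graph_expand(vector_search(query, k), hops)
```

In this sketch, the graph step can surface a document that is semantically distant from the query but connected to a close match. In a real application the two dictionaries would be replaced by queries against an actual vector database and graph database.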
Importantly, constructing high-quality knowledge graphs (or, more generally, graphs) has been a significant barrier to the adoption of Graph RAG and graph databases, so this talk will also showcase some ways in which Python data scientists can transform their existing data, which may come from various sources, into a graph. We will show how open source, developer-friendly tooling that plays well with the rest of the Python ecosystem can help users iterate rapidly on their ideas while also giving them the confidence to take their applications to production in a scalable fashion.
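As a rough illustration of turning existing tabular data into a graph, the sketch below reads rows from CSV text and derives a set of nodes and a list of edges. It uses only the standard library. The column names (`person`, `company`), the node labels, and the `WORKS_AT` relationship name are hypothetical choices for this example and are not taken from the talk.

```python
import csv
import io

# Hypothetical input: each row links a person to a company.
CSV_TEXT = """person,company
Alice,Acme
Bob,Acme
Alice,Globex
"""


def rows_to_graph(csv_text):
    """Convert tabular rows into (nodes, edges) for loading into a graph store."""
    nodes = set()   # unique (label, name) pairs
    edges = []      # (source_node, relationship, target_node) triples
    for row in csv.DictReader(io.StringIO(csv_text)):
        person = ("Person", row["person"])
        company = ("Company", row["company"])
        nodes.update([person, company])
        edges.append((person, "WORKS_AT", company))
    return nodes, edges


nodes, edges = rows_to_graph(CSV_TEXT)
```

The resulting node set and edge list are the kind of intermediate tables that could then be bulk-loaded into a graph database. Deduplicating nodes while keeping one edge per row is the main design choice here.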
Previous knowledge expected
Prashanth is an AI engineer at Kùzu based in Toronto. In recent years, he has focused heavily on data modeling and engineering with relational, graph, and vector databases that power a variety of machine learning and AI applications. In his spare time, he enjoys engaging with the Python and Rust communities and blogging at thedataquarry.com.