SESSION

Sponsored by: Airbyte | GenAI Pipelines: From Data Exploration to Prototype to Production

OVERVIEW

EXPERIENCE: In Person
TYPE: Breakout
TRACK: Generative AI
INDUSTRY: Enterprise Technology
TECHNOLOGIES: AI/Machine Learning, ETL, GenAI/LLMs
SKILL LEVEL: Intermediate
DURATION: 40 min

This talk is your guide to building in-house GenAI data pipelines, from proof of concept to production, using open source technology and the ELTP framework. We start by demonstrating how PyAirbyte streamlines data sourcing, letting you quickly explore data from over 250 sources with fewer than 10 lines of Python code. Next, we’ll guide you through the steps to take your pipelines from initial prototypes to full production. We’ll introduce the ELTP architecture and its pivotal 'Publish' step, which is essential for modern pipelines that publish to vector store destinations. As a bonus, we’ll share strategies for managing Large Language Model (LLM) “documents” as data and contrast them with traditional data forms like records and rows, highlighting the unique requirements documents bring to GenAI data management. Join us to gain insights for enhancing your data pipeline capabilities in the GenAI era!
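To make the contrast between records and LLM “documents” concrete, here is a minimal sketch using only the Python standard library. It is an illustration under stated assumptions, not PyAirbyte's API or the approach presented in the session: the `Document` class, `record_to_document`, `chunk_document`, and all field names are hypothetical, and the fixed-size character chunking stands in for whatever splitting strategy a real pipeline would use before publishing to a vector store.

```python
from dataclasses import dataclass, field


@dataclass
class Document:
    """Hypothetical LLM 'document': free text plus metadata about its origin."""
    content: str
    metadata: dict = field(default_factory=dict)


def record_to_document(record: dict, text_fields: list[str],
                       metadata_fields: list[str]) -> Document:
    """Render selected fields of a row-like record as text; keep others as metadata."""
    # Text fields become the searchable body of the document.
    content = "\n".join(
        f"{name}: {record[name]}" for name in text_fields if record.get(name)
    )
    # Metadata fields stay structured so they can be used for filtering later.
    metadata = {name: record[name] for name in metadata_fields if name in record}
    return Document(content=content, metadata=metadata)


def chunk_document(doc: Document, max_chars: int) -> list[Document]:
    """Split a document into fixed-size character chunks (a deliberately naive strategy)."""
    chunks = []
    for index, start in enumerate(range(0, len(doc.content), max_chars)):
        chunks.append(
            Document(
                content=doc.content[start:start + max_chars],
                # Each chunk inherits the parent's metadata plus its position.
                metadata={**doc.metadata, "chunk_index": index},
            )
        )
    return chunks


# Example: one record (a row) becomes one document, then several chunks.
record = {
    "id": 7,
    "title": "Refund policy",
    "body": "x" * 25,
    "updated_at": "2024-01-01",
}
doc = record_to_document(record, ["title", "body"], ["id", "updated_at"])
chunks = chunk_document(doc, max_chars=20)
```

Unlike a row, each chunk carries unstructured text alongside structured metadata, so a pipeline has to track both the content to embed and the source fields that identify where it came from.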

SESSION SPEAKERS

AJ Steers

Staff Software Engineer, AI
Airbyte