Sponsored by: Airbyte | GenAI Pipelines: From Data Exploration to Prototype to Production
OVERVIEW
EXPERIENCE | In Person |
---|---|
TYPE | Breakout |
TRACK | Generative AI |
INDUSTRY | Enterprise Technology |
TECHNOLOGIES | AI/Machine Learning, ETL, GenAI/LLMs |
SKILL LEVEL | Intermediate |
DURATION | 40 min |
This talk is your guide to building in-house GenAI data pipelines, from proof-of-concept to production, using open source technology and the ELTP (Extract, Load, Transform, Publish) framework. We start by demonstrating how PyAirbyte facilitates efficient data sourcing, letting you quickly explore data from over 250 sources with fewer than 10 lines of Python code. Next, we’ll guide you through the steps to elevate your pipelines from initial prototypes to full production. We’ll introduce the ELTP architecture and its pivotal “Publish” step, which is essential for modern pipelines that publish to vector store destinations. As an added bonus, we will share strategies for managing Large Language Model (LLM) “documents” as data and contrast them with traditional data forms such as records and rows, highlighting the unique requirements that documents impose on GenAI data management. Join us to gain insights for enhancing your data pipeline capabilities in the GenAI era!
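To make the contrast between records and LLM “documents” a little more concrete, here is a minimal sketch using only the Python standard library. It assumes a document is a block of free text to embed plus metadata to filter on. The `Document` class, the `record_to_document` helper, and all field names are hypothetical illustrations and are not part of PyAirbyte or any Airbyte API.

```python
from dataclasses import dataclass, field


@dataclass
class Document:
    """Hypothetical LLM 'document': free text to embed plus metadata to filter on."""
    content: str
    metadata: dict = field(default_factory=dict)


def record_to_document(record: dict, content_fields: list[str], id_field: str) -> Document:
    """Render selected fields of a row-like record as text; keep the rest as metadata."""
    # Text fields become the body that would later be chunked and embedded.
    content = "\n".join(f"{name}: {record[name]}" for name in content_fields)
    # Remaining fields stay structured so a vector store could filter on them.
    metadata = {k: v for k, v in record.items() if k not in content_fields}
    # Keep a pointer back to the source row (assumed convention, not a standard).
    metadata["source_id"] = record[id_field]
    return Document(content=content, metadata=metadata)


# A traditional record: flat, typed columns.
record = {
    "id": 42,
    "title": "Reset password",
    "body": "Click 'Forgot password'.",
    "updated_at": "2024-05-01",
}

doc = record_to_document(record, content_fields=["title", "body"], id_field="id")
print(doc.content)
print(doc.metadata)
```

A row is queried by column, while a document is retrieved by the meaning of its text. That is why the sketch keeps the text and the filterable metadata separate.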
SESSION SPEAKERS
AJ Steers
Staff Software Engineer, AI
Airbyte