SESSION

How to Cook Good AI Products with What You Already Have in your Data Warehouse

Accept Cookies to Play Video

OVERVIEW

EXPERIENCEIn Person
TYPELightning Talk
TRACKGenerative AI
TECHNOLOGIESAI/Machine Learning, GenAI/LLMs
SKILL LEVELIntermediate
DURATION20 min
DOWNLOAD SESSION SLIDES

Realistic, domain-specific evaluation is the most impactful step AI developers can take to make their products practical and reduce deployment risks. Benchmarks are a good starting point, but they often don’t reflect how generative AI performs in everyday use. In this talk, we’ll show how you can use the data you already have in your enterprise to create reference datasets that fit your specific use cases, domains, and organizational knowledge. By tapping into this data, we can test foundational LLMs on key tasks like customer support and product catalog Q&A. We'll also show results on how significantly performance in real-world settings differs from benchmark predictions, and how to use that knowledge to build better AI products.

SESSION SPEAKERS

Julia Neagu

/CEO
Quotient AI