Skip to main content
Page 1
Engineering blog

Offline LLM Evaluation: Step-by-Step GenAI Application Assessment on Databricks

Background In an era where Retrieval-Augmented Generation (RAG) is revolutionizing the way we interact with AI-driven applications, ensuring the efficiency and effectiveness of...
Engineering blog

Announcing MLflow 2.8 LLM-as-a-judge metrics and Best Practices for LLM Evaluation of RAG Applications, Part 2

Today we're excited to announce MLflow 2.8 supports our LLM-as-a-judge metrics which can help save time and costs while providing an approximation of...