
Integrating NVIDIA TensorRT-LLM with the Databricks Inference Stack

Over the past six months, we've been working with NVIDIA to get the most out of their new TensorRT-LLM library. TensorRT-LLM provides an easy-to-use Python interface that integrates with a web server for fast, efficient LLM inference. In this post, we're highlighting some key areas where our collaboration with NVIDIA has been particularly important.

Patronus AI: Using LLMs to Detect Business-Sensitive Information

November 1, 2023 by Emily Hutson
EnterprisePII is a first-of-its-kind large language model (LLM) data set aimed at detecting business-sensitive information. The challenge of detecting and redacting sensitive business...

Training LLMs at Scale with AMD MI250 GPUs

October 30, 2023 by Abhi Venigalla
Four months ago, we shared how AMD had emerged as a capable platform for generative AI and demonstrated how to easily and...

LLM Training on Unity Catalog data with MosaicML Streaming Dataset

Large Language Models (LLMs) have given us a way to generate text, extract information, and identify patterns in industries from healthcare to...

LLM Inference Performance Engineering: Best Practices

In this blog post, the MosaicML engineering team shares best practices for how to capitalize on popular open source large language models (LLMs)...

Introducing Llama2-70B-Chat with MosaicML Inference

Llama2-70B-Chat is a leading AI model for text completion, comparable with ChatGPT in terms of quality. Today, organizations can leverage this state-of-the-art model...

End-to-End Secure Evaluation of Code Generation Models

With MosaicML, you can now evaluate LLMs and Code Generation Models on code generation tasks (such as HumanEval, with MBPP and APPS coming...

Announcing MPT-7B-8K: 8K Context Length for Document Understanding

July 18, 2023 by Sam Havens and Erica Ji Yuen
Today, we are releasing MPT-7B-8K, a 7B parameter open-source LLM with 8k context length trained with the MosaicML platform. MPT-7B-8K was pretrained starting...

Training LLMs with AMD MI250 GPUs and MosaicML

June 30, 2023 by Abhi Venigalla
With the release of PyTorch 2.0 and ROCm 5.4, we are excited to announce that LLM training works out of the box on...

MPT-30B: Raising the bar for open-source foundation models

June 22, 2023
Introducing MPT-30B, a new, more powerful member of our Foundation Series of open-source models, trained with an 8k context length on NVIDIA H100...