Announcing MPT-7B-8K: 8K Context Length for Document Understanding

July 18, 2023 by Sam Havens and Erica Ji Yuen
Today, we are releasing MPT-7B-8K, a 7B parameter open-source LLM with 8k context length trained with the MosaicML platform. MPT-7B-8K was pretrained starting...

Training LLMs with AMD MI250 GPUs and MosaicML

June 30, 2023 by Abhi Venigalla
With the release of PyTorch 2.0 and ROCm 5.4, we are excited to announce that LLM training works out of the box on...

MPT-30B: Raising the bar for open-source foundation models

June 22, 2023
Introducing MPT-30B, a new, more powerful member of our Foundation Series of open-source models, trained with an 8k context length on NVIDIA H100...

Introducing AI2 OLMo (Open Language Model)

Last month, the Allen Institute for AI (AI2) announced the development of an open, state-of-the-art generative language model: AI2 OLMo (Open Language Model)...

Cloudflare R2 and MosaicML: Train LLMs on Any Compute with Zero Switching Costs

Together, Cloudflare and MosaicML give users the freedom to train LLMs on any compute, anywhere in the world, for faster, cheaper training runs...

Introducing MPT-7B: A New Standard for Open-Source, Commercially Usable LLMs

May 5, 2023
Introducing MPT-7B, the first entry in our MosaicML Foundation Series. MPT-7B is a transformer trained from scratch on 1T tokens of text and...

How We Trained Stable Diffusion for Less than $50k (Part 3)

In our previous blog post, we showed how we used the MosaicML platform, Streaming datasets, and the Composer library to train a Stable...

Benchmarking Large Language Models on NVIDIA H100 GPUs with CoreWeave (Part 1)

April 27, 2023 by Daya Khudia and Vitaliy Chiley
The research and engineering teams here at MosaicML collaborated with CoreWeave, one of...

Training Stable Diffusion from Scratch for <$50k with MosaicML (Part 2)

We've replicated Stable Diffusion 2 for less than $50k, and we've open-sourced the training code so you can too! This is a 3x...

MosaicBERT: Pretraining BERT from Scratch for $20

With the MosaicBERT architecture + training recipe, you can now pretrain a competitive BERT-Base model from scratch on the MosaicML platform for $20...