Today, we are releasing MPT-7B-8K, a 7B-parameter open-source LLM with an 8k context length, trained on the MosaicML platform. MPT-7B-8K was pretrained starting...
Last month, the Allen Institute for AI (AI2) announced the development of an open, state-of-the-art generative language model: AI2 OLMo (Open Language Model)...
Benchmarking Large Language Models on NVIDIA H100 GPUs with CoreWeave
The research and engineering teams here at MosaicML collaborated with CoreWeave, one of...
With the MosaicBERT architecture and training recipe, you can now pretrain a competitive BERT-Base model from scratch on the MosaicML platform for $20...