Engineering blog

GGML GGUF File Format Vulnerabilities

March 22, 2024 by Neil Archibald in Engineering Blog
The GGUF file format is a binary file format used for storing and loading model weights for the GGML library. The library documentation...
Engineering blog

Parameterized queries with PySpark

PySpark has always provided wonderful SQL and Python APIs for querying data. As of Databricks Runtime 12.1 and Apache Spark 3.4, parameterized queries...
Engineering blog

Introducing Apache Spark™ 3.5

Today, we are happy to announce the availability of Apache Spark™ 3.5 on Databricks as part of Databricks Runtime 14.0. We extend our...
Engineering blog

Shared Clusters in Unity Catalog for the win: Introducing Cluster Libraries, Python UDFs, Scala, Machine Learning and more

We are thrilled to announce that you can run even more workloads on Databricks’ highly efficient multi-user clusters thanks to new security and...
Engineering blog

Introducing English as the New Programming Language for Apache Spark

We are thrilled to unveil the English SDK for Apache Spark, a transformative tool designed to enrich your Spark experience. Apache Spark™...
Engineering blog

Databricks ❤️ Hugging Face

Generative AI has been taking the world by storm. As the data and AI company, we have been on this journey with the...
Engineering blog

Spark Connect Available in Apache Spark 3.4

Last year Spark Connect was introduced at the Data and AI Summit. As part of the recently released Apache Spark™ 3.4, Spark Connect...
Engineering blog

Pandas-Profiling Now Supports Apache Spark

Data profiling is the process of collecting statistics and summaries of data to assess its quality and other characteristics. It is an essential...
Engineering blog

Announcing Ray support on Databricks and Apache Spark Clusters

Ray is a prominent compute framework for running scalable AI and Python workloads, offering a variety of distributed machine learning tools, large-scale hyperparameter...
Engineering blog

Scalable Kubernetes Upgrade Using Operators

December 15, 2022 by Ziyuan Chen in Engineering Blog
At Databricks, we run our compute infrastructure on AWS, Azure, and GCP. We orchestrate containerized services using Kubernetes clusters. We develop and manage...