Scala at Scale at Databricks
With hundreds of developers and millions of lines of code, Databricks is one of the largest Scala shops around. This post will be…
It’s been an exciting last few years with the Delta Lake project. The release of Delta Lake 1.0 as announced by Michael Armbrust…
This is a guest-authored post by Stephanie Mak, Senior Data Engineer, formerly at Intelematics. This blog post offers my experience of…
We are excited to announce the availability of Apache Spark™ 3.2 on Databricks as part of Databricks Runtime 10.0. We want to thank…
Apache Spark™ Structured Streaming allowed users to do aggregations on windows over event-time. Before Apache Spark™ 3.2, Spark supported tumbling windows and sliding…
This is a collaborative post by Ordnance Survey, Microsoft and Databricks. We thank Charis Doidge, Senior Data Engineer, and Steve Kingston, Senior Data…
We’re thrilled to announce that the pandas API will be part of the upcoming Apache Spark™ 3.2 release. pandas is a powerful, flexible…
At Databricks, we want the Lakehouse ecosystem widely accessible to all data practitioners, and R is a great interface language for this purpose…
Our release of Databricks on Google Cloud Platform (GCP) was a major milestone toward a unified data, analytics and AI platform that is…
This post is part of a series of posts on topic modeling. Topic modeling is the process of extracting topics from a set…