Databricks

Company Blog › Ecosystem

How to Work with Avro, Kafka, and Schema Registry in Databricks

February 15, 2019 by Wenchen Fan and Michael Armbrust in Company Blog

In the previous blog post, we introduced the new built-in Apache Avro data source in Apache Spark and explained how you can use it to build streaming data pipelines with the from_avro and to_avro functions. Apache Kafka and Apache Avro are commonly used together to build scalable, near-real-time data pipelines. In this blog post,...

Accelerating Machine Learning on Databricks: On-Demand Webinar and FAQ Now Available!

February 4, 2019 by Hossein Falaki, Adam Conway and Cyrielle Simeone in Company Blog

On January 15th, we hosted a live webinar—Accelerating Machine Learning on Databricks—with Adam Conway, VP of Product Management, Machine Learning, at Databricks and Hossein Falaki, Software Development Engineer and Data Scientist at Databricks. In this webinar, we covered some of the latest innovations brought into the Databricks Unified Analytics Platform for Machine Learning. In particular,...

Kicking Off 2019 with an MLflow User Survey

January 8, 2019 by Matei Zaharia in Engineering Blog

It’s been six months since we launched MLflow, an open source platform to manage the machine learning lifecycle, and the project has been moving quickly since then. MLflow fills a role that hasn’t been served well in the open source community so far: managing the development lifecycle for ML, including tracking experiments and metrics, building...

MLflow v0.8.1 Features Faster Experiment UI and Enhanced Python Model

December 28, 2018 by Aaron Davidson, Corey Zumar and Jules Damji in Engineering Blog

MLflow v0.8.1 was released this week. It introduces several UI enhancements, including faster load times for thousands of runs and improved responsiveness when navigating runs with many metrics and parameters. Additionally, it expands support for evaluating Python models as Apache Spark UDFs and automatically captures model dependencies as Conda environments. Now available on PyPI and...

Introducing Built-in Image Data Source in Apache Spark 2.4

December 10, 2018 by Tomas Nykodym and Weichen Xu in Engineering Blog

With recent advances in deep learning frameworks for image classification and object detection, the demand for standard image processing in Apache Spark has never been greater. Image handling and preprocessing have their own challenges: for example, images come in different formats (e.g., JPEG, PNG), sizes, and color schemes, and there is no...

Apache Avro as a Built-in Data Source in Apache Spark 2.4

November 30, 2018 by Gengliang Wang, Wenchen Fan and Michael Armbrust in Engineering Blog

Apache Avro is a popular data serialization format. It is widely used in the Apache Spark and Apache Hadoop ecosystem, especially for Kafka-based data pipelines. Starting with the Apache Spark 2.4 release, Spark provides built-in support for reading and writing Avro data. The new built-in spark-avro module originated from Databricks’ open source project Avro Data...

Introducing Databricks Runtime 5.0 for Machine Learning

November 27, 2018 by Andy Zhang, Hanyu Cui and Hossein Falaki in Company Blog

Six months ago we introduced the Databricks Runtime for Machine Learning with the goal of making machine learning performant and easy on the Databricks Unified Analytics Platform. The Databricks Runtime for ML comes pre-packaged with many ML frameworks and enables distributed training and inference. Today we are excited to release the second iteration including Conda...

Applying your Convolutional Neural Network: On-Demand Webinar and FAQ Now Available!

November 13, 2018 by Denny Lee and Cyrielle Simeone in Engineering Blog

On October 25th, we hosted a live webinar, Applying your Convolutional Neural Network, with Denny Lee, Technical Product Marketing Manager at Databricks. This is the third webinar of a free deep learning fundamental series from Databricks. In this webinar, we dived deeper into Convolutional Neural Networks (CNNs), a particular type of neural network that assumes that inputs...

Introducing Apache Spark 2.4

November 8, 2018 by Wenchen Fan, Xiao Li and Reynold Xin in Engineering Blog

UPDATED: 11/19/2018. We are excited to announce the availability of Apache Spark 2.4 on Databricks as part of the Databricks Runtime 5.0. We want to thank the Apache Spark community for all their valuable contributions to the Spark 2.4 release. Continuing with the objective of making Spark faster, easier, and smarter, Spark 2.4 extends its...

Democratizing Cloud Infrastructure with Terraform and Jenkins

October 31, 2018 by Ziheng Liao in Engineering Blog

This blog post is part of our series of internal engineering blogs on the Databricks platform, infrastructure management, integration, tooling, monitoring, and provisioning. This summer at Databricks I designed and implemented a service for coordinating and deploying cloud provider infrastructure resources that significantly improved the velocity of operations on our self-managed cloud platform. The service...

Databricks Inc.
160 Spear Street, 13th Floor
San Francisco, CA 94105
1-866-330-0121

© Databricks . All rights reserved. Apache, Apache Spark, Spark and the Spark logo are trademarks of the Apache Software Foundation.