Announcing Machine Learning Model Export in Databricks

Easily export trained Apache Spark ML models and pipelines

By Wayne Chan, March 7, 2018. Posted in Company Blog.

In recent years, machine learning has become ubiquitous in industry and production environments. Both academic and industry institutions had previously focused on training and producing models, but the focus has shifted to productionizing the trained models. We now hear from more and more machine learning practitioners who are looking for the right model deployment options.

In most scenarios, deployment means shipping the trained models to some system that makes predictions based on unseen real-time or batch data, and serving those predictions to some end user, again in real-time or in batches.

This is easier said than done. Organizations face a number of challenges when deploying these models:

  • Upfront Complexity – Deploying a model into production can require a lot of upfront work that can slow down the deployment process by weeks or more.
  • Disjointed Teams – Sharing models across teams for training and deployment can create challenges as teams try to deal with persistence formats, library dependencies, and different deployment environments.
  • Featurization Logic – There is almost always data processing and featurization logic that precedes the model application step, which adds yet another thing to implement when deploying a model.
  • Inconsistent Deployment Environments – Different deployment systems for different scenarios can cause machine learning prediction logic to behave differently, giving subtly incorrect results.

Introducing Machine Learning Export

We are happy to announce the general availability of a powerful new feature called Databricks ML Model Export. This Databricks feature furthers our efforts to unify analytics across data engineering and data science by allowing you to export models and full machine learning pipelines from Apache Spark MLlib. These exported models and pipelines can be imported into other (Spark and non-Spark) platforms to do scoring and make predictions.

This new capability serves as an alternative to batch and streaming prediction within Spark, allowing companies to build low-latency and lightweight machine learning-powered applications with ease.
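To make the idea concrete, here is a minimal sketch of exporting a pipeline in one place and scoring with it somewhere else. It is not the Databricks ML Model Export API or its file format; the JSON layout, the stage names, the parameter values, and the functions `export_pipeline` and `load_scorer` are all hypothetical and chosen only for illustration. The point it shows is that featurization and model parameters travel together in one artifact, so the serving side can score with nothing but the standard library.

```python
import json
import math

# Hypothetical parameters of a fitted two-stage pipeline (scaler + classifier).
# The numbers are made up for illustration; they are not real training output.
pipeline = {
    "stages": [
        {"type": "standard_scaler", "mean": [2.0, 10.0], "std": [1.0, 5.0]},
        {"type": "logistic_regression", "coefficients": [1.0, -2.0], "intercept": 0.5},
    ]
}


def export_pipeline(fitted):
    """Serialize featurization and model parameters together as one JSON string."""
    return json.dumps(fitted, sort_keys=True)


def load_scorer(exported):
    """Rebuild a scoring function from the exported JSON.

    Only the standard library is needed here, which stands in for a
    lightweight serving environment that has no training framework installed.
    """
    stages = json.loads(exported)["stages"]

    def score(features):
        values = list(features)
        for stage in stages:
            if stage["type"] == "standard_scaler":
                # Apply the same featurization that was fitted at training time.
                values = [
                    (v - m) / s
                    for v, m, s in zip(values, stage["mean"], stage["std"])
                ]
            elif stage["type"] == "logistic_regression":
                margin = stage["intercept"] + sum(
                    c * v for c, v in zip(stage["coefficients"], values)
                )
                # Probability of the positive class.
                return 1.0 / (1.0 + math.exp(-margin))
            else:
                raise ValueError("unknown stage type: " + stage["type"])
        raise ValueError("pipeline has no model stage")

    return score


# "Training side": export once.
exported = export_pipeline(pipeline)

# "Serving side": load the artifact and score a single record.
score = load_scorer(exported)
probability = score([2.0, 10.0])
print(probability)
```

Because the scaler parameters are stored next to the model coefficients, the serving code cannot accidentally apply different featurization from the one used in training, which is the kind of inconsistency the challenges above describe.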

Seamless Deployment of Models

When speaking with customers, one consistent piece of feedback was that they love doing data science on our platform, but then have to re-implement the code in a different system to deploy it to production. With this new export feature, Databricks can truly serve as an end-to-end platform to build, train, and deploy machine learning models into production with blazing speed and higher reliability.

More Information

To learn more about how to get started with Databricks Machine Learning Export as well as other relevant information, check out the following resources:

  • Documentation
  • Example application
  • On-Demand Webinar: Productionizing Apache Spark MLlib Models for Real-time Prediction Serving – presented by Joseph Bradley and Sue Ann Hong.

Also, look out for a follow-up blog post that will dive deeper into the inner workings of this new feature.
