Splice Machine's use of Apache Spark and MLflow - Databricks

Splice Machine is an ANSI-SQL Relational Database Management System (RDBMS) on Apache Spark. It has proven low-latency transactional processing (OLTP) as well as analytical processing (OLAP) at petabyte scale. It uses Spark for all analytical computations and HBase for persistence. This talk highlights a new Native Spark Datasource, which enables seamless data movement between Spark DataFrames and Splice Machine tables without serialization and deserialization. This Spark Datasource makes machine learning libraries such as MLlib native to the Splice RDBMS. Splice Machine has now integrated MLflow into its data platform, creating a flexible Data Science Workbench with an RDBMS at its core. Together, the transactional capabilities of Splice Machine, the many DataFrame-compatible libraries, and MLflow manage a complete, real-time workflow from data to insights to action. In this presentation we will demonstrate Splice Machine’s Data Science Workbench and show how it leverages Spark and MLflow to create full-cycle machine learning capabilities on an integrated platform, from transactional updates to data wrangling, experimentation, and deployment, and back again.



About Gene Davis

Gene is VP of Product Management at Splice Machine. Prior to Splice Machine, Gene ran Product Development for Clio Music, SeeSaw Networks, Blue Martini Software, Fogbreak Software, and TeaLeaf Technology, and was Vice President of Engineering at PeopleSoft. Gene was an original architect of Red Pepper's Advanced Planning System. Prior to Red Pepper, he worked for NASA, where he was a recipient of the Manned Flight Space Award and co-recipient of the Space Act Award. Gene holds a B.A. in Music and a B.S. in Chemical Engineering from Stanford University, and an M.Sc. in Computer Science from the University of Toronto.