Rony Chatterjee - Databricks

Rony Chatterjee

Senior Product Manager, Microsoft

Rony is a Senior Product Manager for Azure Data at Microsoft. He is responsible for building the next generation of Big Data products for enterprise customers. Rony leads product management for in-database Machine Learning and Artificial Intelligence in products and services such as Azure SQL, Azure SQL Managed Instance, SQL Server Big Data Clusters and SQL Server on Windows and Linux. He works very closely with Microsoft Research to drive next-generation innovation in Microsoft's products and services. Rony has a PhD in Computer Science and has published several research papers in IEEE/ACM venues, Springer book chapters and journals.


Productionizing Machine Learning with Apache Spark, MLflow and ONNX from the ground to cloud using SQL Server
Summit 2020

One of the biggest challenges customers face is how to productionize machine learning for the enterprise. Once data scientists, data engineers, business analysts and machine learning engineers have successfully built their machine learning models, they need a model management system that manages and orchestrates the entire lifecycle of those models. Analytical models must be trained, compared and monitored before being deployed into production, and operationalizing a model's lifecycle takes many steps. We have been looking at MLflow as our open source platform for managing the ML lifecycle, including experimentation, reproducibility and deployment. One of the features we have introduced into MLflow, contributing back to the community, is the ability to store models in a SQL Server backend as the model artifact store.

The key idea is that 'models are just like data' to an engine like SQL Server, and as such we can leverage most of the mission-critical data management features built into SQL Server for machine learning models. Using SQL Server for ML model management, an organization can create an ecosystem for harvesting analytical models, enabling data scientists and business analysts to discover the best models and promote them for use. SQL Server treats models just like data, storing them as serialized varbinary objects. It keeps the models 'close' to the data, so the capabilities of a data management system carry over nearly seamlessly to machine learning models. This can greatly simplify model management, resulting in faster delivery and more accurate business insights. We will also discuss how we are leveraging ONNX Runtime in SQL, converting these models to ONNX and deploying them on Edge for native predictions on the data.
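The 'models are just like data' idea can be illustrated with a minimal sketch: a model is serialized to bytes and stored as a versioned row in a database table. This is an assumption-laden illustration, not the MLflow or SQL Server integration described in the talk. It uses Python's built-in sqlite3 module with a BLOB column as a stand-in for a SQL Server varbinary column, and the table name, the helper functions and the toy linear "model" are all hypothetical.

```python
import pickle
import sqlite3


def create_store(conn):
    """Create a hypothetical model table; BLOB stands in for varbinary(max)."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS models ("
        " name TEXT NOT NULL,"
        " version INTEGER NOT NULL,"
        " payload BLOB NOT NULL,"
        " PRIMARY KEY (name, version))"
    )


def save_model(conn, name, model):
    """Serialize the model and store it as the next version under `name`."""
    row = conn.execute(
        "SELECT COALESCE(MAX(version), 0) FROM models WHERE name = ?", (name,)
    ).fetchone()
    version = row[0] + 1
    conn.execute(
        "INSERT INTO models (name, version, payload) VALUES (?, ?, ?)",
        (name, version, pickle.dumps(model)),
    )
    conn.commit()
    return version


def load_model(conn, name, version=None):
    """Load a specific version, or the latest one when version is None."""
    if version is None:
        row = conn.execute(
            "SELECT payload FROM models WHERE name = ?"
            " ORDER BY version DESC LIMIT 1",
            (name,),
        ).fetchone()
    else:
        row = conn.execute(
            "SELECT payload FROM models WHERE name = ? AND version = ?",
            (name, version),
        ).fetchone()
    if row is None:
        raise KeyError(f"no stored model named {name!r}")
    # Only unpickle payloads that come from a trusted source.
    return pickle.loads(row[0])


def predict(model, x):
    """Score with the toy model: a dict of linear coefficients."""
    return model["weight"] * x + model["bias"]


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    create_store(conn)
    save_model(conn, "demo", {"weight": 2.0, "bias": 1.0})
    save_model(conn, "demo", {"weight": 3.0, "bias": 0.0})
    latest = load_model(conn, "demo")
    print(predict(latest, 2.0))
```

Because each model is just a row, the usual database machinery (transactions, backups, access control, versioned queries) applies to it in the same way as to any other data, which is the point the abstract makes about SQL Server.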