Apache Spark is a unified analytics engine for large-scale, distributed data processing. Spark MLlib (Machine Learning library) is a scalable Spark implementation of common machine learning (ML) functionality, together with associated tests and data generators. Data scientists are introducing such models at a rapid pace in many organizations. But because it is a relatively new open source technology, IT teams need time to deploy it. As a result, many models are trained, at least as many sit parked in project repositories, and only a few are actually deployed. For these reasons, businesses are looking for a way to orchestrate and manage their models that speeds up model deployment without losing model governance. Using SAS Model Manager and SAS Workflow Manager on the SAS Viya platform, attendees will learn how we provide a model life cycle that governs and orchestrates Spark MLlib models by integrating Apache Livy, a REST API service for Apache Spark, with the SAS Workflow REST API services. Our work provides a business process management (BPM) solution to build, register, compare, test, approve, publish, monitor, and, if needed, retrain those models in an automated and controlled manner. In the end, this automated architecture is a build-once, use-many BPM solution that reduces manual intervention and helps customers operationalize their Spark MLlib models faster while keeping their analytical environment under governance.
SAS Institute srl
Ivan Nardini is a Customer Advisor specializing in ModelOps and Decisioning. He has been involved in operationalizing analytics using different technologies (both SAS native and open source) in a variety of industries. His focus is on providing solutions to operationalize analytics and optimize business decisioning processes. To reach this goal, he works with software and cloud technologies.
Artem is a Senior Consultant in the Advanced Analytics Platform Practice at SAS Russia. He works closely with banks, insurers and retailers. His primary interest is conquering ‘the last mile’ of analytics. He wants companies to be assured that the lifecycle of their models is efficiently maintained, meaning that all kinds of ML models, regardless of the development framework, are properly stored, validated, deployed and monitored.