Yaron Haviv is a serial entrepreneur with deep technological experience in the fields of big data, cloud, storage and networking. Prior to Iguazio, Yaron was Vice President of Datacenter Solutions at Mellanox, where he led technology innovation, software development and solution integrations. He was also CTO and Vice President of R&D at Voltaire, a high-performance computing, IO and networking company. Yaron is a CNCF member and one of the authors in the CNCF working group. He has presented at various events, including KubeCon + CloudNativeCon, Spark + AI Summit and Strata.
Apache Spark introduced a powerful engine for distributed data processing, providing unmatched capabilities to handle petabytes of data across multiple servers. Its capabilities and performance unseated other technologies in the Hadoop world, but while Spark provides a lot of power, it also carries a high maintenance cost, which is why we now see innovations aimed at simplifying Spark infrastructure. Kubernetes, in its own right, offers a simplified way to manage infrastructure and applications: a practical approach to isolating workloads, limiting resource use, deploying on demand and scaling as needed. Yaron Haviv will explain how to work with Kubernetes to build a single workflow combining Spark-based data preparation and ML tasks. Participants will learn how running Spark on Kubernetes enables users to unify analytics and data science on a single cloud-native architecture and eliminate the overhead of an extra big data cluster managed by different tools.
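As background for the session, Spark ships with native Kubernetes support: a job can be submitted directly against a Kubernetes API server, which then schedules the driver and executors as pods. The following is a minimal command sketch, not a production setup; the API server address, container image name and jar path are placeholders you would substitute for your own environment.

```shell
# Submit a Spark job to a Kubernetes cluster (placeholders in <...>).
# The driver and executors run as pods scheduled by Kubernetes.
spark-submit \
  --master k8s://https://<k8s-apiserver-host>:<port> \
  --deploy-mode cluster \
  --name spark-data-prep \
  --class org.apache.spark.examples.SparkPi \
  --conf spark.executor.instances=2 \
  --conf spark.kubernetes.container.image=<your-spark-image> \
  local:///opt/spark/examples/jars/<spark-examples-jar>
```

Because the executors are ordinary pods, they inherit Kubernetes resource limits, namespaces and scheduling, which is what lets Spark workloads share a cluster with other cloud-native applications instead of requiring a dedicated big data cluster.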
Deploying machine learning models from training to production requires companies to deal with the complexity of moving workloads through different pipelines and rewriting code from scratch. Yaron Haviv will explain how to automatically move machine learning models to production by running Spark as a microservice for inference, achieving auto-scaling, versioning and security. He will demonstrate how to feed feature vectors aggregated from multivariate real-time and historical data into machine learning models and serverless functions for real-time dashboards and actions.
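To make the feature-vector idea concrete, here is a minimal, self-contained Python sketch of the per-request step such an inference microservice performs: merge fresh readings with precomputed historical aggregates into one vector, then score it. The field names, weights and the linear model are illustrative assumptions, not part of any real API described in the talk.

```python
# Hypothetical sketch: combine real-time and historical features into a
# single vector and score it with a simple linear model.

def build_feature_vector(realtime, historical):
    """Concatenate fresh sensor readings with historical aggregates."""
    return [realtime["temp"], realtime["vibration"],
            historical["temp_avg_7d"], historical["vibration_max_7d"]]

def score(vector, weights, bias=0.0):
    """Linear model: dot(weights, vector) + bias."""
    return sum(w * x for w, x in zip(weights, vector)) + bias

# Example request: live readings plus aggregates computed offline (e.g. by Spark).
realtime = {"temp": 72.0, "vibration": 0.8}
historical = {"temp_avg_7d": 68.5, "vibration_max_7d": 1.2}
vec = build_feature_vector(realtime, historical)
risk = score(vec, weights=[0.02, 0.5, 0.01, 0.3])
```

In a real deployment the historical aggregates would be maintained by a batch or streaming Spark job and the scoring step would sit behind an auto-scaled, versioned endpoint; the sketch only shows the shape of the data flow.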
AI is shifting from batch processes to real-time, event-driven applications, requiring companies to process data at the edge of the network, near the data sources. Edge data sources can be anything from a car or wristwatch to an industrial component gathering data from multiple machines in a factory. Yaron Haviv will walk you through interactively developing AI applications with Spark ML and Spark Streaming that run in hybrid environments, leveraging elasticity in the cloud and high performance at the edge. He will explain how to leverage Spark's extensive set of AI tools and its ability to process large data sets to create machine learning models in the public cloud, and seamlessly deploy them at the edge for immediate action. Furthermore, participants will learn how to ensure models are constantly updated across cloud and edge and how to rapidly send sensor data from the edge to the cloud. The session will include live IIoT demos of real-world customer use cases, such as detecting patterns in historical data sets to learn about machinery, predicting outcomes and correlating them with fresh real-time data. Session hashtag: #SAISAI7
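The cloud-to-edge model update loop described above can be sketched in a few lines of Python. This is a toy illustration under stated assumptions: `ModelStore` stands in for a cloud model registry and `EdgeNode` for a device holding a local copy; both class names and the version-comparison scheme are hypothetical, not an actual product API.

```python
# Hypothetical sketch of keeping an edge model in sync with the cloud:
# the edge compares its local version against the latest published one
# and swaps in the new artifact only when it is stale.

class ModelStore:
    """Stand-in for a cloud model registry keyed by model name."""
    def __init__(self):
        self._models = {}

    def publish(self, name, version, weights):
        self._models[name] = (version, weights)

    def latest(self, name):
        return self._models[name]

class EdgeNode:
    """Edge device holding a local copy of one model."""
    def __init__(self, store, name):
        self.store, self.name = store, name
        self.version, self.weights = None, None

    def sync(self):
        version, weights = self.store.latest(self.name)
        if version != self.version:          # pull only when stale
            self.version, self.weights = version, weights
            return True                      # model was updated
        return False                         # already current
```

A training pipeline in the cloud would call `publish` after each retraining run, while edge devices poll `sync` (or react to an event) so that inference at the edge always uses the freshest model without redeploying the application.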