Gil Vernik - Databricks

Gil Vernik

Researcher, IBM Corporation

Gil is a researcher in IBM's storage clouds, security, and analytics group. He received his PhD in Mathematics from the University of Haifa and completed a postdoctoral position in Germany. At IBM, he works with Apache Spark, Hadoop, object stores, and NoSQL databases. He has more than 25 years of experience as a developer, on both the server side and the client side, and knows Java, Python, Scala, C/C++, and Erlang.


Very Large Data Files, Object Stores, and Deep Learning—Lessons Learned While Looking for Signs of Extra-Terrestrial Life

In this session, IBM will present details of the advanced Apache Spark analytics currently being performed in a collaborative project between the SETI Institute, NASA, Swinburne University, Stanford University, and IBM. The Allen Telescope Array in northern California has been continuously scanning the skies for over two decades, generating data archives with over 200 million signal events. Come and learn how astronomers and researchers are using Apache Spark, in conjunction with assets such as IBM’s Cognitive Compute Cluster with over 700 GPUs, to train neural network models for signal classification and to run computationally intensive Spark workloads on multi-terabyte binary signal files. The speakers will also share details of one of the key components of this implementation: Stocator, an open source (Apache License 2.0) object store connector for Hadoop and Apache Spark, designed specifically to optimize the performance of both with object stores. Learn how Stocator works, and see how it greatly improved performance and reduced resource usage, both for ground-to-cloud uploads of very large signal files and for subsequent access to the radio data for analysis with Spark. Session hashtag: #SFeco2