Elasticsearch provides native integration with Apache Spark. However, especially during development, it is at best cumbersome to have Elasticsearch running on a separate machine or instance. But how do you run Elasticsearch inside a Spark cluster? And if you do get it running, can you still make use of the native ES integration? What libraries would you need? Can you take and restore snapshots? And what would the setup look like? Oscar will give you an in-depth look into setting up and using Elasticsearch inside a Spark cluster for development purposes. Attendees will leave this talk with a thorough understanding of how to set up an in-memory Elasticsearch instance, how to read from and write to that instance, and how to perform useful actions like snapshot/restore to S3. With Elasticsearch running inside your Spark cluster, developing with Elasticsearch on Spark will be a breeze!
Oscar studied Computer Science at Delft University of Technology. He is now a Data Scientist at Xoom, a PayPal service. Oscar is interested in data management, dataset search, online learning to rank, and Apache Spark.