Jaws - Data Warehouse with Spark SQL - Databricks

More and more companies hold large volumes of structured data that need to be verified, transformed, and analyzed. This requires a data warehouse built for large-scale advanced analytics. This talk presents Jaws, our contribution to the Spark ecosystem: an open source data warehouse built on top of Spark SQL that lets users analyze data efficiently and so supports business decisions. Focusing on performance, scalability, and analytics, Jaws aims to deliver business value through the analysis of data. Jaws lets users submit queries concurrently and asynchronously on top of a managed Spark SQL context. One of its strengths is support for in-memory processing using Tachyon. In this presentation we will go through Jaws' main features, discuss the architectural decisions we made in building this highly scalable and resilient data warehouse, and explain how to fine-tune the Spark SQL context to get the best performance during data analysis.