Mathan Pillai

BigData Solutions Engineer, Databricks

Mathan Pillai is a Senior Spark Technical Solutions Engineer and a machine learning practitioner with over 15 years of industry experience. He helps customers solve their big data problems using Spark SQL, Delta Lake, and Structured Streaming. He has extensive expertise in programming languages and the Spark framework. Before joining Databricks, he was a technical lead helping insurance and real estate companies deploy their big data ETL pipelines using Spark.

Past sessions

Summit Europe 2020: Operating and Supporting Delta Lake in Production

November 18, 2020 04:00 PM PT

Delta Lake is widely adopted, but there are things to be aware of when dealing with petabytes of data in it. Making smart decisions gives the best efficiency and increases the adoption of Delta. Best practices like OPTIMIZE and ZORDER have to be chosen wisely. We have support stories where we successfully resolved performance issues by applying the right performance strategy. There is a set of common issues and recurring questions that our strategic customers face when using Delta; in this session we cover them and how to address them.

Contents:

  • Internals of features like optimized writes and auto-optimize: when (not) to use them, trade-offs, and common issues
  • How Delta solves the problem of a large number of small files
  • Dealing with huge Delta transaction logs: understanding Delta logs and why it is important to understand the Delta log structure
  • Delta Lake related Configs & Exceptions
  • Cool tips & tricks using Delta Lake commands

Speakers: Harikrishnan Kunhumveettil and Mathan Pillai
