Harikrishnan Kunhumveettil

Sr. TSE, Databricks

Harikrishnan Kunhumveettil is a Senior Spark TSE at Databricks. In his current role, he works with strategic customers of Databricks, helping them resolve challenging Spark issues and implement best practices. His areas of expertise are Delta, Structured Streaming, and Spark SQL. Harikrishnan is a big data enthusiast with expertise in solving scalability and performance issues in various distributed systems. He started his career as a Hadoop application developer and worked on building the first data lake at Nielsen. He has a decade of experience working on multiple Hadoop platforms and cloud technologies. Before joining Databricks, he led the Spark TSE team at MapR.

Past sessions

Summit Europe 2020: Operating and Supporting Delta Lake in Production

November 18, 2020 04:00 PM PT

Delta Lake is widely adopted. There are things to be aware of when dealing with petabytes of data in Delta Lake, and making smart decisions about them can give the best efficiency and increase the adoption of Delta. Best practices like OPTIMIZE and ZORDER have to be chosen wisely. We have support stories where we successfully resolved performance issues by applying the right performance strategy. There is a set of common issues and recurring questions that our strategic customers face when using Delta. In this session we cover them and how to address them.

Contents:

  • Internals of features like Optimize Writes and Auto Optimize. When (not) to use them. Trade-offs. Common issues
  • How Delta solves the problem of a large number of small files
  • Dealing with huge Delta transaction logs. Understanding Delta logs. Why it is important to understand the Delta log structure.
  • Delta Lake related Configs & Exceptions
  • Cool tips & tricks using Delta Lake commands

Speakers: Harikrishnan Kunhumveettil and Mathan Pillai