Nagaraj Sengodan - Databricks


Senior Manager, HCL Technologies

Experienced in architecting large-scale Business Intelligence/Data Warehouse systems, with technical leadership in data warehousing and BI. Strong ability to build competency in niche technology practice areas through a strategic mix of subject-matter expertise, industry points of view, success-story deep dives, and building solution accelerators. In addition, institutionalizes processes to hire and ramp up account teams. Responsible for strategic initiatives in the Emerging Technologies Practice for MS Advanced Analytics (cognitive services: BOT, LUIS, AML, ML Workbench, etc.; and Cortana services: HDInsight, Spark, Hive, ADL, U-SQL, ADF, SQLDW, SA, EH, etc.).


Databricks Delta Lake and Its Benefits (Summit Europe 2019)

Delta Lake is an open-source storage layer that brings ACID transactions to Apache Spark and big data workloads.

Key takeaways: the audience will learn what Delta Lake is, how it can save time and computational power, which features it supports, how to implement it in their own environment to gain these benefits, and how it works, illustrated with an example.

The presentation covers the following features:

- ACID Transactions: Data lakes typically have multiple data pipelines reading and writing data concurrently, and without transactions, data engineers must go through a tedious process to ensure data integrity. Delta Lake brings ACID transactions to data lakes, providing serializability, the strongest isolation level.

- Scalable Metadata Handling: In the big data world, even the metadata itself can be "big data". Delta Lake treats metadata just like data, leveraging Spark's distributed processing power to handle it. As a result, Delta Lake can handle petabyte-scale tables with billions of partitions and files with ease.

- Time Travel (data versioning): Delta Lake provides snapshots of data, enabling developers to access and revert to earlier versions of data for audits, rollbacks, or to reproduce experiments.

- Open Format: All data in Delta Lake is stored in the Apache Parquet format, enabling Delta Lake to leverage the efficient compression and encoding schemes that are native to Parquet.

- 100% Compatible with the Apache Spark API: Developers can use Delta Lake with their existing data pipelines with minimal change, as it is fully compatible with Spark, the commonly used big data processing engine.
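The transaction-log idea behind the ACID commits and time travel described above can be sketched with a toy, in-memory model: each write atomically appends an immutable entry to an ordered log, and readers reconstruct the table by replaying the log up to a chosen version. This is an illustration only; the `ToyDeltaLog` class and its methods are invented for this sketch and are not part of Delta Lake's actual API (real Delta Lake persists its log as JSON files next to Parquet data files and is used through Spark).

```python
import json
from typing import Any, List, Dict, Optional


class ToyDeltaLog:
    """Toy, in-memory sketch of a Delta-style transaction log.

    Each commit appends one immutable JSON entry to an ordered log.
    Readers replay the log up to a chosen version, which is what
    makes "time travel" (reading an earlier snapshot) possible.
    """

    def __init__(self) -> None:
        self._log: List[str] = []  # append-only list of serialized commits

    def commit(self, rows: List[Dict[str, Any]]) -> int:
        """Atomically record a batch of rows; returns the new version number."""
        self._log.append(json.dumps(rows))
        return len(self._log) - 1

    def read(self, version: Optional[int] = None) -> List[Dict[str, Any]]:
        """Replay commits 0..version (latest snapshot if version is None)."""
        if version is None:
            version = len(self._log) - 1
        rows: List[Dict[str, Any]] = []
        for entry in self._log[: version + 1]:
            rows.extend(json.loads(entry))
        return rows


table = ToyDeltaLog()
v0 = table.commit([{"id": 1}])   # version 0
v1 = table.commit([{"id": 2}])   # version 1
print(table.read())              # latest snapshot: rows from both commits
print(table.read(version=v0))    # time travel: only the first commit's rows
```

Because commits are whole-batch and append-only, a reader never observes a half-written batch, which is the essence of the atomicity and snapshot-isolation guarantees the talk describes.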