Session
Spark on Databricks: Tips and Tricks
Overview
Experience | In Person |
---|---|
Type | Breakout |
Track | Data Engineering and Streaming |
Industry | Enterprise Technology |
Technologies | Apache Spark |
Skill Level | Advanced |
Duration | 40 min |
This session explores a collection of advanced and lesser-known use cases in Apache Spark™, drawn from real-world scenarios and internal experimentation.
Topics include:
- Restarting individual streams without restarting the entire cluster
- Priming schemas to handle schema evolution more effectively
- Demultiplexing events for cleaner, more scalable stream processing
- Using the Delta Kernel directly from Scala and Jupyter notebooks
- Key considerations and pitfalls when benchmarking Spark workloads
We'll also cover additional patterns and tooling tips that can help solve operational challenges and optimize performance in production Spark environments.
Session Speakers
Dattatraya Walake
SSA
Databricks
Murali Talluri
Specialist Solutions Architect
Databricks