Closing the Loop: Interactive Analysis and Visualization with Spark

One of Spark’s most compelling features is its capability for interactive analytics. Especially for complex data sets, exploration can become richer, faster, and more tactile when analytics is combined with interactive visualization. This is particularly relevant in scientific exploration, where any given data set requires many views and many approaches. This talk describes a framework that uses Spark alongside the open-source visualization server Lightning to both process and visualize data interactively. Workflows can incorporate a variety of Spark libraries, such as Spark Streaming for visualizing streaming machine learning algorithms and GraphX for displaying graph analyses. The results of user interaction within a visualization can immediately feed back into Spark analytics, even live while data is streaming. And it can all be driven by clients in either Python or Scala. Neuroscientists are using Spark and Lightning side by side to analyze large-scale recordings from mouse and zebrafish brains, and the same combination promises utility in a wide