Histogram Equalized Heat Maps from Log Data via Apache Spark

Reverse geocoding is one of HERE Technologies’ most heavily used services. From its access logs, geocodes can be extracted and then counted with respect to some cellulation of the earth, creating a sparse heat map. For our Place & Address Search products, we use such a heat map to define a notion of relative place importance to rank and index addresses and places. However, the large data size, the sparsity, and the variations in traffic from which different global heat maps may be derived make faithful visualization and comparison a challenge. Additionally, common implementations of spatial image processing techniques that can help address these challenges don’t map directly onto Spark’s computing engine.
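To make the counting step concrete, here is a minimal sketch in plain Python of turning coordinates into a sparse heat map. It is an illustration only, not HERE's pipeline: the fixed-size latitude/longitude grid, the 0.5 degree cell size, and the names cell_of and sparse_heat_map are all assumptions, and the talk does not say which cellulation is actually used.

```python
import math
from collections import Counter

def cell_of(lat, lon, cell_deg=0.5):
    """Map a coordinate to the (row, col) index of a fixed-size grid cell.

    Hypothetical cellulation: a plain lat/lon grid with cells of cell_deg degrees.
    """
    row = math.floor((lat + 90.0) / cell_deg)
    col = math.floor((lon + 180.0) / cell_deg)
    return (row, col)

def sparse_heat_map(points, cell_deg=0.5):
    """Count points per cell; cells with no traffic are simply absent."""
    return Counter(cell_of(lat, lon, cell_deg) for lat, lon in points)

# Made-up sample coordinates: two near Berlin, one near Chicago.
points = [(52.52, 13.40), (52.53, 13.41), (41.88, -87.63)]
heat = sparse_heat_map(points)
```

Only cells that received at least one request appear as keys, which is what makes the map sparse. On Spark, the same idea would presumably be a keyed count over the log records rather than an in-memory Counter.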
In this talk, Arvind Rao will describe implementations of histogram equalization and kernel-based sparse image processing methods on Spark. Histogram equalization, which is best known as a method of contrast enhancement, automatically normalizes images, facilitating comparison. Along the way, Arvind will talk about how HERE uses heat maps as a feature in its autocompletion service, and say just enough about the perception of contrast to put histogram equalization in context.
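As a rough illustration of why histogram equalization normalizes a heat map, the sketch below replaces each nonzero cell count with its empirical cumulative distribution value. It is plain Python under stated assumptions, not the Spark implementation presented in the talk: the function name equalize, the choice to equalize over nonzero cells only, and the sample counts are all hypothetical.

```python
from collections import Counter

def equalize(heat):
    """Replace each nonzero count by its empirical CDF value in (0, 1].

    heat maps a cell id to a positive count (a sparse heat map).
    """
    n = len(heat)
    freq = Counter(heat.values())  # histogram: count value -> number of cells
    cdf = {}
    running = 0
    for value in sorted(freq):
        running += freq[value]     # cells with a count <= value
        cdf[value] = running / n
    return {cell: cdf[count] for cell, count in heat.items()}

# Made-up counts spanning several orders of magnitude.
heat = {"a": 1, "b": 1, "c": 10, "d": 1000}
equalized = equalize(heat)
```

Because the output depends only on the rank of each count, two heat maps built from very different traffic volumes land on the same (0, 1] scale, which is what makes them comparable. A distributed version would need the histogram and its running sum computed across partitions, a step this sketch does not attempt.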

Session hashtag: #EUds13

About Arvind Rao

Arvind is a software engineer on the Place & Address Search team at HERE Technologies. HERE's search products produce voluminous logs; his work is focused on analytics and data mining on these log data. Incidentally, Arvind has a PhD in mathematics.