The Azure Cognitive Services on Spark: Clusters with Embedded Intelligent Services

We present the Azure Cognitive Services on Spark, a simple, easy-to-use extension of the SparkML library that covers all Azure Cognitive Services. This integration allows Spark users to embed cloud intelligence directly into their Spark computations, enabling a new generation of intelligent applications on Spark. Furthermore, we show that with our new Containerized Cognitive Services, one can embed cloud intelligence directly into the Spark cluster for ultra-low-latency, on-premises, and offline applications.

We show how, using our integration, one can compose these cognitive services with other services, SQL computations, and deep networks to create sophisticated, intelligent, heterogeneous applications. Moreover, we show how to redeploy these compositions as RESTful services with Spark Serving. We will also explore the architecture of these contributions, which build on HTTP on Spark, a novel integration between Spark and the widely used Hypertext Transfer Protocol (HTTP). This library can integrate into the Spark ecosystem any framework that is capable of communicating through HTTP. Finally, we demonstrate how to use these services to create a large class of intelligent applications, such as custom search engines, real-time facial recognition systems, and unsupervised object detectors.
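
To make the HTTP on Spark idea concrete, the following is a minimal sketch of the underlying pattern only: each row of a partition is turned into an HTTP request, and the reply is attached to the row as a new field. It uses only the Python standard library and does not use the actual library API. Every name in it (UppercaseHandler, start_local_service, call_service_on_partition, run_demo) is hypothetical, and the local upper-casing server is an assumed stand-in for a real web service.

```python
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.request import ProxyHandler, Request, build_opener


class UppercaseHandler(BaseHTTPRequestHandler):
    """Hypothetical stand-in for a web service: upper-cases the 'text' field."""

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        payload = json.loads(self.rfile.read(length))
        body = json.dumps({"result": payload["text"].upper()}).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # silence per-request logging


def start_local_service():
    """Start the stand-in service on a free local port; return (server, url)."""
    server = HTTPServer(("127.0.0.1", 0), UppercaseHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server, "http://127.0.0.1:%d/" % server.server_port


def call_service_on_partition(rows, url):
    """Send each row as a JSON POST and yield the row plus a 'response' field.

    In a Spark job, a function shaped like this could be applied to each
    partition of a dataset; here it runs over a plain Python list.
    """
    opener = build_opener(ProxyHandler({}))  # bypass any configured proxy
    for row in rows:
        request = Request(
            url,
            data=json.dumps(row).encode("utf-8"),
            headers={"Content-Type": "application/json"},
            method="POST",
        )
        with opener.open(request, timeout=10) as reply:
            result = json.loads(reply.read())["result"]
        yield {**row, "response": result}


def run_demo():
    """Run the sketch end to end against the local stand-in service."""
    server, url = start_local_service()
    try:
        rows = [{"text": "hello"}, {"text": "spark"}]
        return list(call_service_on_partition(rows, url))
    finally:
        server.shutdown()
        server.server_close()
```

Calling run_demo() returns the two input rows, each extended with a response field holding the service's reply. Because the only contract is a request in and a response out, the same shape would apply to any framework reachable over HTTP, which is the property the abstract relies on.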


About Mark Hamilton

Mark is a software engineer on Microsoft’s Applied AI team and a machine learning PhD student at the MIT Computer Science and AI Lab. Mark leads Microsoft ML for Apache Spark, a distributed machine learning and microservice orchestration library. He has applied this work to problems in wildlife conservation, accessibility, and art museum outreach. Mark is currently researching how information theory and abstract algebra can yield new deep learning architectures in Professor William T. Freeman’s lab.

About Anand Raman

Anand is the GM and Chief of Staff for Microsoft AI. Previously, he was the Chief of Staff for the Microsoft Azure Data Group, covering Data Platforms and Machine Learning. Over the last decade, he ran the product management and development teams for Azure Data Services, Visual Studio, and Windows Server User Experience at Microsoft. Anand holds a PhD in computational fluid mechanics and worked for several years as a researcher before joining Microsoft.