Many people think of machine learning as something that begins with data and ends with a model. In practice, machine learning is a continuous process that begins with an application and never ends. Apache Spark has made many parts of this process dramatically easier. As active members of the Apache Spark community, we have recognized, through meet-ups, advisory boards, and client work, the challenges that practitioners face in closing the loop and adapting automatically to changing business environments. Over the last 12 months we contributed over 25,600 lines of code to Apache Spark, including Spark ML, SparkR, and PySpark, and we've brought Apache SystemML to 356,000 lines of code, laying the groundwork for machine learning in business solutions and, in particular, for an end-to-end machine learning framework. In this keynote, I will share our recent progress and where we are headed with machine learning: towards a comprehensive vision for supporting continuous machine learning more effectively.
Dinesh Nirmal is the Vice President of Dataworks Development for IBM Analytics, leading the development of IBM's cloud-based data analytics platform and data solutions, including technologies built on Apache Spark™. He drives the IBM Spark Technology Center's efforts to broaden adoption of Spark across the IBM data and analytics portfolios and the industry in general.