Anand Ranganathan is a co-founder and the Chief AI Officer at Unscrambl, Inc. He is a data scientist, AI engineer, Big Data developer, architect, and researcher rolled into one person. He is leading Unscrambl’s product development in several cutting-edge areas, including natural language processing, conversational analytics, automated insights, real-time optimization and decision-making, and automated marketing optimization. He has worked with numerous customers worldwide to design, implement and deploy Big Data, Stream Processing, and AI-based solutions. Before joining Unscrambl, he was a Global Technical Ambassador, Master Inventor and Research Scientist at IBM. He received his Ph.D. in Computer Science from the University of Illinois Urbana-Champaign, and his BTech from the Indian Institute of Technology Madras. He also has over 70 academic journal and conference publications and 30 patent filings in his name.
May 28, 2021 11:40 AM PT
Apache Spark has been a great technology for processing and analyzing Big Data. However, it is not accessible to business users who lack technical or programming skills. In this talk, I'll cover recent efforts in the space of "conversational analytics". This paradigm allows any user to ask a bot text or voice questions about their data in natural language and receive a natural-language and visual result in return. A key technology is natural language to SQL translation: we translate a user's natural language queries into Spark SQL queries that can run against a Databricks system, using a translator that can be easily trained on different schemas and databases.
This NLP technology needs to be combined with dialog management, natural-language generation and narration, data understanding and modeling, augmented analytics, and automated visualization generation in order to achieve the goal of conversational analytics. Using such a technology, a user can ask, in plain English, "How many cases of Covid were there in the last 2 months in states that had no social distancing mandates, by type of transmission?" and then dig deeper into the results in a conversational manner to uncover hidden insights from Covid datasets in a Spark instance. We believe that having such data and insights at their fingertips can help users make appropriate decisions quickly, improve data literacy, and even help the general public overcome the scourge of fake news.
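To make the translation step concrete, here is a minimal sketch of what the final stage of such a pipeline might look like for the Covid question above: a structured intent, as a natural language parser might emit it, rendered into Spark SQL text. This is an illustration under stated assumptions, not the system presented in the talk. The `QueryIntent` class, the `to_spark_sql` function, and the table and column names (`covid_cases`, `state_mandates`, `case_count`, `report_date`, `transmission_type`, `has_distancing_mandate`) are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class QueryIntent:
    """Structured form of a question, as a (hypothetical) NL parser might emit it."""
    metric: str                                   # aggregate expression to compute
    table: str                                    # fact table, with alias
    joins: list = field(default_factory=list)     # JOIN clauses
    filters: list = field(default_factory=list)   # WHERE predicates, ANDed together
    group_by: list = field(default_factory=list)  # grouping columns


def to_spark_sql(intent: QueryIntent) -> str:
    """Render the intent as Spark SQL text, one clause per line."""
    select_cols = intent.group_by + [intent.metric]
    lines = ["SELECT " + ", ".join(select_cols), "FROM " + intent.table]
    lines += intent.joins
    if intent.filters:
        lines.append("WHERE " + " AND ".join(intent.filters))
    if intent.group_by:
        lines.append("GROUP BY " + ", ".join(intent.group_by))
    return "\n".join(lines)


# Hand-built intent for the example question; the schema is assumed, not real.
covid_intent = QueryIntent(
    metric="SUM(c.case_count) AS total_cases",
    table="covid_cases c",
    joins=["JOIN state_mandates m ON c.state = m.state"],
    filters=[
        "c.report_date >= add_months(current_date(), -2)",  # "last 2 months"
        "m.has_distancing_mandate = false",                 # "no mandates"
    ],
    group_by=["c.transmission_type"],                       # "by type of transmission"
)

print(to_spark_sql(covid_intent))
```

The resulting string could then be passed to a Spark session for execution. Keeping the intent separate from the SQL rendering is one possible design choice: the same intent could be revised during a follow-up question in the dialog without re-parsing the original text.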