Sandy is the original proponent of Databricks on Azure and a poster child of the Azure Databricks revolution. As a contributor to Apache Parquet, Sandy has been a strong advocate of big data technologies. He is currently a lead consultant for Elastacloud and engineering lead for Renewables.AI, a wholly cloud-based renewables platform with offices in the UK. With a strong academic background in robotics and AI, Sandy is a leading light in using Spark to process IoT data at scale. His most recent open-source project, SparkImputations, brings the imputation power of R libraries to Apache Spark.
Renewables AI is at the forefront of innovation in the solar energy market. As the name suggests, we use AI to predict energy output from large portfolios of solar farms. This talk lays out the fundamental architecture, technology and approaches that make the platform work, beginning with key features of the Azure Databricks cloud and how it works seamlessly with Azure Data Lake and Azure Event Hubs. It also covers ML and DL Pipelines and how they are used with image recognition and machine learning through Structured Streaming to make real-time decisions.

Key Takeaways:
- Prediction of next-day irradiance and power ratios with real-time accuracies of 95%
- Structured streaming of IoT data from hundreds of thousands of inverters at 5-minute intervals
- Real-time joining of weather data and several other external datasets
- Use of Deep Learning Pipelines and advanced time series methods to predict 48 hours of future energy production
- Near-real-time processing of image data at frequent intervals to predict cloud cover from onsite cameras and drones
- Analysis of data for preventative maintenance against fan failures in solar inverters

Session hashtag: #SAISDD11
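
To make the idea of 5-minute inverter data joined with weather data concrete, here is a minimal sketch in plain Python. It is not the platform's code and does not use Spark: it only illustrates the general pattern of flooring readings into 5-minute tumbling windows, averaging per inverter, and inner-joining with weather observations keyed by the same window. All names (`window_start`, `mean_power_by_window`, `join_weather`) and all sample values are hypothetical.

```python
from collections import defaultdict
from datetime import datetime

WINDOW_MINUTES = 5  # matches the 5-minute interval mentioned in the abstract


def window_start(ts: datetime) -> datetime:
    """Floor a timestamp to the start of its tumbling window."""
    minute = ts.minute - ts.minute % WINDOW_MINUTES
    return ts.replace(minute=minute, second=0, microsecond=0)


def mean_power_by_window(readings):
    """Average power per (inverter_id, window start).

    `readings` is an iterable of (inverter_id, timestamp, power_kw) tuples.
    """
    sums = defaultdict(lambda: [0.0, 0])  # key -> [running total, count]
    for inverter_id, ts, power_kw in readings:
        acc = sums[(inverter_id, window_start(ts))]
        acc[0] += power_kw
        acc[1] += 1
    return {key: total / count for key, (total, count) in sums.items()}


def join_weather(power_by_window, weather_by_window):
    """Inner-join averaged power with weather values keyed by window start."""
    rows = []
    for (inverter_id, start), mean_kw in sorted(power_by_window.items()):
        if start in weather_by_window:  # windows without weather are dropped
            rows.append((inverter_id, start, mean_kw, weather_by_window[start]))
    return rows


# Made-up sample data, for illustration only.
readings = [
    ("inv-1", datetime(2018, 10, 3, 12, 1), 40.0),
    ("inv-1", datetime(2018, 10, 3, 12, 4), 44.0),
    ("inv-1", datetime(2018, 10, 3, 12, 6), 50.0),
]
weather = {datetime(2018, 10, 3, 12, 0): 610.0}  # hypothetical irradiance value

joined = join_weather(mean_power_by_window(readings), weather)
```

In a streaming engine the same shape would typically be expressed as a windowed aggregation followed by a join, with the engine handling late data and state; the sketch above ignores both.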