Scalable AutoML for Time Series Forecasting using Ray

Time series forecasting is widely used in real-world applications, such as network quality analysis in telcos, log analysis for data center operations, and predictive maintenance for high-value equipment. Classical time series forecasting methods (such as autoregression and exponential smoothing) often involve making assumptions about the underlying distribution of the data, while newer machine learning methods, especially neural networks, often treat time series forecasting as a sequence modeling problem and have recently been applied to these problems with success (e.g., [1] and [2]). However, building machine learning applications for time series forecasting can be a laborious and knowledge-intensive process. To provide an easy-to-use time series forecasting toolkit, we have applied Automated Machine Learning (AutoML) to time series forecasting. The toolkit is built on top of Ray (a distributed framework for emerging AI applications, open-sourced by UC Berkeley RISELab) so as to automate feature generation and selection, model selection, and hyper-parameter tuning in a distributed fashion. In this talk we will share how we built the AutoML toolkit for time series forecasting, as well as real-world experience and ‘war stories’ from early users (such as Tencent).

References:

  1. Guokun Lai, Wei-Cheng Chang, Yiming Yang, Hanxiao Liu. ‘Modeling Long- and Short-Term Temporal Patterns with Deep Neural Networks’
  2. Nikolay Laptev, Slawek Smyl, Santhosh Shanmugam. ‘Engineering Extreme Event Forecasting at Uber with Recurrent Neural Networks’
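
The abstract describes automating model selection and hyper-parameter tuning by running many trials in parallel. The following is a minimal sketch of that idea, not the toolkit's implementation: it uses only the Python standard library, a thread pool stands in for Ray's distributed execution, and the two toy forecasters, the synthetic series, and every function name (make_series, forecast, evaluate, sample_config, search) are assumptions made for illustration.

```python
import random
from concurrent.futures import ThreadPoolExecutor


def make_series(n=200):
    # Synthetic data (illustrative only): weekly pattern plus a small linear trend.
    return [10.0 + 0.05 * t + 3.0 * ((t % 7) - 3) for t in range(n)]


def forecast(history, config):
    """One-step-ahead forecast from `history` using a toy model.

    'seasonal_naive' repeats the value observed `lag` steps back;
    'moving_average' averages the last `lag` observations.
    """
    lag = config["lag"]
    if config["model"] == "seasonal_naive":
        return history[-lag]
    return sum(history[-lag:]) / lag


def evaluate(series, config, n_val=50):
    """Mean absolute error of one-step forecasts over the last n_val points."""
    errors = []
    for t in range(len(series) - n_val, len(series)):
        errors.append(abs(series[t] - forecast(series[:t], config)))
    return sum(errors) / len(errors)


def sample_config(rng):
    # The search space covers both the model choice and its hyper-parameter.
    return {
        "model": rng.choice(["seasonal_naive", "moving_average"]),
        "lag": rng.randint(1, 14),
    }


def search(series, n_trials=40, seed=0, workers=4):
    """Random search: score each sampled config in parallel, keep the best."""
    rng = random.Random(seed)
    configs = [sample_config(rng) for _ in range(n_trials)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        scores = list(pool.map(lambda c: evaluate(series, c), configs))
    best = min(range(n_trials), key=lambda i: scores[i])
    return configs[best], scores[best]


if __name__ == "__main__":
    best_config, best_mae = search(make_series())
    print(best_config, best_mae)
```

Each trial is independent of the others, which is why this kind of search parallelizes well; in a distributed setting the per-trial evaluation would be dispatched to remote workers instead of local threads.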

 
About Shengsheng Huang

Intel

Shengsheng (Shane) Huang is a senior software architect at Intel, with 10+ years of experience in Big Data and 5+ years of experience in AI. She is an Apache Spark committer and PMC member, and a key contributor to the open source Big Data + AI projects Analytics Zoo (https://github.com/intel-analytics/analytics-zoo) and BigDL (https://github.com/intel-analytics/BigDL). At Intel, she leads the development of algorithms and customer applications focusing on NLP, AutoML, and time series analysis.

About Jason Dai

Intel Corporation

Jason Dai is a senior principal engineer and CTO of Big Data Technologies at Intel, where he leads the global engineering teams (in both Silicon Valley and Shanghai) developing advanced data analytics and machine learning. He is the creator of BigDL and Analytics Zoo, a founding committer and PMC member of Apache Spark, and a mentor of Apache MXNet. For more details, please see https://jason-dai.github.io/.