Shengsheng (Shane) Huang is a senior software architect at Intel, with 10+ years of experience in Big Data and 5+ years of experience in AI. She is an Apache Spark committer and PMC member, and is a key contributor to the open source Big Data + AI projects Analytics-zoo (https://github.com/intel-analytics/analytics-zoo) and BigDL (https://github.com/intel-analytics/BigDL). Now at Intel, she leads development of algorithms and customer applications focusing on NLP, AutoML and time series analysis.
Time series forecasting is widely used in real-world applications, such as network quality analysis in telcos, log analysis for data center operations, and predictive maintenance for high-value equipment. Classical time series forecasting methods (such as autoregression and exponential smoothing) often involve making assumptions about the underlying distribution of the data, while newer machine learning methods, especially neural networks, often treat time series forecasting as a sequence modeling problem and have recently been applied to these problems with success (e.g., [1] and [2]). However, building machine learning applications for time series forecasting can be a laborious and knowledge-intensive process. To provide an easy-to-use time series forecasting toolkit, we have applied Automated Machine Learning (AutoML) to time series forecasting. The toolkit is built on top of Ray (a distributed framework for emerging AI applications, open-sourced by UC Berkeley RISELab) so as to automate feature generation and selection, model selection, and hyper-parameter tuning in a distributed fashion. In this talk we will share how we built the AutoML toolkit for time series forecasting, as well as real-world experience and 'war stories' from early users (such as Tencent).
References: