In this talk we will discuss how Adatao has successfully built a full-featured, powerful enterprise analytics solution with Spark. Features include web-based reporting/visualization/publishing (“basic analytics”) as well as real-time, interactive data mining and machine learning (“advanced analytics”) on large data sets. What used to take hours is now routinely accomplished in seconds. We will present the architecture that makes this possible, built on Spark/Shark/HDFS and other subsystems, with Python and R scriptable front-ends. We will also discuss some use cases where large enterprises are successfully deploying this solution, along with lessons learned.
Christopher Nguyen is CEO and co-founder of Arimo, the leader in enterprise big apps. Previously, he served as engineering director of Google Apps and co-founded two successful startups. As a professor, Christopher co-founded the Computer Engineering program at HKUST. He earned his B.S. degree from the University of California-Berkeley summa cum laude, and a Ph.D. from Stanford, where he created the first standard-encoding Vietnamese software suite, authored RFC 1456, and contributed to Unicode 1.1. He is a co-creator of the open-source Distributed DataFrame project http://ddf.io.