As machine learning matures, the standard supervised learning setup is no longer sufficient. Instead of making and serving a single prediction as a function of a data point, machine learning applications increasingly must operate in dynamic environments, react to changes in the environment, and take sequences of actions to accomplish a goal. These modern applications are better framed within the context of reinforcement learning (RL), which deals with learning to operate within an environment. RL-based applications have already led to remarkable results, such as Google’s AlphaGo beating the Go world champion, and are finding their way into self-driving cars, UAVs, and surgical robotics. These applications have very demanding computational requirements--at the high end, they may need to execute millions of tasks per second with millisecond level latencies, and support heterogeneous and dynamic computation graphs. In this talk, we present Ray, a new cluster computing framework that meets these requirements, give some application examples, and discuss how it can be integrated with Apache Spark.