Finding policies that lead to optimal outcomes for an organization are some of the most difficult challenges facing decision makers within an organization. The reason for it is the fact that policies are not made in a world with perfect information and markets in equilibrium. These are complex systems where the behavior of entities within the system are dynamic and generally uncertain. Reinforcement Learning (RL) has gained popularity for modeling complex behavior to identify optimal strategy. RL maps states or situations to actions in order to maximize some result or reward. The Markov Decision Process (MDP) is a core component of the RL methodology. The Markov chain is a probabilistic model that uses the current state to predict the next state. This presentation discusses using PySpark to scale an MDP example problem. When simulating complex systems, it can be very challenging to scale to large numbers of agents, due to the amount of processing that needs to be performed in memory as each agent goes through a permutation. PySpark allows us to leverage Spark for the distributed data processing and Python to define the states and actions of the agents.
Justin Brandenburg is a Resident Machine Learning Engineer in Databricks Professional Services. Justin has experience in a number of data areas ranging from counter-narcotics to cyber intrusion analysis. In past projects, he has utilized machine learning, econometrics, graph analytics and agent-based modeling to fulfill the customer needs. He has an undergraduate degree in Economics from Va Tech, a Masters in Economics from Johns Hopkins University and a Masters in Computational Social Science from George Mason University.