AstraZeneca discovers, develops, and commercializes groundbreaking drugs for some of the world’s most serious diseases. Its biggest obstacle to innovation has been the inability to tap into all of the available scientific information faster than new data arrives. The company needed a platform for building scalable, performant data pipelines that feed machine learning models designed to help its scientists make targeted decisions. With Databricks, AstraZeneca uses data and machine learning to build a recommendation engine that helps scientists uncover novel drugs more quickly, cheaply, and effectively.
It is widely known that discovering, developing, and commercializing a new class of drugs can take 10-15 years and more than $5 billion in R&D investment, only for fewer than 5% of the drugs to make it to market. Recognizing that this pace of innovation is not sufficient, AstraZeneca moved to a data-driven approach to increase its success rate in drug discovery and to manage clinical trials more safely.
However, their scientists were still unable to quickly make informed decisions with all of the available scientific information at their fingertips. They struggled with data residing in disjointed sources, both within the company and in external public databases. Furthermore, as new scientific research continued to be released at a rapid pace, it became virtually impossible to keep up with scientific discovery.
AstraZeneca uses the Databricks Unified Data Analytics Platform on Azure to help build a knowledge graph of biological insights and facts. The graph powers a recommendation system that enables any AstraZeneca scientist to generate novel target hypotheses for any disease, drawing on all of the data available to them.
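To make the idea of a graph-backed recommender more concrete, the sketch below stores a handful of facts as subject-relation-object triples and ranks candidate targets for a disease by how many graph neighbors they share with it. It is only an illustration under stated assumptions, not AstraZeneca's implementation: the entity names, the `build_graph` and `recommend_targets` helpers, and the shared-neighbor score are all hypothetical stand-ins for whatever data and models the real system uses.

```python
from collections import defaultdict

# Toy (subject, relation, object) facts. Every name here is a made-up placeholder,
# not real biological data.
TRIPLES = [
    ("disease_A", "associated_with", "gene_1"),
    ("disease_A", "involves", "pathway_X"),
    ("disease_A", "involves", "pathway_Y"),
    ("gene_1", "participates_in", "pathway_X"),
    ("gene_2", "participates_in", "pathway_X"),
    ("gene_2", "participates_in", "pathway_Y"),
    ("gene_3", "participates_in", "pathway_Y"),
]


def build_graph(triples):
    """Index the triples as an undirected adjacency map (relation labels are ignored)."""
    neighbors = defaultdict(set)
    for subject, _relation, obj in triples:
        neighbors[subject].add(obj)
        neighbors[obj].add(subject)
    return neighbors


def recommend_targets(neighbors, disease, is_candidate):
    """Rank candidates not yet linked to the disease by number of shared neighbors."""
    known = set(neighbors[disease])
    scores = defaultdict(int)
    for intermediate in known:
        for node in neighbors[intermediate]:
            # Skip the disease itself and anything already directly linked to it.
            if node != disease and node not in known and is_candidate(node):
                scores[node] += 1
    # Highest score first; ties broken alphabetically for a stable order.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


graph = build_graph(TRIPLES)
ranking = recommend_targets(graph, "disease_A", lambda name: name.startswith("gene_"))
print(ranking)
```

In this toy graph, `gene_2` ranks first because it shares two pathways with `disease_A`, while `gene_1` is excluded because it is already a known association. A production system would replace the shared-neighbor count with a learned model and run over far larger data, but the shape of the query is the same: start from a disease, walk the graph, and surface entities that are not yet directly linked.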
Since moving to Databricks, AstraZeneca has been able to process millions of data points from thousands of sources more easily. Removing the barriers of scale has allowed its scientists to more reliably extract meaningful insights that can result in novel drugs designed to help people live healthier lives.
“By moving to Databricks, we have seen an order of magnitude improvement in performance.”
– Eliseo Papa, Computational Biologist, AstraZeneca
Technical Talk at Spark + AI Summit EU 2019