We are happy to announce that Elsevier Labs has deployed Databricks as its unified content analysis platform, providing significant productivity gains for the entire team and reducing typical project lengths from weeks to just days.
Elsevier Labs is the advanced R&D group within Elsevier, a global provider of scientific information that publishes over 2,500 journals and 33,000 book titles and builds web-based information solutions for professionals in science, technology, and medicine.
They needed a fast and scalable analytics platform to develop new methods to extract insights from the published content. Their development process frequently required the application of complex natural language processing (NLP) algorithms to millions of articles and interpretation of the results. Prior to Databricks, Elsevier Labs’ productivity was severely hampered because:
- There was substantial manual data movement during the analytics workflow.
- The steep learning curve of their legacy analytics platform prevented code reuse.
- Presenting findings required significant additional time to build reports and UIs.
Databricks enabled Elsevier Labs to effortlessly manage Apache Spark clusters, access their data, collaboratively develop cutting-edge algorithms, and present their findings in a single platform. With the Databricks integrated workspace, the Elsevier Labs team was able to:
- Create, scale, and terminate Spark clusters without specialists with big data DevOps expertise.
- Directly access data in S3 buckets and collaboratively perform analysis in a notebook environment, using Python, Scala, SQL, or even R.
- Present findings to senior management and share results across the entire organization with account-based access control of notebooks.
As a result of deploying Databricks, Elsevier Labs enabled five times as many people to contribute to content analysis algorithm development, growing the number of contributors from 3 to 15. Moreover, the people who use Databricks are significantly more productive, reducing typical project lengths from weeks to just days.
To try out Databricks for yourself, sign up for a 14-day free trial today!