Skip to main content
Company Blog

For years Hadoop was the default technology for big data analytics. But over time, it has fallen behind as new technologies have been introduced to provide better analytics solutions. Many organizations are looking at their Hadoop costs and trying to justify migrating to a modern cloud-based analytics platform. Databricks just released the whitepaper, “The Hidden Value of Hadoop Migration.” This whitepaper shows how practitioners can frame the business value of migrating to Hadoop to, for example, convince their leadership team to invest in switching over.

Databricks provides $13.8M+ potential value

There are three key topics in the whitepaper:

  1. Moving to a cloud-based analytics platform to reduce the operational costs of licenses and maintenance.
  2. The power of a modern cloud-based analytics platform to address more advanced use cases with huge business impact. This is the Hidden Value of Hadoop Migration.
  3. Avoiding the temptation to migrate your Hadoop platform in the cloud increases migration costs and delays the migration’s productivity and innovation gains.

Recognizing the true cost of ownership

Many organizations initially look to migrate from Hadoop on-premises to lower their total cost of ownership (TCO). The focus is on the licensing cost of their Hadoop system, which alone makes a compelling case to migrate. Yet, getting internal funding can still be challenging since surface-level analysis doesn’t accurately depict just how costly Hadoop can be—and how valuable migrating to cloud-based solutions are. To get a real sense of what Hadoop costs your organization, you have to step back.

Licensing costs are usually the smallest component of the TCO. From a benchmark of Databricks customers, we found that licensing is less than 15% of the total cost, whereas data center management is nearly 50%. Additional costs include:

  • Hardware overcapacity is a given in on-premises implementations so that you can scale up to your largest needs, but much of that capacity sits idle most of the time.
  • Scaling costs add up fast. The ability to separate storage and compute does not exist in an on-premises Hadoop implementation, so costs grow as datasets grow. On top of that, to achieve significant insights, organizations need big datasets that cross many different sources, so this cost is part of the process.
  • DevOps burden is another factor. Based on our customers’ experience, you can assume 4-8 full-time employees for every 100 nodes.
  • Power costs can be as much as $800 per server per year based on consumption and cooling. That’s $80K per year for a 100 node Hadoop cluster!
  • Purchasing new and replacement hardware accounts for ~20% of TCO—that’s equal to the Hadoop clusters’ administration.

When the costs are all factored in, migration becomes an obvious decision.

But the power of a modern cloud-based analytics platform to address more advanced use cases and increased productivity is the Hidden Value of Hadoop Migration. By having all your data, analytics and AI on one unified data platform, you can combine the best of data warehouses and data lakes into a lakehouse architecture, enabling collaboration on all of your data, analytics and AI workloads. It makes your data processes faster and more streamlined, improves the productivity and collaboration of your data teams, and gives you the scale to handle game-changing machine learning use cases. These use cases can increase revenue, reduce costs, and mitigate risk. Companies that do not adopt a modern cloud-based analytics platform will find themselves falling further and further behind organizations that invest in using data to drive their business.

Calculating the business cost of legacy technology

A typical path is to try to take the Hadoop experience and recreate it in the cloud. But it brings the same problems in terms of performance limitations, use case limitations, and employee productivity. Databricks is built to tackle these challenges while driving value for customers who migrate off Hadoop in these three areas: infrastructure, productivity, and business impact. You can learn more about our approach and real-life examples in the white paper.

Determining these hidden costs isn’t always intuitive. To help, we created a cost framework for developing estimates on the impact migration will have on your organization.

databricks hadoop migration blog image

What's next

This blog just touched the surface of how Databricks can drastically cut down costs and boost your team’s business value. To dive deeper into these topics and real-life customer examples, read   “The Hidden Value of Hadoop Migration.”  For more information, visit www.databricks.com/migration.