Here at Atigeo, we are always looking for ways to build on, improve, and expand our big data analytics platform, Atigeo xPatterns. More than that, both our development and product management teams are focused on big data and on knowing what is right for our customers: data scientists and application developers at companies that are seeking to make the best possible use of their data assets. So we all stay on the lookout for the most useful, advanced, and best-performing technologies available.
Apache Spark, for us, was a standout: We could see that making a dramatic performance improvement available to our customers and users would mean that xPatterns’ analytics, modeling, and machine learning would be more responsive, and that Spark in xPatterns would give our customers an even quicker path from data ingestion to useful insights.
Our development team works on nearly opposite sides of the world: here in Bellevue, Washington, just outside Seattle, and in Timisoara, Romania. The two sites stay in nearly constant touch online, on social media, and even in person. So we all shared this vision of xPatterns enhanced with Spark. We found Spark useful across many stages of our xPatterns big data pipeline, including ingestion, data transformation, interactive data exploration, and export to NoSQL.
And as we made this vision a reality, with infrastructure improvements and corresponding improvements in data modeling, the Strata conference in Santa Clara, California, lay ahead as a goal and a chance to share what we’d learned and built with the wider community.
And at Strata, sure enough, xPatterns with Spark — as well as Shark, Apache Mesos, and Tachyon — was a big hit with visitors to our booth and on social media.
Even then, we knew there was still more to do: We wanted to provide objective evidence that our integration of Spark would be robust and offer excellent performance across a wide range of Spark distributions, so that users could be sure that the benefits of Spark would be available to them regardless of their choice of Spark distribution. So we made an enormous investment across our far-flung team in further development, testing, profiling, and logging. After this broad effort, we reached out to Databricks because we saw that their certification program was in line with our vision: to prevent fragmentation and forking in the Spark community and help the Spark ecosystem grow. Additionally, the program was based on an entirely open process, with open source testing tools alongside the open source Apache Spark distribution, which gave us additional comfort.
So it is with pride, and with thanks to the team at Databricks, that we announce that xPatterns is “Certified on Spark”.