From more efficient data processing to streamlined machine learning, the data industry reflects on innovations from the past year and discusses trends to look out for in 2019
San Francisco – January 15, 2019 – 2018 was an unmatched year for the tech industry, with several sectors including artificial intelligence (AI), big data and analytics garnering increased investment and innovation. With 90 percent of enterprises already investing in AI technologies, the steady momentum shows immense opportunities for growth – for both technology providers and the customers they serve. Databricks, the leader in unified analytics and founded by the original creators of Apache Spark™, sees 2019 as the year that more companies solve the world’s toughest data problems that have hindered AI initiatives across industries. This perspective is shared by data thought leaders who advise on AI, big data and analytics trends that inspired them in 2018, and those on the horizon for 2019:
- Talent Continues to be a Focus for AI: According to Bradley Kent, the AVP of program analytics at LoyaltyOne, the lack of talent is the biggest factor in the path to production. Talent is hard to find, expensive, and often asked to be ‘unicorns’ in their organization. That core issue won’t go away but more vertical-specific solutions will come up and frameworks will seek to automate more of the process.
- Data Processing Still the Biggest Challenge: As an industry we tend to believe that data scientists are spending the majority of their time developing models, shares Databricks CEO and co-founder Ali Ghodsi. Truth be told, data processing remains the hardest and most time-consuming part of any AI initiative. The highly iterative nature of AI forces data teams to switch between data processing tools and machine learning tools. For organizations to succeed at AI in 2019, they have to leverage a platform that unifies these disparate tools.
- Streamlining Machine Learning Workflow: Machine learning is a data challenge, according to Matei Zaharia, chief technologist and co-founder at Databricks. Large tech companies, with unlimited data, resources, and talent, have invested significantly in the development of custom machine learning platforms. But, what about the rest of us? Developing tools to standardize the machine learning process – essentially, making it repeatable regardless of data sets, tools or specific deployment methods – will definitely impact if and when organizations achieve AI.
- AI Gets Leveraged across the Business: AI has been inspiring in showing what’s possible, according to the head of data science at Quby, Stephen Galsworthy. There are numerous examples spanning sectors of how AI can be truly transformative. However, there are continuing business realities and internal scaling and process challenges. So, I see the need for a lot of innovation around the less sexy stuff: Cost optimization tools, automated accounting, and administration of big data/analytics platforms.
- Developing Trust with ‘Explainable AI’: 2018 saw an intensified focus on data bias, trust and transparency in AI – an idea that has implications socially, economically and commercially. According to Mainak Mazumdar, chief research officer at Nielsen, it is critical to develop AI that is explainable, provable and transparent. This journey towards trusted systems truly starts with the quality of data used for AI training. This renewed focus in 2018 on labeled data that can be verified, validated and explained is exciting, as is the prospect that ‘Explainable AI’ can lay the foundations for AI systems that can be both generalized across use cases and trusted.
- Innovations with Real Time Data: Stephen Harrison, a data scientist for Rue Gilt Groupe, says streaming in and of itself is not really brand new. But Rue Gilt Groupe is planning to leverage streaming data for significant innovations in 2019, like real-time recommendations based on up-to-the-minute data from our order management, click tracking, and other systems. This is especially important for us because we’re a flash sale retail site, with products and online browsing and purchase behaviors changing by the minute.
- Deep Learning Pays Dividends: According to Kamelia Aryafar, chief algorithm officer at Overstock, deep learning innovations will create a lot of new AI applications, some of which are already in production and making massive changes in the industry. We’re currently using deep learning on projects, from email campaigns with predictive taxonomies to personalization modules that infer user style with deep learning. Deep learning will continue to improve core AI and machine learning algorithms.
Solving the world’s toughest data problems starts with bringing all of the data teams together within an organization. Data science and engineering teams’ ability to innovate faster has historically been hindered by poor data quality, complex machine learning tool environments, and limited talent pools. Additionally, organizational separation creates friction and slows projects down, becoming an impediment to the highly iterative nature of AI projects. Much like in 2018, organizations that leverage Unified Analytics will have a competitive advantage with the ability to build data pipelines across various siloed data storage systems and to prepare labeled datasets for model building, which allows organizations to apply AI to their existing data and to do so iteratively on massive data sets.
For more information about Databricks’ Unified Analytics Platform, visit www.databricks.com.
Databricks’ mission is to accelerate innovation for its customers by unifying Data Science, Engineering and Business. Founded by the original creators of Apache Spark, Databricks provides a Unified Analytics Platform for data science teams to collaborate with data engineering and lines of business to build data products. Users achieve faster time-to-value with Databricks by creating analytic workflows that go from ETL and interactive exploration to production. The company also makes it easier for its users to focus on their data by providing a fully managed, scalable, and secure cloud infrastructure that reduces operational complexity and total cost of ownership. Databricks, venture-backed by Andreessen Horowitz, NEA and Battery Ventures, among others, has a global customer base that includes Viacom, Shell and HP.
Apache, Apache Spark and Spark are trademarks of the Apache Software Foundation.