Nick is the Director of Data Engineering at Veraset, a data-as-a-service startup focused on understanding the world from a geospatial perspective. Nick has worked in data engineering and analytics roles for over a decade in a variety of industries, from financial services (Two Sigma, Exodus Point) to cyber security (Blue Voyant) and now geospatial analytics. His experience building, scaling, and supporting distributed data systems includes work on multiple cloud providers, on-prem clusters, proprietary distributed systems, and relational databases.
May 28, 2021 11:40 AM PT
As the data-as-a-service ecosystem continues to evolve, data brokers face an unprecedented challenge: demonstrating the value of their data. Successfully crafting and selling a compelling data product relies on a broker’s ability to differentiate their product from the rest of the market. In smaller or static datasets, measures like row count and cardinality can speak volumes. However, when datasets are in the terabytes or petabytes, differentiation becomes much more difficult. On top of that, “data quality” is a somewhat ill-defined term, and the definition of a “high quality dataset” can change daily or even hourly.
This breakout session will describe Veraset’s partnership with Databricks and how we have white-labeled Databricks to showcase and accelerate the value of our data. We’ll discuss the challenges that data brokers have faced to date and some of the primitives of our businesses that have guided our direction thus far. We will also demo our white-label instance and notebook to show how we’ve been able to provide key insights to our customers and reduce the TTFB of data onboarding.