Gray is a member of the Resident Solutions Architect team at Databricks. He works directly with Databricks' strategic customers to build big data cloud architectures, Spark solutions, and everything in between. Gray received his BS/MS in Computer Science from Grand Valley State University and served as a Hadoop System Architect at Ford Motor Company before joining Databricks.
May 27, 2021 12:10 PM PT
With data as a valuable currency and the architecture of reliable, scalable Data Lakes and Lakehouses continuing to mature, it is crucial that machine learning training and deployment techniques keep pace so that this value can be realized. Reproducibility, efficiency, and governance in training and production environments depend on two things: point-in-time snapshots of the data, and a governing mechanism to regulate, track, and make the best use of the associated metadata.
This talk will outline the challenges and importance of building and maintaining reproducible, efficient, and governed machine learning solutions, and will propose solutions built on open source technologies, namely Delta Lake for data versioning and MLflow for efficiency and governance.