R Tyler Croy

Director Of Platform Engineering, Scribd

R. Tyler Croy leads the Platform Engineering organization at Scribd and has been an open source developer for over 14 years. His open source work has been in the FreeBSD, Python, Ruby, Puppet, Jenkins, and now Delta Lake communities. The Platform Engineering team at Scribd has invested heavily in Delta and has been building new open source projects to expand the reach of Delta Lake across the organization.

Past sessions

In this session we will introduce the delta-rs project, which is helping bring the power of Delta Lake outside of the Spark ecosystem. By providing a foundational Delta Lake library in Rust, delta-rs can enable native bindings in Python, Ruby, Golang, and more. We will review what functionality delta-rs supports in its current Rust and Python APIs, as well as the upcoming roadmap.

We will also give an overview of one of the first projects to use it in production: kafka-delta-ingest, which builds on delta-rs to provide a high throughput service to bring data from Kafka into Delta Lake.

In this session watch:
R Tyler Croy, Director Of Platform Engineering, Scribd

Summit Europe 2020: From Hadoop to Delta Lake and Glue for Streaming and Batch

November 17, 2020 04:00 PM PT

The modern data customer wants data now. Batch workloads are not going anywhere, but at Scribd the future of our data platform requires more and more streaming data sets. As such, our new data platform built around AWS, Delta Lake, and Databricks must simultaneously support hundreds of batch workloads in addition to dozens of new data streams, stream processing, and stream/ad-hoc workloads.

In this session we will share the progress of our transition into a streaming cloud-based data platform, and how some key technology decisions like adopting Delta Lake have unlocked previously unknown capabilities our internal customers enjoy. In the process, we’ll share some of the pitfalls and caveats from what we have learned along the way, which will help your organization adopt more data streams in the future.

Speakers: R Tyler Croy and Brian Dirking

Summit 2020: The Revolution Will be Streamed

June 25, 2020 05:00 PM PT

The modern data customer wants data now. Batch workloads are not going anywhere, but at Scribd the future of our data platform requires more and more streaming data sets. As such, our new data platform built around AWS, Delta Lake, and Databricks must simultaneously support hundreds of batch workloads in addition to dozens of new data streams, stream processing, and stream/ad-hoc workloads.

In this session we will share the progress of our transition into a streaming cloud-based data platform, and how some key technology decisions like adopting Delta Lake have unlocked previously unknown capabilities our internal customers enjoy. In the process, we'll share some of the pitfalls and caveats from what we have learned along the way, which will help your organization adopt more data streams in the future.