SPARK + AI SUMMIT | SAN FRANCISCO, CA
AGENDA IS NOW LIVE
June 22 - 26, 2020

Building Reliable Data Lakes with Delta Lake | Virtual Hands-on Lab (Rockies)

Community Event

Virtual Event

Join this virtual hands-on lab to learn how Delta Lake can help you build robust production data pipelines at scale. Delta Lake is an open-source storage layer that brings reliability to data lakes. Its reliability features include ACID transactions, scalable metadata handling, and unified streaming and batch data processing. It also offers DML commands to update, delete, and merge data across your data lifecycle, for example to meet GDPR/CCPA requirements. Delta Lake runs on top of your existing data lake, whether on Azure Data Lake Storage, AWS S3, Hadoop HDFS, or on-premises storage, and is fully compatible with Apache Spark APIs.

Unified Data Analytics Workshop with Microsoft (Southeast)

Community Event

Virtual Event

In this virtual workshop, we’ll cover best practices for enterprises using powerful open source technologies to simplify and scale their data and ML efforts. We’ll discuss how to leverage Apache Spark™, the de facto data processing and analytics engine in enterprises today, for data preparation, as it unifies data at massive scale across various sources. You’ll learn how to use ML frameworks (e.g., TensorFlow, XGBoost, Scikit-Learn) to train models based on different requirements.

Using a Wide Variety of Data to Drive Better Insights

Regional Event

Virtual Event

A new breed of financial services companies is now incorporating new, external, real-time sources of data to make financial decisions: smart asset managers are “now-casting” what’s happening in their portfolio companies to generate superior alpha, and banks are analyzing broader sets of data for superior underwriting performance and upsell/cross-sell. Insurance companies now predict claim expenses instead of reacting to them. These days, there are outstanding vendors offering access to valuable alternative data and third-party datasets, including geolocation, social media, and transaction datasets. With the increased availability of high-quality alternative data, the opportunities for competitive advantage lie in storing data efficiently, cleaning it, and combining insights across disparate datasets. In this webinar, you’ll learn how to start with a sample ticker (e.g., a plant-based meat stock), optimize storage of alternative data sources related to this ticker, and analyze ticker-related foot traffic data to guide investment decisions and increase alpha.

Succeeding with a Modern Cloud Data Architecture

Webinar

Virtual Event

There are many layers in a modern cloud data architecture, but two layers stand out because they determine success or failure: the cloud data platform and cloud data integration. This webinar by Databricks and Fivetran will drill into how these two layers work together to create a successful modern cloud data architecture.

Unified Data Analytics Workshop with Microsoft (Latin America)

Community Event

Virtual Event

In this virtual workshop, we’ll cover best practices for enterprises using powerful open source technologies to simplify and scale their data and ML efforts. We’ll discuss how to leverage Apache Spark™, the de facto data processing and analytics engine in enterprises today, for data preparation, as it unifies data at massive scale across various sources. You’ll learn how to use ML frameworks (e.g., TensorFlow, XGBoost, Scikit-Learn) to train models based on different requirements.

Building Reliable Data Lakes with Delta Lake | Virtual Hands-on Lab (N. East/E. Canada)

Community Event

Virtual Event

Join this virtual hands-on lab to learn how Delta Lake can help you build robust production data pipelines at scale. Delta Lake is an open-source storage layer that brings reliability to data lakes. Its reliability features include ACID transactions, scalable metadata handling, and unified streaming and batch data processing. It also offers DML commands to update, delete, and merge data across your data lifecycle, for example to meet GDPR/CCPA requirements. Delta Lake runs on top of your existing data lake, whether on Azure Data Lake Storage, AWS S3, Hadoop HDFS, or on-premises storage, and is fully compatible with Apache Spark APIs.

Delta Lake Webinar – Spain

Webinar

Virtual Event

This webinar will give you the opportunity to:
- Understand Delta Lake, the open source storage layer that makes data lakes reliable
- Learn to build highly scalable and reliable data pipelines using Delta Lake
- See how other data professionals have benefited from Delta Lake
- Ask the Databricks expert about your biggest challenges in the world of data

Virtual meetup: Building a Reliable Data Lake & What’s New in Spark 3.0

Meetup

Tel Aviv

Join Daniel Haviv, Solution Architect at Databricks, and Tal Sharon, Staff Big Data Engineer at Intuit, for a virtual meetup where we will dive deep into topics ranging from how to build a reliable data lake to what’s new in Spark 3.0.

Migrating On-Premises Hadoop to a Cloud Data Lake

Webinar

Webinar

Companies count on their data and analytics platforms as the foundation of their innovation and digital transformation strategy. However, many on-premises Hadoop users struggle with its high costs, system complexity, unscalable infrastructure, and DevOps burden. In this webinar, we’ll cover why companies are switching to modern cloud-based platforms like Databricks on AWS, and how they use them to drive innovation, productivity, and business outcomes while reducing the TCO of their data lake. We’ll also share a best-practice framework for migrating data and workloads to the cloud safely and securely.

Maximizing Customer Value | Best practices from Retail, CPG, & Media

Community Event

Virtual Event

In this interactive virtual workshop, learn from companies that have successfully increased customer value through advanced segmentation, using the Databricks Unified Data Analytics Platform to simplify how data is processed and analyzed for customer lifetime value (CLV) exercises. Then Bryan Smith, Databricks Global Tech Lead for Retail, will walk through three different ways of calculating CLV using retail data. The methods apply to any industry looking to understand the value of each customer from historical behavioral patterns.