Databricks and Informatica Integration Simplifies Data Lineage and Governance for Cloud Analytics
In a rapidly evolving world of big data, data discovery, governance and data lineage are essential aspects of data management. As organizations modernize their workloads into multi-cloud and hybrid environments, data starts to get distributed across cloud data lakes and SaaS applications. With that, organizations are trying to answer key questions: How do I...
How Databricks and Privacera Combine to Secure Data for Cloud Analytics
In their quest to anticipate customer needs, forward-looking organizations are looking to use cloud-based analytics and AI to innovate. But we often hear from customers how challenging it is to manipulate large data volumes in a secure and compliant way. Databricks and Privacera have partnered to help customers address several key use cases for...
Automate and Fast-track Data Lake and Cloud ETL with Databricks and StreamSets
Data lake ingestion is a critical component of a modern data infrastructure. But enterprises often run into challenges when they have to use this data for analytics and machine learning workloads. Consolidating high volumes of data from disparate sources into a data lake is difficult, even more so if it is from both batch and...
Solving the Challenge of Big Data Cloud Migration with WANdisco, Databricks and Delta Lake
This is a guest blog from Paul Scott-Murphy, VP of Product Management, Big Data / Cloud at WANdisco. Migrating from Hadoop on-premises to the cloud has been a common theme in recent Databricks blog posts and conference sessions. They’ve identified key considerations, highlighted partnerships and described solutions for moving and streaming data to the cloud...
Simplify Data Lake Access with Azure AD Credential Passthrough
Azure Databricks brings together the best of Apache Spark, Delta Lake, and the Azure cloud. The close partnership provides integrations with Azure services, including Azure’s cloud-based role-based access control, Azure Active Directory (AAD), and Azure’s cloud storage, Azure Data Lake Storage (ADLS). Even with these close integrations, data access control continues to prove a challenge for...
How Informatica Data Engineering Goes Hadoop-less with Databricks
Back in May, we announced our partnership with Informatica to build out a rich set of integrations between our two platforms. It’s been exciting work for the team because of what we can do for joint customers that combine our Managed Delta Lake with Informatica’s Big Data Management and Enterprise Data Catalog. The vision led...
Guest Blog: Using Databricks, MLflow, and Amazon SageMaker at Brandless to Bring Recommendation Systems to Production
This is a guest blog from Adam Barnhard, Head of Data at Brandless, Inc., and Bing Liang, Data Scientist at Brandless, Inc. Launched in July 2017, Brandless makes hundreds of high-quality items, curated for every member of your family and room of your home, and all sold at more accessible price points than similar products on the market. We...
Using ML and Azure to improve Customer Lifetime Value: On-Demand Webinar and FAQ Now Available!
On July 18th, we hosted a live webinar, Using ML and Azure to improve customer lifetime value, with Rob Saker, Industry Leader - Retail Industry; Colby Ford, Associate Faculty - School of Data Science, UNC Charlotte; and Navin Albert, Solutions Marketing Manager at Databricks. This blog has a recording of the webinar and some of...
Migrating Transactional Data to a Delta Lake using AWS DMS
Note: We also recommend you read Efficient Upserts into Data Lakes with Databricks Delta, which explains the use of the MERGE command to do efficient upserts and deletes. Challenges with moving data from databases to data lakes Large enterprises are moving transactional data from scattered data marts in heterogeneous locations to a centralized data lake. Business data...
Databricks is a Diamond Partner at Snowflake Summit
Here at Databricks, we are excited to participate in the first Snowflake Summit as a Diamond Partner. The event takes place June 3-6 at the Hilton San Francisco Union Square and is another great opportunity to share how Databricks and Snowflake have partnered to provide: Massively scalable data pipelines. Pipelines running on Databricks can...