Databricks and Accenture Partner to Streamline Large-Scale Machine Learning Operations
Today, we’re excited to announce Databricks’ partnership with Accenture to provide high-value Databricks services and reusable components to enterprise clients globally. Specializing in data strategy and design, data platform modernization and AI, the Accenture data and artificial intelligence (AI) team leverages Databricks’ Unified Data Analytics Platform to streamline proven methodologies for large-scale machine learning deployments....
How to Accelerate Demand Planning From 4.5 Hours to Under 1 Hour With Azure Databricks
The importance of supply chain analytics: Rapid changes in consumer purchase behavior can have a material impact on supply chain planning, inventory management, and business results. Accurate forecasts of consumer-driven demand are just the starting point for optimizing profitability and other business outcomes. Swift inventory adjustments across distribution networks are critical to ensure supply meets...
How to Save up to 50 Percent on Azure ETL While Improving Data Quality
The challenges of data quality: One of the most common issues our customers face is maintaining high data quality standards, especially as they rapidly increase the volume of data they process, analyze and publish. Data validation, data transformation and de-identification can be complex and time-consuming. As data volumes grow, new downstream use cases and applications...
Leveling the Playing Field: HorovodRunner for Distributed Deep Learning Training
This is a guest post authored by Sr. Staff Data Scientist/User Experience Researcher Jing Pan and Senior Data Scientist Wendao Liu of leading health insurance marketplace eHealth. None generates Taichi; Taichi generates two complementary forces; Two complementary forces generate four aggregates; Four aggregates generate eight trigrams; Eight trigrams determine myriads of phenomena. —Classic of Changes...
Data Access Governance and 3 Signs You Need it
This is a guest-authored post by Heather Devane, content marketing manager, Immuta. Cloud data analytics is only as powerful as the ability to access that data for use. Yet, the data stewards responsible for managing data governance often find themselves in a holding pattern, waiting for approval from various stakeholders to operationalize data assets...
Over 200K Enrolled in Databricks’ Certification and Training
More than 200,000 individuals have participated in Databricks' certification and training over the past four years, including thousands of partners. In the past year alone, over 75,000 individuals have been trained and over 1,500 customers and partners have also earned their Databricks Academy Certifications. Today, we are pleased to announce new digital badges so you...
Realizing the Lakehouse Architecture: A Fast, Reliable, Open Architecture at Low Cost
Databricks was founded under the vision of using data to solve the world’s toughest problems. We started by building upon our open source roots in Apache Spark™ and creating a thriving collection of projects, including Delta Lake, MLflow, Koalas and more. We’ve now built a company with over 1,500 employees helping thousands of data teams...
A Step-by-step Guide for Debugging Memory Leaks in Spark Applications
This is a guest-authored post by Shivansh Srivastava, software engineer, Disney Streaming Services. It was originally published on Medium.com. Just a bit of context: We at Disney Streaming Services use Apache Spark across the business and Spark Structured Streaming to develop our pipelines. These applications run on the Databricks Runtime (DBR) environment which is quite...
Top Questions from Our Lakehouse Event
We recently held a virtual event, featuring CEO Ali Ghodsi, that showcased the vision of Lakehouse architecture and how Databricks helps customers make it a reality. Lakehouse is a data platform architecture that implements similar data structures and data management features to those in a data warehouse directly on the low-cost, flexible storage used for...
Handling Late Arriving Dimensions Using a Reconciliation Pattern
This is a guest community post authored by Chaitanya Chandurkar, Senior Software Engineer in the Analytics and Reporting team at McGraw Hill Education. Special thanks to MHE Analytics team members Nick Afshartous, Principal Engineer; Kapil Shrivastava, Engineering Manager; and Steve Stalzer, VP of Engineering / Analytics and Data Science, for their contributions. Processing facts and...