Databricks FAQ
Basics
The business knows there’s gold in all that data, and your team’s job is to find it. But playing detective with clunky tools and difficult-to-set-up infrastructure is hard. You want to be the hero who figures out what’s going on with the business, yet you end up spending all your time wrestling with the tools.
We built Databricks to make big data simple. Apache Spark™ took a big step toward this mission by providing a unified framework for building data pipelines. Databricks takes this further with a zero-management cloud platform built around Spark that delivers 1) fully managed Spark clusters, 2) an interactive workspace for exploration and visualization, 3) a production pipeline scheduler, and 4) a platform for powering your favorite Spark-based applications. So instead of tackling data headaches, you can finally focus on finding answers that make an immediate impact on your business.
Availability
Technical
Databricks currently supports browser-based file uploads; pulling data from Azure Blob Storage, AWS S3, Azure SQL Data Warehouse, and Azure Data Lake Store; NoSQL data stores such as Cosmos DB, Cassandra, and Elasticsearch; JDBC data sources; HDFS; Sqoop; and a variety of other data sources supported natively by Apache Spark.
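As an illustration of how a JDBC source like those above is typically read through Spark's DataFrame reader, the sketch below assembles the option map such a read would take (for example, `spark.read.format("jdbc").options(**jdbc_options(...)).load()`). The host, database, table, and credentials are placeholders, not real endpoints, and the helper function is purely illustrative.

```python
# Illustrative only: build the option map for a Spark JDBC read.
# "url", "dbtable", "user", "password", and "fetchsize" are standard
# Spark JDBC reader options; every value below is a placeholder.

def jdbc_options(url, table, user="reader", password="secret"):
    """Return the options a Spark DataFrame JDBC read would take."""
    return {
        "url": url,            # JDBC connection string for the source database
        "dbtable": table,      # table (or subquery) to read
        "user": user,
        "password": password,
        "fetchsize": "1000",   # fetch rows in batches to limit memory use
    }

opts = jdbc_options("jdbc:postgresql://db.example.com:5432/sales",
                    "public.orders")
```

In an actual Databricks notebook, the resulting dictionary would be passed to the Spark session's reader; here it simply shows the shape of the configuration.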
Deployment
Databricks is currently available on Microsoft Azure, Amazon Web Services (AWS), and Google Cloud.
Security
Users of Databricks read from and persist data to their own datastores, using their own credentials.