This is a collaborative post by Databricks and CARTO. We thank Javier de la Torre, Founder and Chief Strategy Officer at CARTO for his contributions.
Today, CARTO is announcing the beta launch of a new product, the Spatial Extension for Databricks. It offers simple installation and seamless integration with the Databricks Lakehouse platform, bringing a broad and powerful collection of spatial analysis capabilities to the lakehouse. CARTO is a cloud-native geospatial technology company, working with some of the world’s largest companies to drive location intelligence and insights.
One of our joint customers, JLL, is already leveraging the spatial capabilities of CARTO and the power and scale of Databricks. As a world leader in real estate services, JLL manages 4.6 billion square feet of property and facilities and handles 37,500 leasing transactions globally. Analyzing and understanding location data is a fundamental driver of JLL’s success.
To leverage location data and analytics across the entire organization, their data team needs to serve the most advanced spatial data scientists and real estate consultants in the field.
JLL turned to CARTO to develop some of their solutions (Gea, Valorem, Pix, CMQ), which their consultants use for market analysis and property valuation. The solutions required market localization for multiple countries across the globe. Access to big data and data science (using Databricks) and a rich user experience were key priorities to ensure consultants would adopt the tools in their day-to-day work.
By leveraging CARTO and Databricks together, JLL is able to provide an incredibly advanced infrastructure for data scientists to perform data modeling on the fly, as well as a platform to easily build solutions for stakeholders across the business. By unlocking first-class map-based data visualizations and data pipeline solutions through a single platform, JLL is able to decrease complexity, save time (and therefore human resources) and avoid mistakes in the DataOps, DevOps and GISops processes.
The solution has led to faster deliveries on client mandates, extended consultant knowledge (beyond their traditional in-depth knowledge of their regions) and brand positioning for JLL as a highly data-driven and location-aware firm in the real estate industry. Discover the specifics by downloading the case study.
“CARTO Spatial Extension for Databricks represents a huge advance on spatial platforms. With cloud-native push-down queries to the Databricks Lakehouse platform, we have now the best analytics and mapping platform working together. With the volumes of data we are operating right now, no other solution could match the performance and convenience of this cloud native approach.” – Elena Rivas – Head of Engineering & Data Science at JLL
Bringing fully cloud-native spatial analytics to Databricks
CARTO extends Databricks to support spatial workflows natively, enabling users to:
- Import spatial data into Databricks using many spatial data formats, such as GeoJSON, shapefiles, KML, CSV, GeoPackage and more.
- Perform spatial analytics using Spatial SQL similar to PostGIS, but with the scalability of Apache Spark™.
- Use CARTO Builder to create insightful maps from SQL, then style and explore these geovisualizations with a full cartographic tool.
- Build map applications on top of Databricks using Google Maps or other providers, combined with the power of the deck.gl visualization library.
- Access more than 10,000 curated location datasets, such as demographics, human mobility or weather data to enrich your spatial analysis or apps using Delta Sharing.
Spatially extended with GeoMesa and the CARTO Analytics Toolbox
CARTO extends Databricks using user-defined functions (UDFs) to add spatial support. Over the last few months, the teams at CARTO and Azavea have been working on a new open source library called the CARTO Analytics Toolbox, which exposes GeoMesa spatial functionality as a set of spatial UDFs. Think of PostGIS for Spark.
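To make the idea of a spatial UDF concrete, here is a minimal sketch of the kind of predicate such a function evaluates: a point-in-polygon test of the sort a PostGIS-style "contains" function performs. It is plain Python using the classic ray-casting technique, not the GeoMesa or Analytics Toolbox implementation; the function name, the `territory` polygon and the sample points are all hypothetical.

```python
from typing import List, Tuple

Point = Tuple[float, float]  # (x, y), for example (longitude, latitude)


def point_in_polygon(point: Point, ring: List[Point]) -> bool:
    """Ray-casting test: True if `point` lies strictly inside `ring`.

    `ring` is a list of polygon vertices; the last vertex is implicitly
    joined back to the first. Points exactly on an edge are not treated
    specially in this sketch.
    """
    x, y = point
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        # Does this edge straddle the horizontal line through the point?
        if (y1 > y) != (y2 > y):
            # x coordinate where the edge crosses that horizontal line
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            # Count only crossings to the right of the point
            if x < x_cross:
                inside = not inside
    return inside


# Hypothetical data: a square territory and two candidate locations.
territory = [(0.0, 0.0), (4.0, 0.0), (4.0, 4.0), (0.0, 4.0)]
print(point_in_polygon((2.0, 2.0), territory))  # location inside the square
print(point_in_polygon((5.0, 2.0), territory))  # location outside the square
```

In a Spark setting, a predicate like this would typically be registered as a UDF and called from SQL against geometry columns; a production library also handles holes, multipolygons, edge cases and spatial indexing, which this sketch leaves out.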
CARTO needs to have this library available on your Databricks cluster to push down spatial queries. Check out the documentation on how to install the Analytics Toolbox in your cluster.
Now that we have spatial support in our cluster, we can go to CARTO and connect to it. You do so by navigating to the connections section and filling in the details for your ODBC connection.
Write SQL, get maps
Once connected to Databricks, we can explore spatial data or build a map from scratch. In CARTO you create a map by adding layers defined in SQL. This SQL is executed on a Databricks cluster dynamically – if data changes, the map updates automatically. Internally, CARTO checks the size of the geographic data and decides the most effective way to transfer data, either as a single document or as a set of tiles.
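The choice between a single document and a set of tiles can be pictured with a small sketch. This is not CARTO's actual logic: the size cutoff `MAX_DOCUMENT_BYTES` and the function names are assumptions made for illustration, and the tile indexing uses the standard Web Mercator XYZ scheme common to web maps.

```python
import math
from typing import List, Tuple

# Hypothetical cutoff: the source does not say how CARTO makes this decision.
MAX_DOCUMENT_BYTES = 30 * 1024 * 1024


def choose_transfer_mode(estimated_bytes: int,
                         max_document_bytes: int = MAX_DOCUMENT_BYTES) -> str:
    """Return "document" for small results, "tiles" for large ones."""
    return "document" if estimated_bytes <= max_document_bytes else "tiles"


def lonlat_to_tile(lon: float, lat: float, zoom: int) -> Tuple[int, int]:
    """Standard Web Mercator (XYZ) tile column and row for a coordinate."""
    n = 2 ** zoom
    x = int((lon + 180.0) / 360.0 * n)
    lat_rad = math.radians(lat)
    y = int((1.0 - math.asinh(math.tan(lat_rad)) / math.pi) / 2.0 * n)
    # Clamp to the valid range so edge coordinates stay on the grid
    return min(max(x, 0), n - 1), min(max(y, 0), n - 1)


def tiles_for_bbox(west: float, south: float, east: float, north: float,
                   zoom: int) -> List[Tuple[int, int, int]]:
    """List the (zoom, x, y) tiles covering a bounding box."""
    x_min, y_min = lonlat_to_tile(west, north, zoom)  # top-left corner
    x_max, y_max = lonlat_to_tile(east, south, zoom)  # bottom-right corner
    return [(zoom, x, y)
            for x in range(x_min, x_max + 1)
            for y in range(y_min, y_max + 1)]


# A small result travels as one document; a large one is split into tiles.
print(choose_transfer_mode(2 * 1024 * 1024))
print(choose_transfer_mode(500 * 1024 * 1024))
print(tiles_for_bbox(-10.0, -10.0, 10.0, 10.0, zoom=1))
```

With tiles, the client requests only the tiles in the current viewport and zoom level, so the amount of data sent per request stays bounded even when the underlying table is very large.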
Building map applications on top of Databricks
Customers like JLL very often build custom spatial applications that either simplify a spatial analysis use case or provide a more direct interface to business intelligence or information. CARTO facilitates the creation of these apps with a complete set of development libraries and APIs.
For visualization, CARTO uses the powerful deck.gl visualization library. You design your maps in CARTO Builder and then reference them in your code. CARTO will handle visualizing large datasets, updating the maps, and everything in between.
Everything happens somewhere
Location is a fundamental dimension for many different analytical workflows. You can find it in many use cases in pretty much every vertical. Here’s just a sample of the kinds of things CARTO customers have been doing with Spatial Analytics.
Towards full cloud-native support of CARTO in Databricks
Many of the largest organizations using CARTO leverage Databricks for their analytics. With the power of Spark and Delta Lake, connected with CARTO, it is now possible to push down all spatial workflows to Databricks clusters. We see this as a major step forward for Spatial Analytics using Big Data.
With this beta release of the CARTO Spatial Extension, we are providing the fundamental building blocks for Location Intelligence in Databricks. If you work with an external GIS (geographic information system) in parallel with Databricks, this integration will give you the best of both worlds.
Get started with Spatial Analytics in Databricks
If you would like to test drive the beta CARTO Spatial Extension for Databricks, sign up for a free 14-day trial today.
At Databricks, we’re excited to work with CARTO and supercharge geospatial analysis at scale. This collaboration opens up location-based analysis workflows for users of our Lakehouse Platform to drive even better decisions across verticals and for a wealth of use cases.
If you work with geospatial data, you will be interested in the upcoming webinar Geospatial Analysis and AI at Scale hosted by Databricks, Tuesday, December 14th. Register now.