Managing high-value equipment deployed across operational sites is a common challenge for construction firms. In response, many original equipment manufacturers are connecting equipment with the Internet of Things, creating new opportunities for digital solutions that drive efficiency across the project lifecycle. According to a 2017 report by McKinsey, technology-driven solutions could improve cross-industry productivity by as much as 60%. Understanding the real-time distribution of equipment can help fleet managers reduce downtime and improve equipment utilization. By leveraging GPS tracking and geospatial analytics, companies can make data-driven decisions about equipment deployment, maintenance scheduling, and resource allocation across work sites.
Delivering real-time results from geospatial data is difficult because it requires complex processing. One common challenge is determining whether an asset is operating within a jobsite. Databricks lets us combine several geospatial capabilities in Delta Live Tables (DLT) to stream results from point-in-polygon lookups over thousands of sites. Using product APIs for H3 geospatial indexing as well as Spatial Temporal (ST) functions, currently in preview, we can implement the point-in-polygon geospatial “hybrid” join pattern to map equipment locations to specific operational sites accurately and at scale. Once an equipment or fleet manager has a view of each asset’s location, they can calculate statistics or build reports that help them schedule maintenance efficiently, reduce transit time and downtime, or dispatch equipment to under-resourced locations.
H3 is an open-source geospatial indexing system that divides the Earth into hexagonal cells, each with a unique identifier. Its precision and high scalability make it ideal for geospatial data analysis.
Before we take a look at an example DLT pipeline, let’s visualize our equipment locations and operational site boundaries. Points represent our equipment, polygons represent jobsites, and circles represent maintenance sites.
This DLT pipeline creates an hourly streaming calculation that shows the percentage of total assets deployed to a jobsite, maintenance site, or in transit between sites. This will allow us to monitor the overall utilization of our equipment fleet.
Our first streaming table groups GPS tracking data into hourly windows and selects the last known latitude and longitude position for each piece of equipment.
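The windowing logic described above can be illustrated outside of DLT. The following is a minimal pure-Python sketch, not the pipeline's actual code: the `Ping` record layout, the field names, and the sample data are all assumptions made for illustration.

```python
from collections import namedtuple
from datetime import datetime

# Hypothetical record layout for a GPS ping; field names are illustrative.
Ping = namedtuple("Ping", ["asset_id", "ts", "lat", "lon"])

def last_position_per_hour(pings):
    """Return {(asset_id, window_start): Ping}, keeping the latest ping per hourly window."""
    latest = {}
    for p in pings:
        # Truncate the timestamp to the top of the hour to get the window start.
        window_start = p.ts.replace(minute=0, second=0, microsecond=0)
        key = (p.asset_id, window_start)
        # Keep only the most recent ping seen so far for this asset and window.
        if key not in latest or p.ts > latest[key].ts:
            latest[key] = p
    return latest

# Made-up sample data: two pings in the 09:00 window, one in the 10:00 window.
pings = [
    Ping("excavator-1", datetime(2024, 1, 1, 9, 5), 40.0, -105.0),
    Ping("excavator-1", datetime(2024, 1, 1, 9, 50), 40.1, -105.1),
    Ping("excavator-1", datetime(2024, 1, 1, 10, 15), 40.2, -105.2),
]
hourly = last_position_per_hour(pings)
```

In a streaming pipeline the same idea would be expressed as a windowed aggregation over the telemetry stream; the dictionary here only stands in for that grouped state.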
Now that we have the last location of each asset per hour, we can implement the point-in-polygon join pattern using H3 geospatial indexing to map our assets onto operational sites. In addition, we are using a set of ST functions also provided by Databricks.
The first step is to assign H3 indices to both the GPS coordinates of assets and the polygon boundaries representing operational sites.
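To make the indexing step concrete, here is a simplified pure-Python sketch. It does not use H3: a square latitude/longitude grid stands in for H3 cells, `CELLS_PER_DEGREE` stands in for the H3 resolution, and ray casting stands in for a geometry library. The core flag is computed from cell corners, which is only valid for convex site polygons, so treat this as an illustration of the idea rather than the pipeline's implementation.

```python
import math

CELLS_PER_DEGREE = 100  # illustrative grid granularity, standing in for an H3 resolution

def cell_of(lat, lon):
    """Square-grid cell ID for a point (a stand-in for an H3 cell ID)."""
    return (math.floor(lat * CELLS_PER_DEGREE), math.floor(lon * CELLS_PER_DEGREE))

def point_in_polygon(lat, lon, polygon):
    """Ray-casting test; polygon is a list of (lat, lon) vertices."""
    inside = False
    n = len(polygon)
    for i in range(n):
        (y1, x1), (y2, x2) = polygon[i], polygon[(i + 1) % n]
        if (y1 > lat) != (y2 > lat):  # edge straddles the point's latitude
            x_cross = x1 + (lat - y1) * (x2 - x1) / (y2 - y1)
            if lon < x_cross:
                inside = not inside
    return inside

def cover_site(polygon):
    """Map every grid cell overlapping the polygon's bounding box to a core flag.

    core=True means all four cell corners are inside the polygon, which implies
    the whole cell is inside only for convex polygons (assumed here). Cells that
    merely fall in the bounding box are flagged core=False and need an exact check.
    """
    lats = [v[0] for v in polygon]
    lons = [v[1] for v in polygon]
    row_min, col_min = cell_of(min(lats), min(lons))
    row_max, col_max = cell_of(max(lats), max(lons))
    cover = {}
    for row in range(row_min, row_max + 1):
        for col in range(col_min, col_max + 1):
            corners = [((row + dr) / CELLS_PER_DEGREE, (col + dc) / CELLS_PER_DEGREE)
                       for dr in (0, 1) for dc in (0, 1)]
            cover[(row, col)] = all(point_in_polygon(la, lo, polygon) for la, lo in corners)
    return cover

# Made-up rectangular jobsite and one asset position.
site = [(40.005, -105.035), (40.005, -105.005), (40.035, -105.005), (40.035, -105.035)]
site_cover = cover_site(site)
asset_cell = cell_of(40.0123, -105.0234)
```

Both sides now share the same key space: each asset has one cell ID, and each site has a set of cell IDs flagged as core or boundary.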
The next step is to join the assets and sites on their H3 cell ID.
Joining on H3 cell ID removes the need to run a compute-intensive geospatial operation on every record.
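The benefit of joining on a cell ID is that the join becomes a plain equality lookup. A minimal sketch of that idea in pure Python follows; the cell IDs, site names, and asset names are made up, and the dictionary-based lookup only illustrates what an equi-join on the cell ID column achieves.

```python
from collections import defaultdict

# Hypothetical precomputed inputs; cell IDs are opaque keys here.
site_cells = [
    # (site_id, cell_id, core)
    ("jobsite-A", 101, True),
    ("jobsite-A", 102, False),
    ("maintenance-B", 205, False),
]
asset_cells = [
    # (asset_id, cell_id)
    ("excavator-1", 101),
    ("loader-2", 102),
    ("crane-3", 999),  # no site covers this cell
]

def join_on_cell(asset_cells, site_cells):
    """Equi-join assets to sites on cell ID; no geometry is evaluated here."""
    index = defaultdict(list)
    for site_id, cell_id, core in site_cells:
        index[cell_id].append((site_id, core))
    candidates = []
    for asset_id, cell_id in asset_cells:
        for site_id, core in index.get(cell_id, []):
            candidates.append((asset_id, site_id, core))
    return candidates

candidates = join_on_cell(asset_cells, site_cells)
```

Assets whose cell matches no site cell drop out immediately, and the remaining candidates carry the core flag that decides whether any further geometric work is needed.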
Cells categorized as boundary (core = false) require a precise geometric check because the H3 cell is not completely within the site geometry. We can perform the point-in-polygon check using st_contains. This ensures that only points truly inside the site boundary are included in the join results, eliminating false positives caused by the coarseness of the chosen H3 resolution.
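The refinement step can be sketched as follows. This is an illustrative pure-Python version in which a ray-casting function stands in for st_contains; the function names, the site polygon, and the sample points are assumptions, not the pipeline's code.

```python
def point_in_polygon(lat, lon, polygon):
    """Ray-casting test, used here as a stand-in for st_contains."""
    inside = False
    n = len(polygon)
    for i in range(n):
        (y1, x1), (y2, x2) = polygon[i], polygon[(i + 1) % n]
        if (y1 > lat) != (y2 > lat):  # edge straddles the point's latitude
            x_cross = x1 + (lat - y1) * (x2 - x1) / (y2 - y1)
            if lon < x_cross:
                inside = not inside
    return inside

def is_at_site(lat, lon, core, polygon):
    """Accept core-cell matches outright; run the exact check only for boundary cells."""
    return core or point_in_polygon(lat, lon, polygon)

# Made-up rectangular jobsite.
site = [(40.005, -105.035), (40.005, -105.005), (40.035, -105.005), (40.035, -105.035)]

# Two assets that landed in a boundary cell (core=False) of the same site.
just_inside = is_at_site(40.007, -105.02, False, site)
just_outside = is_at_site(40.002, -105.02, False, site)
# An asset in a core cell is accepted without touching the geometry.
in_core = is_at_site(40.02, -105.02, True, site)
```

Because `core or ...` short-circuits, the exact test runs only for the boundary-cell candidates, which is what keeps the hybrid pattern cheap.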
Finally, for the last streaming table in our DLT pipeline, we calculate the distribution of assets across different site types. We use a select expression to count the total number of assets per window and the assets at each site_type, and then compute each site type's percentage of the total assets reporting telemetry in that hourly window.
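As a rough illustration of this aggregation, the sketch below computes the per-window percentages in pure Python. It assumes one row per asset per window and uses a hypothetical "in_transit" label for assets that matched no site; the row layout and sample values are made up.

```python
from collections import Counter

def site_type_distribution(rows):
    """rows: iterable of (window_start, asset_id, site_type), one row per asset per window.

    Returns {window_start: {site_type: percent_of_assets_in_window}}.
    """
    per_window = {}
    for window, _asset_id, site_type in rows:
        per_window.setdefault(window, Counter())[site_type] += 1
    result = {}
    for window, counts in per_window.items():
        total = sum(counts.values())  # assets reporting telemetry in this window
        result[window] = {t: 100.0 * n / total for t, n in counts.items()}
    return result

# Made-up sample: four assets reporting in the same hourly window.
rows = [
    ("09:00", "excavator-1", "jobsite"),
    ("09:00", "loader-2", "jobsite"),
    ("09:00", "crane-3", "maintenance"),
    ("09:00", "dozer-4", "in_transit"),
]
distribution = site_type_distribution(rows)
```

The denominator is the number of assets that reported in the window, so the percentages in each window always sum to 100.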
By combining Delta Live Tables with H3 geospatial indexing, Spatial Temporal functions, and the point-in-polygon “hybrid” join pattern, we can efficiently map equipment locations to operational sites and calculate fleet distribution metrics. This approach simplifies spatial operations while maintaining accuracy, making it ideal for real-time geospatial analytics at scale in industries like construction.
Check out our upcoming blogs in this series, which cover real-time monitoring of landmark entries and exits with stateful streaming, and a “geospatial agent” that integrates geospatial intelligence into the Mosaic AI Agent Framework for real-time delivery tracking.
To learn more about the origins of Geospatial Analytics with H3 on Databricks, check out Spatial Analytics at Any Scale With H3 and Photon. And stay tuned for advancements around Databricks support for ST functions as well as geometry and geography types.