Unifying Raster and Vector data with Mosaic

Rasters are a compelling set of data formats. Data assets describing climate data, weather data, satellite data, flood regions, soil data, harvest data, and many more are often supplied as raster data. The trouble is that most data scientists are not trained to handle raster formats. In most cases, barriers such as obtaining the requisite GIS expertise and the complexity of advanced frameworks such as GDAL create a “paywall” that hides available insights from those seeking to work with raster data. However, the value inside these assets can help us address use cases such as climate risk, flood risk, ESG, and alternative energy site planning; the list goes on and on. How can we bring these data assets closer to the data scientist and analyst communities and decouple them from GIS expertise?

GDAL lets us read a band of raster data from nothing more than a filename. It is important to note that we aren’t explicitly providing the format; we expect GDAL to infer it from the file extension. This can be a massive enabler that simplifies and unifies the code needed to produce a piece of analysis.
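As a minimal sketch of reading a band by filename, assuming the GDAL Python bindings (`osgeo`) are installed: the helper name `read_band` and the file name in the usage note are hypothetical and chosen only for illustration.

```python
def read_band(path, band_index=1):
    """Read one band of a raster file into a NumPy array (illustrative helper)."""
    # Imported lazily so this module still loads where GDAL is not installed.
    from osgeo import gdal

    gdal.UseExceptions()  # raise Python exceptions instead of returning None on failure
    dataset = gdal.Open(path)  # no format argument is passed; GDAL selects the driver
    band = dataset.GetRasterBand(band_index)  # GDAL band indices start at 1
    return band.ReadAsArray()
```

Calling `read_band("example.tif")` would return the first band as a two-dimensional array; the same call should work unchanged for any other format that the installed GDAL build supports.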

Finally, GDAL is compiled using system native code and provides a very performant framework. Unfortunately, this comes with a cost when considering specific applications of this powerful tool. Being a platform-specific framework means there is a higher risk of encountering challenges when installing GDAL. Furthermore, using the Java bindings for GDAL comes with additional complexities. It is not an overstatement to say that GDAL expects a certain level of skill from its users. So how can we leverage this framework to democratise geospatial data?

The benefit of this approach is that the packaged scripts are version-controlled and unit tested for the version of Mosaic. This way, we ensure consistency between installed GDAL as a dependency and the exposed functionality.

All the APIs provided by Mosaic are available in Python, SQL, R and Scala/Java; please refer to the documentation page for examples of function usage.

We have followed the same philosophy for raster data as we did for vector data. Through this shared approach, we can ensure both vector and raster data are represented in a unified way that makes it easy to combine data assets.

Tessellation of a polygon in H3(8)

Now that both vector and raster data are represented in the same domain, we can easily combine data assets between vector and raster domains and produce insights. Combining data is as easy as doing a simple SQL join based on grid index cell ids.
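The idea can be sketched in plain Python. This is not Mosaic's API: it assumes a toy square grid in place of a real grid index system such as H3, and the function names and sample values are made up for illustration. Raster pixels and vector points are each mapped to cell ids, and the two sides are then joined on that id.

```python
from collections import defaultdict

CELL_SIZE = 1.0  # toy square grid; an assumption, not the H3 grid used in the post


def cell_id(x, y, size=CELL_SIZE):
    """Map a coordinate to the id of the toy square cell that contains it."""
    return (int(x // size), int(y // size))


def index_raster(pixels):
    """Average pixel values per cell. `pixels` is an iterable of (x, y, value)."""
    sums = defaultdict(lambda: [0.0, 0])
    for x, y, value in pixels:
        acc = sums[cell_id(x, y)]
        acc[0] += value
        acc[1] += 1
    return {cell: total / count for cell, (total, count) in sums.items()}


def index_points(points):
    """Assign each named point to a cell. `points` is an iterable of (name, x, y)."""
    return [(name, cell_id(x, y)) for name, x, y in points]


def join_on_cell(indexed_points, raster_by_cell):
    """Inner join on cell id, analogous to a SQL join on a cell id column."""
    return {
        name: raster_by_cell[cell]
        for name, cell in indexed_points
        if cell in raster_by_cell
    }


# Made-up sample data for illustration only.
rainfall_pixels = [(0.25, 0.25, 10.0), (0.75, 0.75, 20.0), (1.5, 0.5, 40.0)]
sites = [("site_a", 0.5, 0.5), ("site_b", 1.2, 0.9), ("site_c", 5.0, 5.0)]

joined = join_on_cell(index_points(sites), index_raster(rainfall_pixels))
print(joined)
```

Once both sides carry a cell id, the join itself needs no geometry operations at all, which is why the same pattern translates directly into an ordinary SQL equi-join.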

In future releases, Mosaic will focus on expanding both the vector and raster sets of APIs to simplify more use cases. In addition, we will bring custom grid index system support and user-defined functions (UDFs) for easy integration of custom shapely, rasterio and GDAL code. Finally, Mosaic will support automatic query optimization rules that can reorder your queries to use grid index systems whenever the operation can benefit from them.

This blog has covered the unification of raster and vector data under grid index systems. We have covered the pain points of handling frameworks like GDAL, as well as their value, and provided an approach for easy access to such valuable tools within the Databricks Lakehouse Platform. Finally, we have provided a summary of what is coming next in the Mosaic framework.
