I thought it would be valuable to share an update on cloud-native geospatial activities in OGC, especially in light of our recent, very successful Cloud-Native Outreach Event. This blog follows up on the vision shared by OGC’s CEO, Dr. Nadine Alameh, in April 2022 and on two posts by OGC’s Visiting Fellow, Chris Holmes: Towards a Cloud-Native OGC and Towards a Cloud-Native Geospatial Standards Baseline.
For many years, OGC has been working on numerous aspects of the entire ecosystem of location data in cloud environments. Starting with Testbed 10 in 2013, OGC has been publishing engineering guidance on cloud topics, such as the Testbed 10 Performance of OGC Services in the Cloud: The WMS, WMTS, and WPS cases. From those earliest efforts, OGC members have recognized that our approach to enabling cloud-native geospatial capabilities must be inclusive of this whole ecosystem: formats, services, architectures, and operations. I summarized this perspective at the Outreach Event discussing Advances in OGC Cloud-Native Geospatial Activities and will further elaborate in this blog post.
The cloud ecosystem is more than just the platform on which the data lives and is operated upon. It also includes: the algorithms to process information; interfaces for both humans and machines; formats to store and retrieve information; the security regime for content and access; business operations and revenue models to sustain the environments; regulatory oversight, which may impact what enters or leaves the cloud; and much, much more. “Ecosystem” is truly the correct term, as you can imagine an almost one-for-one analogue between the cloud and a natural ecosystem.
Building an Ecosystem
The remainder of this blog digs into the elements of the ecosystem that OGC is addressing: interfaces, applications, encodings, and operations.
To start, we really cannot talk about geospatial in the cloud without also talking about the web: it is through web resources that so many users interact with cloud-hosted data and functions. OGC and the World Wide Web Consortium (W3C) collaborated in 2017 to publish the Spatial Data on the Web Best Practices as a means to illustrate how to make geospatial information more web-native. Web-native makes cloud-native more approachable. It is not enough to store data in the cloud in formats that improve access and analysis performance: we also need to develop APIs to discover, process, and extract information from the cloud and guide users to be able to work across cloud instances hosted by multiple providers. The impact of web-centric Standards modernization in OGC on enabling the cloud ecosystem cannot be overstated.
These APIs include OGC API – Features, which is foundational to accessing feature (vector) data and also underpins the STAC API specification, used for rapid discovery of remote sensing and other data. Extending the catalog paradigm, OGC API – Records allows discovery of and access to all types of geospatial data, down to the level of individual records. The architecture of these APIs allows developers to implement “just enough geo” to get to the data they need without becoming geospatial experts.
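To give a feel for “just enough geo,” here is a minimal sketch of how a developer might build an OGC API – Features request for the items in a collection, filtered by bounding box and capped with a limit. The server address (https://example.org/ogcapi), the collection name (buildings), and the helper function name are assumptions made up for illustration; only the /collections/{collectionId}/items path and the bbox and limit query parameters come from the Standard. The sketch only builds the URL and does not contact any server.

```python
from urllib.parse import urlencode


def features_items_url(base_url, collection_id, bbox=None, limit=None):
    """Build an OGC API - Features 'items' URL (hypothetical helper)."""
    params = {}
    if bbox is not None:
        # bbox order: min lon, min lat, max lon, max lat (CRS84 unless stated)
        params["bbox"] = ",".join(str(v) for v in bbox)
    if limit is not None:
        params["limit"] = str(limit)
    url = f"{base_url.rstrip('/')}/collections/{collection_id}/items"
    # Keep commas readable in the query string instead of percent-encoding them
    return f"{url}?{urlencode(params, safe=',')}" if params else url


# Hypothetical server and collection, for illustration only
example = features_items_url(
    "https://example.org/ogcapi",
    "buildings",
    bbox=(-77.1, 38.8, -76.9, 39.0),
    limit=10,
)
print(example)
```

A client would then issue an ordinary HTTP GET on that URL and, on most implementations, receive a GeoJSON feature collection back.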
Many people identify the key use case for cloud-native capabilities to be the handling of massive data cubes, be those stacks of imagery or multidimensional scientific data sets. But just because you can store all of your data on the cloud does not mean that you want to use all of the data all of the time. OGC API – Environmental Data Retrieval (EDR) allows for complex subsetting of data cubes to return (or point to) just what is needed.
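As a rough illustration of that kind of subsetting, the sketch below builds an EDR “position” query that asks for a couple of parameters at a single point, optionally within a time interval. The server address, collection name (weather), parameter names, and helper function are hypothetical; the position query path and the coords, parameter-name, and datetime query parameters are taken from the EDR Standard. Again, the code only constructs the URL.

```python
from urllib.parse import quote, urlencode


def edr_position_url(base_url, collection_id, lon, lat, parameter_names,
                     datetime=None):
    """Build an OGC API - EDR position query URL (hypothetical helper)."""
    params = {
        # EDR expects the location as Well-Known Text
        "coords": f"POINT({lon} {lat})",
        "parameter-name": ",".join(parameter_names),
    }
    if datetime is not None:
        # A single instant or an interval written as start/end
        params["datetime"] = datetime
    # Encode spaces as %20 and leave WKT and interval punctuation readable
    query = urlencode(params, quote_via=quote, safe=",()/:")
    return f"{base_url.rstrip('/')}/collections/{collection_id}/position?{query}"


# Hypothetical server, collection, and parameter names
example = edr_position_url(
    "https://example.org/edr",
    "weather",
    lon=-77.0,
    lat=38.9,
    parameter_names=["temperature", "wind_speed"],
    datetime="2023-01-01T00:00:00Z/2023-01-02T00:00:00Z",
)
print(example)
```

The point of the design is that the client names only the location, time, and variables it cares about, and the server does the work of cutting that slice out of the cube.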
Do you need to fuse Internet of Things sensors with your massive content holdings? Leverage the OGC SensorThings API Standards. Consider that the combination of disparate data sources and dynamic sensors typically needs some degree of processing to extract useful information, so implement OGC API – Processes to work between and within multiple data sets and feeds.
Processing comes in many models, but highly important these days is the use of Artificial Intelligence to distill vast quantities of data into useful information. OGC’s Artificial Intelligence in Geoinformatics (GeoAI) Domain Working Group is tackling some of the use cases and identifying targets for interoperability and even Standardization for information flow and quality. For example, the characterization of training and validation data used in GeoAI is now being standardized in the Training Data Markup Language for AI Standards Working Group. As part of this ecosystem, highly automated data processing and analysis gains extraordinary benefits from cloud-native geospatial data.
The formats are also critically important. I referenced a couple of blogs from Chris Holmes at the top of this post where there are excellent descriptions of several cloud-native encodings in wide (or soon to be wide) use. Understand that it is not just the structure of these encodings that makes them “cloud-native,” but also the means by which the data are accessed (usually web-native, i.e., HTTP). Thus, many OGC-Standard encodings, such as GeoPackage, can be cloud-native. Below, I highlight several formats that are currently maturing in OGC.
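One common way this web-native access plays out is the HTTP range request: rather than downloading a whole file, a client asks for just the bytes it needs. The sketch below is only an illustration of that idea, not a description of any particular OGC encoding; the file URL, the 16 KiB read size, and the helper name are assumptions. It prepares a request object but does not send it.

```python
from urllib.request import Request


def range_header(offset, length):
    """Return an HTTP Range header for `length` bytes starting at `offset`."""
    if offset < 0 or length <= 0:
        raise ValueError("offset must be >= 0 and length must be > 0")
    # Byte ranges are inclusive at both ends, hence the -1
    return {"Range": f"bytes={offset}-{offset + length - 1}"}


# Hypothetical cloud-hosted file; ask only for its first 16 KiB
request = Request(
    "https://example.org/data/scene.tif",
    headers=range_header(0, 16384),
)
print(request.get_header("Range"))
```

A server that supports range requests answers with status 206 (Partial Content) and only the requested bytes, which is what lets a client read a header or a single tile out of a very large object.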
OGC standardized GeoTIFF in 2019 and has since been working to standardize Cloud Optimized GeoTIFF (COG) for management of raster data. Starting with the COG library, OGC has been working to document the format as a formal Standard and is nearing completion of this work. A draft specification is available as the OGC Testbed-17: Cloud Optimized GeoTIFF specification Engineering Report; the Standard won’t be too far behind.
More complex multidimensional data has proven to be efficiently encoded in the cloud using Zarr. Zarr is also in the final voting for endorsement as an OGC Community Standard. OGC’s most recently completed Testbed evaluated the suitability of Zarr for handling geospatial data cubes in the OGC Testbed 17: COG/Zarr Evaluation Engineering Report and Zarr did just fine… as did COG.
Feature (vector) data is already handled on the cloud in all types of databases that rely upon OGC Simple Features, OGC’s most widely implemented Standard, to encode the geometry. But is this management really cloud-native, particularly with respect to streaming the data to users? Other encodings are being considered. GeoParquet is currently being incubated in OGC as a prospective cloud-native vector format. Other formats, such as FlatGeobuf, are also being considered as potential Community Standards, to join existing Standards such as Indexed 3D Scene Layers and 3D Tiles, both of which offer cloud-native capabilities, particularly with delivery of data.
Putting It Together in the Real World
You have read this far and seen a whole bunch of references to individual Standards and specifications that address specific parts of the cloud-native geospatial ecosystem. Putting it all together requires practical application of these technologies, Standards, and specs in concert. Operation of the cloud ecosystem requires coordination of many disciplines and sometimes new architecture designs (such as microservices and highly composable systems) relative to our past use of monolithic systems. This is where the other half of the OGC is so critical. The OGC Innovation Program operates numerous initiatives each year to experiment with or pilot the capabilities listed above against real-world scenarios and deliver documentation and examples that can be re-used for many use cases.
A search of “cloud” in the Engineering Report repository returns references to 20 documents, each describing practical application of the capabilities highlighted above and more. These documents can be put in the context of the cloud-native ecosystem as illustrated below.
As you can see, the Innovation Program initiatives have touched upon many aspects of the cloud ecosystem, even if only peripherally related to location technology. These Engineering Reports reference even more work of relevance and identify specific practices that are portable across many use cases. I also recommend the recent OGC Best Practice for Earth Observation Application Package, which details packaging and deployment of Earth Observation Exploitation Platforms, generally to a cloud environment.
Development and Maturation
In summary, I’ve touched upon a lot of Standards and resources and there are many more in the OGC and through our partner organizations. Each, literally EACH, of these efforts requires considerable investment in time and resources. The dedication of OGC Members to advance this work is becoming increasingly represented in the cloud ecosystem. The fact that so many major cloud service providers (e.g., AWS, Google, Microsoft, Oracle) are OGC Members highlights the relevance of OGC’s efforts in this domain.
The Standards are being matured and we have expert guidance on deployment and management of the capabilities. Expect to see dedicated developer and implementer resources from the OGC to foster consistent use of geospatial content in cloud ecosystems. We will continue to research best practices, publish guidance, and identify capabilities offered by our members to sustain the entire location industry.