Request for comment (RFC)

OGC Requests Public Comment on JSON and XML Encodings for Training Data Markup Language for Artificial Intelligence Standard

The OGC TrainingDML-AI Standard standardizes the training data used to train, validate, and test Machine Learning models that involve location or time.

Request Closed: April 23, 2024 9:00 am — May 23, 2024 11:59 pm

The Open Geospatial Consortium (OGC) seeks public comment on candidate Parts 2 and 3 of the OGC Training Data Markup Language for Artificial Intelligence (TrainingDML-AI) Standard. Comments are due by May 23, 2024.

The OGC Training Data Markup Language for Artificial Intelligence (TrainingDML-AI) Part 2: JSON Encoding Standard defines requirements for encoding AI training datasets as JSON, while Part 3: XML Encoding Standard defines requirements for encoding AI training datasets as XML.

Training data plays a fundamental role in Earth Observation (EO) and general spatial Artificial Intelligence/Machine Learning (AI/ML) applications, especially Deep Learning (DL). It is used to train, validate, and test AI/ML models. Understanding the source and applicability of training data allows for a better understanding of the results of AI/ML operations.

To maximize the interoperability and re-usability of geospatial training data, the TrainingDML-AI Standard defines a model and encodings consistent with the OGC Standards baseline to exchange and retrieve the training data. 

The TrainingDML-AI Standard provides detailed metadata for formalizing the information model of training data. This includes but is not limited to the following aspects: 

  • How the training data is prepared, such as provenance and quality;
  • How to specify different metadata used for different ML tasks;
  • How to differentiate the high-level training data information model and extended information models specific to various ML applications;
  • How to describe the version, license, and training data size;
  • How to introduce external classification schemes and flexible means for representing ground-truth labeling.
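As a rough illustration of the kinds of metadata listed above, the following Python sketch builds a small training dataset record and round-trips it through JSON. Every property name here (`name`, `version`, `license`, `size`, `provenance`, `quality`, `task`, `classes`, `samples`, `purpose`), the file names, and the `missing_fields` helper are hypothetical and chosen only for readability. The normative property names and structure are defined in the candidate Part 2: JSON Encoding Standard, not here.

```python
import json

# Hypothetical field names for illustration only; the normative property
# names are defined in the candidate Part 2: JSON Encoding Standard.
training_dataset = {
    "name": "Example building-footprint dataset",
    "version": "1.0",
    "license": "CC-BY-4.0",
    "size": 2,  # number of training samples in this record
    "provenance": "Labels digitized manually from aerial imagery",
    "quality": "Reviewed by a second annotator",
    "task": "semantic segmentation",
    # A class list like this could instead point to an external
    # classification scheme.
    "classes": ["building", "background"],
    "samples": [
        {"image": "tile_001.tif", "label": "tile_001_mask.tif", "purpose": "train"},
        {"image": "tile_002.tif", "label": "tile_002_mask.tif", "purpose": "validate"},
    ],
}


def missing_fields(record, required=("name", "version", "license", "size")):
    """Return the required keys that are absent from the record."""
    return [key for key in required if key not in record]


# Serialize to JSON text and parse it back, as an exchange between two
# systems would.
encoded = json.dumps(training_dataset, indent=2)
decoded = json.loads(encoded)
```

Calling `missing_fields(decoded)` on the round-tripped record returns an empty list, while a record lacking, for example, `license` would have that key reported. A real consumer would validate against the schema published with the Standard instead of an ad hoc key list.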

TrainingDML-AI Parts 2 & 3 are based on the OGC Training Data Markup Language for Artificial Intelligence (TrainingDML-AI) Part 1: Conceptual Model Standard.

OGC Members interested in staying up to date on the progress of this standard, or contributing to its development, are encouraged to join the Training Data Markup Language for AI Standards Working Group via the OGC Portal. Non-OGC members who would like to know more about participating in this SWG are encouraged to contact the OGC Standards Program.

The candidate OGC Training Data Markup Language for Artificial Intelligence (TrainingDML-AI) Part 2: JSON Encoding Standard (OGC 24-006) (.DOCX) and Part 3: XML Encoding Standard (OGC 24-007) (.DOCX) are available for review and comment on the OGC Portal. Comments are due by May 23, 2024, and should be submitted via the method below.

To Comment: 
Comments can be submitted to a dedicated email reflector for a period ending on the “Close request date” listed above. Comments received will be consolidated and reviewed by OGC members for incorporation into the document. Please submit your comments via this email address, using this Comments Template for the message body.

Subscribe to Comments:
You may wish to subscribe to the distribution list to receive comments as they are submitted. Subscribing to the list will also allow you to view comments already received, which can be found in the List Archives.