FAQ on what information modeling languages will OSIM use #14

Open
sparrell opened this issue Jun 6, 2024 · 9 comments

Comments

@sparrell
Contributor

sparrell commented Jun 6, 2024

FAQ on what information modeling languages will OSIM use.

I propose we allow UML, ASN.1, and JADN, and potentially any other standard information modelling language TC Members propose.
I propose we not "pick winners" but work in whatever TC Members want to contribute.

@aj-stein-nist

Hi, first time caller, long-time listener. I am currently also an observer on the TC. If we are considering ASN.1 a modeling language in scope due to the related technologies in the charter, you may want to include CBOR and CDDL given their heavy usage in SCITT and other IETF technologies that are normative requirements in SCITT.

@davaya
Contributor

davaya commented Jun 7, 2024

I whipped up a "modeling taxonomy" slide as a starting point for discussion at https://docs.google.com/presentation/d/1aVvlIHd8POlX6yVdo-X-60Y_Z0R089sAOyMrIWwTyMI/. It has four layers:

  1. Ontologies / Knowledge Graphs
  2. Information Models including Abstract Schemas
  3. Data Models / Concrete Schemas
  4. Data Values

I agree that CBOR (Concise Binary Object Representation) data format and its CDDL (Concise Data Definition Language) data model are an essential part of the use cases to be considered by this TC because unlike XML and JSON, it is a binary data format with conciseness prioritized above human readability of the data. (CBOR "annotated hex" notation looks like assembly language, with the transmitted bytes front and center and the human readable meaning of those bytes generated as annotations.) CycloneDX SBOMs can be serialized in another binary format, Google Protobuf, that is important for the same reason.

(The CBOR example in https://www.w3.org/TR/did-cbor-representation/#example-2-did-document-encoded-as-cbor-diagnostic-notation is mislabeled: "Diagnostic Notation" looks similar to JSON, while the example shown is actually in "Annotated Hex" format.)
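
For readers less familiar with CBOR, here is a minimal sketch of the size trade-off described above. It hand-encodes only a small subset of CBOR (unsigned integers, text strings, arrays, maps) so that it needs no third-party library; the helper names `_head` and `encode` are illustrative and not taken from any CBOR package. The sample document is the `{"a": 1, "b": [2, 3]}` example from the CBOR RFC's appendix.

```python
import json


def _head(major: int, n: int) -> bytes:
    """Return the CBOR initial byte(s) for a major type and its argument."""
    if n < 24:
        # Small arguments are packed into the initial byte itself.
        return bytes([(major << 5) | n])
    for info, size in ((24, 1), (25, 2), (26, 4), (27, 8)):
        if n < (1 << (8 * size)):
            return bytes([(major << 5) | info]) + n.to_bytes(size, "big")
    raise ValueError("argument does not fit in 8 bytes")


def encode(value) -> bytes:
    """Encode a subset of CBOR: unsigned ints, text strings, arrays, maps."""
    if isinstance(value, bool):
        raise TypeError("booleans are outside the scope of this sketch")
    if isinstance(value, int):
        if value < 0:
            raise ValueError("negative integers are outside this sketch")
        return _head(0, value)                      # major type 0
    if isinstance(value, str):
        raw = value.encode("utf-8")
        return _head(3, len(raw)) + raw             # major type 3
    if isinstance(value, list):
        return _head(4, len(value)) + b"".join(encode(v) for v in value)
    if isinstance(value, dict):
        body = b"".join(encode(k) + encode(v) for k, v in value.items())
        return _head(5, len(value)) + body          # major type 5
    raise TypeError(f"unsupported type: {type(value).__name__}")


doc = {"a": 1, "b": [2, 3]}
cbor_bytes = encode(doc)
json_bytes = json.dumps(doc, separators=(",", ":")).encode("utf-8")

print(cbor_bytes.hex(" "))   # a2 61 61 01 61 62 82 02 03
print(len(cbor_bytes), "CBOR bytes vs", len(json_bytes), "compact JSON bytes")
```

The hex dump is what an "annotated hex" listing would comment byte by byte, while diagnostic notation would render the same value in a JSON-like form.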

The taxonomy shows JADN as an information model language that can in principle both generate CDDL schemas and directly validate CBOR data. Metaschema and ASN.1 also fall into the information modeling layer.

@aj-stein-nist

> I whipped up a "modeling taxonomy" slide as a starting point for discussion at https://docs.google.com/presentation/d/1aVvlIHd8POlX6yVdo-X-60Y_Z0R089sAOyMrIWwTyMI/.

In this slide, are the ontology technologies, and forming the ontologies/taxonomies, in scope for the OSIM TC? Or is that where other work begins, picking up the lower-level inputs from the work that is in scope for this TC?

Thanks for the slide, that is helpful and constructive.

@davaya
Contributor

davaya commented Jun 8, 2024

The charter says "The OASIS Open Supplychain Information Modeling (OSIM) TC aims to standardize and promote information models about all aspects of supply chains."

Information modeling is a design approach for data and systems. So standardizing the data used in information modeling is the scope of the TC, but the evaluation criteria include the ease with which IM integrates with and enhances existing data and design approaches. There's no reason to pursue IM if it doesn't make other work easier, better, or both.

@aj-stein-nist

aj-stein-nist commented Jun 8, 2024

> Information modeling is a design approach for data and systems. So standardizing the data used in information modeling is the scope of the TC, but the evaluation criteria include the ease with which IM integrates with and enhances existing data and design approaches. There's no reason to pursue IM if it doesn't make other work easier, better, or both.

Makes sense. Just so I understand: evaluating integration with existing ontologies or taxonomies is in scope, but creating them is likely beyond the scope of information modeling?

Either way, the TC may want to evaluate the Cyber Domain Ontology as an integration point. I have reviewed it but have not used or directly contributed to it yet. I have not seen many other RDF ontologies for cyber information with its breadth and age. (Full disclosure: one of my colleagues is a maintainer.)

@davaya
Contributor

davaya commented Jun 11, 2024

Yes. An information model defines data (documents, messages, datatypes) used as resources (subjects and objects) in an ontology. The ontology defines relationships (predicates) between subjects and objects. In RDF terms, an IM defines a lexical-to-value mapping, except that the lexical space is not restricted to strings.

Consider resources like an IP packet or an image (if gif, jpg, and png hadn't already been invented). An IM would define RGBA pixels, pixel rows, and images consisting of metadata, palettes, and rows. An ontology would define the relationships between images and other resources.
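
As a rough illustration of that split, the sketch below models the image example in Python. Every name in it (`Pixel`, `Image`, the `ex:` identifiers and predicates) is hypothetical and chosen only for this example; it is not JADN, CDO, or any other real schema. The two `Pixel` constructors show one value reached from two lexical forms, one a string and one raw bytes, which is the "lexical space is not restricted to strings" point.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Pixel:
    """Information-model type: one RGBA pixel, independent of data format."""
    r: int
    g: int
    b: int
    a: int

    @classmethod
    def from_bytes(cls, raw: bytes) -> "Pixel":
        """Binary lexical form: exactly four bytes in R, G, B, A order."""
        if len(raw) != 4:
            raise ValueError("an RGBA pixel needs exactly 4 bytes")
        return cls(*raw)

    @classmethod
    def from_hex(cls, text: str) -> "Pixel":
        """Text lexical form: '#rrggbbaa'."""
        return cls.from_bytes(bytes.fromhex(text.lstrip("#")))


@dataclass(frozen=True)
class Image:
    """Information-model type: metadata, a palette, and rows of palette indexes."""
    metadata: Tuple[Tuple[str, str], ...]
    palette: Tuple[Pixel, ...]
    rows: Tuple[Tuple[int, ...], ...]


# Ontology side: relationships (predicates) between resources, as
# (subject, predicate, object) triples. The identifiers are made up.
Triple = Tuple[str, str, str]
triples: Tuple[Triple, ...] = (
    ("ex:logo.img", "ex:depicts", "ex:AcmeCorp"),
    ("ex:logo.img", "ex:embeddedIn", "ex:product-datasheet"),
)

red = Pixel.from_hex("#ff000080")
same_red = Pixel.from_bytes(b"\xff\x00\x00\x80")
logo = Image(metadata=(("title", "logo"),), palette=(red,), rows=((0, 0), (0, 0)))
print(red == same_red)  # True: two lexical forms, one value
```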

@aj-stein-nist

> Yes. An information model defines data (documents, messages, datatypes) used as resources (subjects and objects) in an ontology. The ontology defines relationships (predicates) between subjects and objects. In RDF terms, an IM defines a lexical-to-value mapping, except that the lexical space is not restricted to strings.

OK, so this comment helps. How one or more information models, and perhaps the derived ontology, fit into the work is unclear in the charter that I believe has been voted on and approved. As an observer (considering an upgrade to member), it would help to understand whether I should dedicate some or significant effort if our goals align. (I only point this out because I am not asking questions just to be curious or difficult, if I consider that the spectrum. I am interested in rolling up sleeves.)

> Consider resources like an IP packet or an image (if gif, jpg, and png hadn't already been invented). An IM would define RGBA pixels, pixel rows, and images consisting of metadata, palettes, and rows. An ontology would define the relationships between images and other resources.

This example is helpful, thank you. Again, if this example is representative of a small or significant part of the TC's scope in future work, I would reiterate that you should look into CDO.

And I understand these comments are going beyond relevance to just this FAQ issue. If I should discuss these questions, and some others I have, somewhere else, let me know how I can pose them if not in GitHub issues. I know the charter isn't published here (yet?), so I want to meet the group where they are.

@davaya
Contributor

davaya commented Jun 15, 2024

SHACL constraints define data structure whether in the context of data schemas (XSD), information models, or ontologies, so the structures defined in CDO certainly have analogs as information types. I'm speaking from the IM technology perspective, but I assume that CDO is also relevant and in-scope from the Supplychain use case perspective.

IMs are data format agnostic, so some of the peculiarities of XML are abstracted away:

  • every type is a datatype, there's no distinction between literals and objects
  • there is a distinction between single values and arrays, so when maxCount=1, instances are not arrays containing one value; they are just values.
  • an IM doesn't have or need property definitions - a structure uses property names for types, and different structures could use a given type with the same or different property names.

But here are RDF triples for an example CDO type "Action": the fact that they define a set of properties means there is an analogous Action type in an IM.
[image: RDF triples for the CDO "Action" type]
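
To make the three bullet points about format-agnostic IMs concrete, here is a small sketch under stated assumptions: `Field`, `field_value`, and the `Packet`/`Host` structures are hypothetical names invented for this illustration and are not JADN or SHACL syntax. It shows one type reused under different property names, and a field with a maximum count of 1 yielding a plain value rather than a one-element array.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Field:
    name: str                      # property name, local to one structure
    type_name: str                 # type defined once, reusable anywhere
    max_count: Optional[int] = 1   # None means unbounded


def field_value(field: Field, instances: List[object]) -> object:
    """Shape a field's instances according to its cardinality."""
    if field.max_count is not None and len(instances) > field.max_count:
        raise ValueError(f"{field.name}: more than {field.max_count} instance(s)")
    if field.max_count == 1:
        # A single-valued field is just a value, never a one-element array.
        return instances[0] if instances else None
    return list(instances)


# No standalone property definitions: each structure names its own fields,
# and the same type appears under different property names.
STRUCTURES: Dict[str, List[Field]] = {
    "Packet": [
        Field("source", "IPv4-Addr"),
        Field("destination", "IPv4-Addr"),
    ],
    "Host": [
        Field("addresses", "IPv4-Addr", max_count=None),
    ],
}

source = STRUCTURES["Packet"][0]
addresses = STRUCTURES["Host"][0]
print(field_value(source, ["192.0.2.1"]))      # 192.0.2.1
print(field_value(addresses, ["192.0.2.1"]))   # ['192.0.2.1']
```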

@aj-stein-nist

Thanks for the additional context. So with that in mind, the TC intends to take existing IMs and potentially build a superset IM for OSIM over them?

I ask because of how it pertains to the scope and charter and how that is expressed in the FAQ. I'm not saying I think CDO should or must be used, but your replies here have told me more about the scope than the other documents and information in this repo.


3 participants