Round-trip safe json-ld -> yaml-ld -> json-ld #8

Closed
ioggstream opened this issue May 27, 2022 · 8 comments · Fixed by #34
Labels
UCR Issue on Use Case/Recommendation

@ioggstream
Contributor

ioggstream commented May 27, 2022

As a <user with json-ld files> … WHO
I want to <convert them to yaml-ld> … WHAT
So that <they are round-trip safe> … WHY

Note

imho any other behavior hinders interoperability

@ioggstream ioggstream added the UCR Issue on Use Case/Recommendation label May 27, 2022
@nichtich

What do you mean by round-trip safety? Given a YAML-LD document Y1, the YAML-LD specification will define a transformation to a corresponding JSON-LD document J1. The transformation is unlikely to be bijective, so another document Y2 may exist that is transformed to J1 as well.

I suppose transformation from JSON-LD to YAML-LD is out of the scope of YAML-LD specification anyway, isn't it? J1 could be expressed in Y1, Y2... as you like as long as these transform to J1. There may also be JSON-LD features not supported by YAML-LD (to be discussed) because they are semantically irrelevant, should these be preserved as well?

As far as I understand round-trip safety, the only way to formally tackle it is to define canonical document forms.

@gkellogg
Member

If it's any different than JSON.parse(File.read("file.jsonld")).to_yaml or YAML.load_file("file.ymld").to_json, then we've probably overcomplicated it, aside from some potential keyword transformations and magic-key insertion.

Canonical forms require the use of a canonicalization algorithm, as is defined for JSON-LD in the spec. I'm not aware of a similar algorithm for YAML, but it could likely use the same logic.

I think the way to look at round-tripping is that parsing YAML-LD or JSON-LD documents to the internal representation should produce equivalent internal representations, which leaves out canonical serialized forms, document ordering (except as required), and keyword transformations.
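A minimal sketch of what "equivalent after parsing" can mean in practice, using only Python's standard `json` module. Python dict equality stands in here for the internal representation, which is an assumption made for illustration; the function name `same_parsed_value` and the sample documents are hypothetical.

```python
import json

def same_parsed_value(text_a: str, text_b: str) -> bool:
    """Compare two JSON texts by parsed value, ignoring key order and whitespace."""
    return json.loads(text_a) == json.loads(text_b)

# Two serializations of the same data: different key order and layout.
original = '{"@id": "http://example.org/a", "name": "A"}'
round_tripped = '{\n  "name": "A",\n  "@id": "http://example.org/a"\n}'

print(original == round_tripped)                   # False: the texts differ
print(same_parsed_value(original, round_tripped))  # True: the parsed values match
```

Array order is still significant in this comparison, which matches the "document ordering (except as required)" caveat only loosely.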

@ioggstream
Contributor Author

There may also be JSON-LD features not supported by YAML-LD (to be discussed) because they are semantically irrelevant, should these be preserved as well?

Not a YAML expert here, but since YAML data types are wider than JSON ones, and the YAML representation graph is a directed graph, potentially with cycles, while JSON is just a tree, I fail to identify a JSON-LD feature that is not supported in YAML-LD.

I think we should anyway state that yaml-ld MUST extend json-ld features.

Agree with @gkellogg :

  1. yaml-ld must support yaml.dump(json.loads(json_text))

  2. If the YAML representation graph is acyclic, JSON.dump(YAML.load(yaml_string)) MUST be valid JSON-LD, equivalent to the original document modulo a well-defined relation.
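The acyclic condition in point 2 can be illustrated without a YAML parser. The sketch below builds by hand the kind of self-containing mapping that a loader might produce for a self-referencing alias such as `&a { self: *a }` (an assumed input, not taken from this thread) and shows that Python's standard `json` module refuses to serialize it. The helper name `to_json_or_none` is hypothetical.

```python
import json

# Hand-built stand-in for a cyclic YAML representation graph:
# the mapping contains a reference to itself.
node = {"@id": "http://example.org/a"}
node["self"] = node

def to_json_or_none(value):
    """Return JSON text, or None when the structure cannot be serialized as a tree."""
    try:
        return json.dumps(value)
    except ValueError:  # json.dumps raises ValueError on circular references
        return None

acyclic = {"@id": "http://example.org/a", "name": "A"}

print(to_json_or_none(node) is None)     # the cyclic structure has no JSON form
print(to_json_or_none(acyclic) is None)  # the acyclic one serializes normally
```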

@anatoly-scherbakov
Contributor

If we use the @/$ conversion (#11), and the user defines aliases like

{
  "@context": {
    "$id": "@id"
  }
}

(idea © #9), then after a naive conversion to YAML-LD and back to JSON-LD we will see

{
  "@context": {
    "@id": "@id"
  }
}

which is not even valid JSON-LD because

keywords cannot be overridden

Knowing that, we can implement this as an edge case where $ is not replaced by @ if the key is one that is being overridden. I think we can detect such keys as direct descendants of a @context, although my understanding of the JSON-LD spec is incomplete. If there are any other cases, I would be happy to learn about them in the discussion for #11.
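A rough sketch of the two behaviors described above, in plain Python over already-parsed data. Both function names are hypothetical, and the rule "leave keys that are direct children of @context untouched" is the heuristic proposed in this comment, not something defined by any specification.

```python
def naive_dollar_to_at(node):
    """Rename every key that starts with '$' so it starts with '@', recursively."""
    if isinstance(node, dict):
        return {
            ("@" + key[1:] if key.startswith("$") else key): naive_dollar_to_at(value)
            for key, value in node.items()
        }
    if isinstance(node, list):
        return [naive_dollar_to_at(item) for item in node]
    return node

def context_aware_dollar_to_at(node, in_context=False):
    """Same renaming, but keys directly inside a context are left untouched."""
    if isinstance(node, dict):
        out = {}
        for key, value in node.items():
            rename = key.startswith("$") and not in_context
            new_key = "@" + key[1:] if rename else key
            # Only the direct children of "@context" are protected.
            out[new_key] = context_aware_dollar_to_at(value, new_key == "@context")
        return out
    if isinstance(node, list):
        return [context_aware_dollar_to_at(item, in_context) for item in node]
    return node

# Parsed YAML-LD view of the example: "@context" was written as "$context".
yaml_ld_view = {"$context": {"$id": "@id"}, "$id": "http://example.org/a"}

print(naive_dollar_to_at(yaml_ld_view))          # context becomes {"@id": "@id"}
print(context_aware_dollar_to_at(yaml_ld_view))  # context keeps {"$id": "@id"}
```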

@gkellogg
Member

This was discussed during today's call: https://json-ld.org/minutes/2022-06-22/.

gkellogg added a commit that referenced this issue Jul 2, 2022
@ioggstream ioggstream added this to the -00 milestone Jul 5, 2022
@gkellogg gkellogg mentioned this issue Jul 6, 2022
@VladimirAlexiev
Contributor

VladimirAlexiev commented Jul 7, 2022

@ioggstream Round-trippability can be understood at different levels:

  • JSONLD-YAMLLD (always possible) and YAMLLD-JSONLD (not possible if YAML extensions are used)
  • JSONLD-RDF-YAMLLD and YAMLLD-RDF-JSONLD (always possible)

I fail at identifying a json-ld feature that is not supported in YAML-LD
we should state that yaml-ld MUST extend json-ld features.

Absolutely.
Because YAML is a super-set of JSON, we have one round-trip that I call the "default case":

  • JSONLD-YAMLLD-JSONLD

There's no question it's the most important base case, and we should agree that once and for all, and move on to discussing YAML extensions, options and fringe cases, because that's where the meat is. I.e., the default case is trivial and well-understood.

@gkellogg, does that allay your concerns, or am I under-estimating the default case?

I think the way to look at round-tripping is that parsing YAML-LD or JSON-LD documents to the internal representation should produce equivalent internal representations

Agree! Does JSON-LD define such an internal representation, and if so, where? Or is RDF that internal representation?

@nichtich your thoughts on bijectivity etc are relevant

I suppose transformation from JSON-LD to YAML-LD is out of the scope of YAML-LD specification anyway, isn't it?

On the contrary, it's part of the "default case" and thus very important.
It's also trivial, since JSON-YAML conversion is well known. So it can be just a paragraph in the spec.

@anatoly-scherbakov

not even a valid JSON-LD because "keywords cannot be overridden"

That's not right for two reasons:

1: if we accept #51, one can use it to effect uniform keyword aliasing in YAML like this:

"@context":
  $id: @id
  $type: @type
  $value: @value
  # etc

This will result in JSON like this

"@context": {
  "$id": "@id",
  "$type": "@type",
  "$value": "@value"
}

That's not the degenerate form you've shown.
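To make the effect of such aliases concrete, here is a deliberately simplified sketch of how a context like the one above lets a document use `$id` in place of `@id`. It is not the JSON-LD expansion algorithm: it only handles a single inline top-level context with string-valued aliases, and the names `keyword_aliases` and `resolve_keyword_aliases` are hypothetical.

```python
def keyword_aliases(context):
    """Collect terms whose definition is a plain keyword string, e.g. "$id": "@id"."""
    return {
        term: target
        for term, target in context.items()
        if isinstance(target, str) and target.startswith("@")
    }

def resolve_keyword_aliases(doc):
    """Rewrite aliased keys to their keywords and drop the context itself."""
    aliases = keyword_aliases(doc.get("@context", {}))

    def walk(node):
        if isinstance(node, dict):
            return {
                aliases.get(key, key): walk(value)
                for key, value in node.items()
                if key != "@context"
            }
        if isinstance(node, list):
            return [walk(item) for item in node]
        return node

    return walk(doc)

doc = {
    "@context": {"$id": "@id", "$type": "@type", "$value": "@value"},
    "$id": "http://example.org/a",
    "$type": "http://example.org/Thing",
}

print(resolve_keyword_aliases(doc))
```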

2: Where do you read "keywords cannot be overridden"?
In https://w3c.github.io/json-ld-syntax/#aliasing-keywords I read "Since keywords cannot be redefined, they can also not be aliased to other keywords".

  • So it's forbidden to write eg "@id": "@type"
  • But it's ok to write "@id": "@id" (not that I advocate it)

@gkellogg
Member

gkellogg commented Jul 7, 2022

Because YAML is a super-set of JSON, we have one round-trip that I call the "default case":

  • JSONLD-YAMLLD-JSONLD

There's no question it's the most important base case, and we should agree that once and for all, and move on to discussing YAML extensions, options and fringe cases, because that's where the meat is. I.e., the default case is trivial and well-understood.

@gkellogg, does that allay your concerns, or am I under-estimating the default case?

That's pretty much my view.

I think the way to look at round-tripping is that parsing YAML-LD or JSON-LD documents to the internal representation should produce equivalent internal representations

Agree! Does JSON-LD define such an internal representation, and if so, where? Or is RDF that internal representation?

JSON-LD defines the internal representation. While it could potentially be extended (e.g., different types of numbers), we'd need to carefully justify doing so.

@nichtich your thoughts on bijectivity etc are relevant

I suppose transformation from JSON-LD to YAML-LD is out of the scope of YAML-LD specification anyway, isn't it?

I would not say so. My view is that round-tripping means there is no semantic loss in turning JSON-LD -> YAML-LD -> JSON-LD or vice versa. Trying to get back to exact syntactic forms is a needless complication. Having both go through RDF would also be acceptable, and generating the results will of necessity involve applying contexts (compact form) or frames (framed form). It may be possible to reproduce the embedding structure in either direction without framing, flattening, or toRDF, but this is a nice-to-have artifact, not a requirement.

@anatoly-scherbakov

not even a valid JSON-LD because "keywords cannot be overridden"

That's not right for two reasons:

1: if we accept #51, one can use it to effect uniform keyword aliasing in YAML like this:

"@context":
  $id: @id
  $type: @type
  $value: @value
  # etc

This will result in JSON like this

"@context": {
  "$id": "@id",
  "$type": "@type",
  "$value": "@value"
}

That's not the degenerate form you've shown.

Yes. But we can go overboard in trying to encourage the use of $ keywords when, in many cases, using plain-word versions is preferable (e.g., id, type, ...).

2: Where do you read "keywords cannot be overridden"? In https://w3c.github.io/json-ld-syntax/#aliasing-keywords I read "Since keywords cannot be redefined, they can also not be aliased to other keywords".

  • So it's forbidden to write eg "@id": "@type"
  • But it's ok to write "@id": "@id" (not that I advocate it)

There are actually reasons for doing things like this, e.g., "@type": {"@id": "@type", "@container": "@set"}.

@gkellogg
Member

This issue was discussed in today's meeting.
