Deserialize designspace lib fields #311

cmyr · 2023-07-25T18:51:10Z

This was about as painful as expected, and requires us to manually glue together the xml data model with the plist model, using serde.

To briefly summarize the problem: XML data, by definition, does not have any defined semantics. Semantics are added on top of XML on a per-case basis, by indicating the document type inline
(https://en.wikipedia.org/wiki/Document_type_declaration).

The plist crate does implement derive for plist types, but they are based on the assumption that type information is being provided by some deserializer (e.g. for a specific file format). This is the right thing for the plist crate to do. Unfortunately quick_xml cannot provide this type information, since quick_xml does not know that we are deserializing a plist.

To make this all work, I have added a bunch of manual deserialize code. This code will only work when deserializing from XML; if in the future we wish to serialize to other formats, we will need to revise it.

Some notes:

This only implements deserialization, not serialization. (This is fine, since we do not currently serialize anyway, but we'll have to figure that out eventually)
this does not support the UID type, which is not accepted in XML plists

RickyDaMa · 2023-07-26T09:11:45Z

Wow this is insane, thanks for implementing this! I'll do a more thorough review when I have time (Friday or next week). I'll probably end up implementing serialisation as part of #309

cmyr · 2023-07-26T14:22:05Z

yea, serialization will likely be annoying too, but I'm not sure exactly how annoying. I guess we'll find out? 😬

This was about as painful as expected, and requires us to manually glue together the xml data model with the plist model, using serde. To briefly summarize the problem: XML data, by definition, does not have any defined semantics. Semantics are added on top of XML on a per-case basis, by indicating the document type inline (<https://en.wikipedia.org/wiki/Document_type_declaration>). The plist crate does implement derive for plist types, but they are based on the assumption that type information is being provided by some deserializer (e.g. for a specific file format). This is the right thing for the plist crate to do. Unfortunately quick_xml cannot provide this type information, since quick_xml does not know that we are deserializing a plist. To make this all work, I have added a bunch of manual deserialize code. This code will only work when deserializing from XML; if in the future we wish to serialize to other formats, we will need to revise it. Some notes: - This only implements deserialization, not serialization. (This is fine, since we do not currently serialize anyway, but we'll have to figure that out eventually) - this does not support the UID type, which is not accepted in XML plists

RickyDaMa

That was an insane read, thanks for going to all the effort here of making this!

I guess you're making the trade-off of being more efficient at the cost of having to re-implement the whole plist spec, whereas my approach was trying to find some middle ground to not re-implement the wheel, at the cost of intermediary allocations.

I still don't fully understand serde's visitor pattern to be able to read this super-critically, but since your tests pass then I'm happy to start making use of this 😄

RickyDaMa · 2023-07-28T10:19:00Z

src/serde_xml_plist.rs

+            //FIXME: remove this + base64 dep when/if we merge
+            //<https://github.com/ebarnard/rust-plist/pull/122>
+            let b64_str = map.next_value::<&str>()?;
+            base64_standard
+                .decode(b64_str)
+                .map(Value::Data)
+                .map_err(|e| A::Error::custom(format!("Invalid XML data: '{e}'")))
+        }


It's been merged 🚀 Just need to see if ebarnard releases a new version onto crates.io for us

cmyr · 2023-07-28T14:46:59Z

Huh interesting, I honestly kind of appreciate the compactness of your version, maybe we should take that instead? I'm not even sure there's much performance difference to be honest. Want to PR it, maybe including the tests from this branch?

cmyr · 2023-07-28T14:47:54Z

Although thinking about it a bit more, you are going to need more logic to handle date/data/integer, at least.

edit:

Actually now that I think about it I'm not sure your version is working as expected. Maybe try moving over the one big catch-all test I have, and see what happens?

RickyDaMa · 2023-07-28T16:15:29Z

Yeah mine wasn't working fully, I stopped investing too much time in it and reached out in the Google Chat, at which point you went crazy 😁 The main hitch I'm having is nested dictionaries, I'm getting cryptic serde errors no matter what I try it feels like. If you think it's worth trying to bring "up to spec" I can keep working on it.

Ultimately it's a trade-off of brevity/clarity for efficiency

cmyr · 2023-07-28T17:04:21Z

Thinking about this more I suspect that your approach just may not be tractable, because XML alone doesn't provide enough structural information to identify the appropriate types? I'm not positive, but if we're happy enough with my version then I'm okay with merging this as-is.

cmyr · 2023-07-28T17:11:53Z

Okay I've talked myself into thinking this is the right approach, will merge and then you can use this to figure out how to save? Happy to help if you get stuck.

cmyr force-pushed the designspace-lib branch 2 times, most recently from 0be3e8a to a0941b9 Compare July 25, 2023 19:21

cmyr force-pushed the designspace-lib branch from a0941b9 to 5017762 Compare July 26, 2023 14:20

cmyr force-pushed the designspace-lib branch from 5017762 to 0fc1096 Compare July 26, 2023 15:41

RickyDaMa approved these changes Jul 28, 2023

View reviewed changes

RickyDaMa reviewed Jul 28, 2023

View reviewed changes

cmyr merged commit 081b73d into master Jul 28, 2023
4 checks passed

cmyr deleted the designspace-lib branch July 28, 2023 17:12

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Deserialize designspace lib fields #311

Deserialize designspace lib fields #311

cmyr commented Jul 25, 2023

RickyDaMa commented Jul 26, 2023

cmyr commented Jul 26, 2023

RickyDaMa left a comment

RickyDaMa Jul 28, 2023

cmyr commented Jul 28, 2023

cmyr commented Jul 28, 2023 •

edited

Loading

RickyDaMa commented Jul 28, 2023

cmyr commented Jul 28, 2023

cmyr commented Jul 28, 2023

Deserialize designspace lib fields #311

Deserialize designspace lib fields #311

Conversation

cmyr commented Jul 25, 2023

RickyDaMa commented Jul 26, 2023

cmyr commented Jul 26, 2023

RickyDaMa left a comment

Choose a reason for hiding this comment

RickyDaMa Jul 28, 2023

Choose a reason for hiding this comment

cmyr commented Jul 28, 2023

cmyr commented Jul 28, 2023 • edited Loading

RickyDaMa commented Jul 28, 2023

cmyr commented Jul 28, 2023

cmyr commented Jul 28, 2023

cmyr commented Jul 28, 2023 •

edited

Loading