Deserialization for elements and attributes with ":" in name #64
Could you please provide more details? |
https://raw.githubusercontent.com/HUPO-PSI/mzML/master/schema/schema_1.1/mzML1.1.0.xsd The xml: and xs: in tag and attribute names would not parse (using Serde rename to set the exact name for a Rust struct member). After renaming with _ instead of :, I was able to parse it. |
I mean, details on how you see this working on the Rust side. Obviously |
In the old xml parser I simply stripped such prefixes. I have no idea what they even do, and if everyone just ignores them anyway, I don't see a reason to keep them. |
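For readers unsure what "stripping such prefixes" amounts to, here is a minimal sketch using only the standard library. The helper name `local_name` is hypothetical and is not taken from this crate or from the old parser; it only illustrates the idea of keeping the part of a qualified name after the first colon.

```rust
/// Returns the local part of a qualified XML name, dropping any `prefix:`.
/// Hypothetical helper for illustration; not part of any crate's API.
fn local_name(qualified: &str) -> &str {
    match qualified.split_once(':') {
        // "xs:element" splits into ("xs", "element"); keep the second half.
        Some((_prefix, local)) => local,
        // No colon means the name has no prefix to strip.
        None => qualified,
    }
}
```

With this sketch, `local_name("xs:element")` yields `"element"` and an unprefixed name such as `"link"` is returned unchanged. Note that the prefix is discarded entirely, so two names that differ only in prefix become indistinguishable.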
Yeah, that's what I'm doing too. |
@oli-obk While you're here - could you please reply to serde-deprecated/xml#35 (comment)? I left it a while ago but still don't know if it's desirable :) |
@RReverser I'm accustomed to serde libraries using #[serde(rename)] for such cases rather than throwing out part of the identifier. That's what I've generally done with CSV files, and I think I've had the problem with JSON as well. A bit of googling shows that this part of the identifier is used for namespaces, so beyond being counterintuitive (at least to me), stripping it seems like it would lead to potential name collisions and prevent validation (items with a wrong or nonexistent namespace could not be distinguished from those in the expected namespace). @dtolnay may have a better overview of what libraries tend to do, however. |
I've encountered a problem with wordpress xml which has
|
I did a workaround for now: pub encoded: Vec<String>, // encoded[0] is `content:encoded` Though it's not very reliable. |
Hello, stumbled on this issue. Another way to reproduce is also an XML such as:
so no suggested workaround works (using serde Can you advise on the suggested way to cope with this without touching the source XML? Sadly, I'm already thinking of pre-processing the XML as suggested by @spease, or dropping this crate and parsing the XML directly (e.g. with Thank you in advance for any hint (I just tried this library, so I may have missed something). EDIT: on second thought, this seems to work
and ignore the other |
The problem is that this library has very limited support for namespaces. The deserializer will ignore the namespace. The serializer is currently incapable of generating a document using namespaces. @apiraino, are you sure that the |
hey @punkstarman thanks for the reply. Damn, you're right, the field is set to a default empty string (I was confused by too many fields). Besides the lack of support for namespaces (which is a feature), the real issue I see is that the parser panics when a tag with a namespace is found. Is there a way to avoid this? I'd rather avoid pre-processing the XML. I hope that this library's lifecycle will move forward; it's actually the only good option to work on XML files the way we're used to with serde. Thanks for working on this library! |
The parser doesn't panic when it encounters a tag with namespace. It just lops off the namespace part and produces a field with the remainder. The parser panics when it tries to fit two XML elements with the same name into a single Rust struct field that is of collection type. |
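To make the collision described here concrete, the following sketch shows how sibling elements that differ only by prefix end up under one name once the prefix is dropped. It is a standalone illustration with hypothetical helpers (`local_name`, `prefix_collisions`), not the deserializer's actual code.

```rust
use std::collections::BTreeMap;

/// Drops any `prefix:` from a qualified XML name (hypothetical helper).
fn local_name(qualified: &str) -> &str {
    qualified.split_once(':').map_or(qualified, |(_, local)| local)
}

/// Groups qualified names by their local name and keeps only the groups
/// where several names collapse into the same local name.
fn prefix_collisions<'a>(names: &[&'a str]) -> BTreeMap<&'a str, Vec<&'a str>> {
    let mut groups: BTreeMap<&'a str, Vec<&'a str>> = BTreeMap::new();
    for &name in names {
        groups.entry(local_name(name)).or_default().push(name);
    }
    // A group with a single member has nothing to collide with.
    groups.retain(|_, members| members.len() > 1);
    groups
}
```

Given the sibling names `["title", "link", "atom:link"]`, the only reported group is `link`, containing both `link` and `atom:link`. That is the situation in which two elements compete for a single struct field.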
This seems like it should be an error rather than a panic. An application could recover from it. |
@spease , panic was a poor choice of words. It is in fact an error (for example see #64 (comment)). |
I am trying to parse an RSS feed and the part that is relevant looks like this <link>...</link>
<atom:link href="..." rel="self" type="application/rss+xml"/> I want to get the value of The thread doesn't seem to have a concrete solution for this but posting anyway in case someone came up with one and just didn't reply. I tried setting Is there perhaps a different workaround for this since the Any news on this? |
This is a considerable problem when trying to parse DTDs. Replacing the colons with underscores allows for parsing otherwise.
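The colon-to-underscore workaround mentioned in this thread can be sketched as a pre-processing pass over the raw XML text. The function name `underscore_prefixes` is hypothetical, and the sketch makes simplifying assumptions: it treats everything between `<` and `>` as a tag, skips quoted attribute values, and does not special-case comments, CDATA sections, or DOCTYPE declarations, so it is not a general XML rewriter.

```rust
/// Replaces ':' with '_' in element and attribute names, leaving text
/// content and quoted attribute values untouched.
/// Assumption: no comments, CDATA sections, or DOCTYPE blocks in the input.
fn underscore_prefixes(xml: &str) -> String {
    let mut out = String::with_capacity(xml.len());
    let mut in_tag = false; // true between '<' and the matching '>'
    let mut quote: Option<char> = None; // active quote character inside a tag
    for ch in xml.chars() {
        match (in_tag, quote, ch) {
            (false, _, '<') => {
                in_tag = true;
                out.push(ch);
            }
            (true, None, '>') => {
                in_tag = false;
                out.push(ch);
            }
            // Opening quote of an attribute value.
            (true, None, '"') | (true, None, '\'') => {
                quote = Some(ch);
                out.push(ch);
            }
            // Closing quote must match the opening one.
            (true, Some(q), c) if c == q => {
                quote = None;
                out.push(ch);
            }
            // A colon in a tag or attribute name: rewrite it.
            (true, None, ':') => out.push('_'),
            // Everything else (text, attribute values) is copied as-is.
            _ => out.push(ch),
        }
    }
    out
}
```

After this pass, a field can be matched with a rename such as `atom_link` instead of `atom:link`. The trade-off is that the document no longer carries real namespace prefixes, so this only suits read-only parsing where the original names do not need to be written back.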