uxsdcxx is a tool to generate a PugiXML-based C++ reader, validator and writer from an XSD schema. It can generate code for a subset of XSD 1.0.
It currently supports:
- Simple types, with the following exceptions:
  - xs:lists are just read into a string.
  - Only enumerations are supported as xs:restrictions of simple types.
  - Restricted string types such as IDREF, NCName etc. aren't validated.
- Complex types.
- Model groups (all, sequence and choice).
- Elements.
- Attributes, except xs:anyAttribute.
  - Default values are supported.
  - When writing, non-zero default values are always written out.
It currently does not support:
- Anything that PugiXML can't read:
  - XML namespaces
Install with pip:
pip install uxsdcxx
Generate code from a schema with:
uxsdcxx.py foo.xsd
Two files, foo_uxsdcxx.h and foo_uxsdcxx.cpp, will be created.
All uxsdcxx functions live in the namespace uxsd.
For every root element in the schema, a class is generated with load and write functions. For instance,
<xs:element name="foo">
    <xs:complexType>
        ...
    </xs:complexType>
</xs:element>
results in this C++ code:
class foo : public t_foo {
public:
    pugi::xml_parse_result load(std::istream &is);
    void write(std::ostream &os);
};
load() loads from an input stream into this root element's structs, and write() writes its content to a given output stream.
Note that root elements with simple types are not supported.
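A load/write round trip over such a root class could be sketched as below. The helper round_trip, and the stand-ins fake_parse_result and fake_root, are hypothetical names introduced only for illustration and are not part of uxsdcxx; the stand-ins exist so the sketch compiles without PugiXML or a generated header. The only assumption about the real classes is the load()/write() pair shown above and the fact that pugi::xml_parse_result converts to bool.

```cpp
#include <cassert>
#include <istream>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical helper: works with any class exposing the load()/write()
// pair described above. Returns false when load() reports a failure
// (the result object is tested through its conversion to bool).
template <class Root>
bool round_trip(Root &root, std::istream &is, std::ostream &os) {
    auto result = root.load(is);
    if (!result) return false;  // do not write anything after a failed load
    root.write(os);
    return true;
}

// Stand-in for pugi::xml_parse_result, used only to keep this sketch
// self-contained.
struct fake_parse_result {
    bool ok;
    explicit operator bool() const { return ok; }
};

// Stand-in for a generated root class: it stores the raw input instead of
// parsing it, and treats an empty input as a load failure.
struct fake_root {
    std::string content;
    fake_parse_result load(std::istream &is) {
        std::getline(is, content, '\0');  // read everything up to EOF
        return fake_parse_result{!content.empty()};
    }
    void write(std::ostream &os) { os << content; }
};
```

With real generated code, the same helper would presumably be called with the generated root class and a std::ifstream opened on the input XML file, in place of the stand-ins used here.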
uxsdcxx generates global pools to store multiply-occurring types.
<xs:complexType name="foo">
    <xs:sequence>
        <xs:element name="bar" type="bar" maxOccurs="unbounded"/>
    </xs:sequence>
</xs:complexType>
generates:
extern std::vector<t_bar> bar_pool;
[...]
struct foo {
    collapsed_vec<t_bar, bar_pool> bars;
};
A collapsed_vec is a size and an offset pointing into a pool. It provides contiguous memory while being able to store an unbounded number of elements. The main limitation of a collapsed_vec is that it's insertable only when its end points to the end of the pool.
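The idea of a size plus an offset into a shared pool can be illustrated with the minimal sketch below. It is not the code uxsdcxx generates: the class name collapsed_vec_sketch, its member names and the int_pool variable are assumptions made for this example. It stores an index rather than a pointer, so views stay valid when the pool's std::vector reallocates.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Illustrative sketch only: a view of `size_` elements starting at
// `offset_` inside a global pool passed as a template argument.
template <class T, std::vector<T> &Pool>
class collapsed_vec_sketch {
    std::size_t offset_ = 0;
    std::size_t size_ = 0;

public:
    std::size_t size() const { return size_; }
    T &operator[](std::size_t i) { return Pool[offset_ + i]; }
    const T &operator[](std::size_t i) const { return Pool[offset_ + i]; }

    // An empty view can start at the current end of the pool; a non-empty
    // one can grow only while its end coincides with the end of the pool.
    bool insertable() const {
        return size_ == 0 || offset_ + size_ == Pool.size();
    }

    void push_back(const T &value) {
        if (size_ == 0) offset_ = Pool.size();  // claim the pool's tail
        assert(insertable());
        Pool.push_back(value);
        ++size_;
    }
};

// A namespace-scope pool, so it can be used as a template argument.
std::vector<int> int_pool;
```

In this sketch, once a second view has appended to the same pool, the first view is no longer insertable, which mirrors the limitation described above. Elements of one parent therefore have to be appended in one run before moving on to the next.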
Strings constitute a special case: a char_pool is generated for them to prevent many small allocations.
<xs:complexType name="foo">
    <xs:sequence>
        <xs:element name="bar" type="string" maxOccurs="unbounded"/>
    </xs:sequence>
</xs:complexType>
generates:
extern char_pool_impl char_pool;
struct foo {
    const char * bar;
};
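How a pool can avoid many small allocations is sketched below. The implementation of char_pool_impl is not shown in this document, so the chunked design, the class name char_pool_sketch, the constant kChunkSize and the member names are all assumptions chosen for illustration: strings are copied back to back into large chunks, and each returned const char * stays valid until the pool is cleared.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <memory>
#include <vector>

// Assumed chunk size for this sketch.
constexpr std::size_t kChunkSize = 4096;

// Illustrative sketch only, not uxsdcxx's char_pool_impl.
class char_pool_sketch {
    std::vector<std::unique_ptr<char[]>> chunks_;
    std::size_t used_ = kChunkSize;  // "full", so the first add() allocates

public:
    // Copies `s` (including its terminator) into the pool and returns a
    // pointer that remains valid until clear() is called.
    const char *add(const char *s) {
        const std::size_t n = std::strlen(s) + 1;
        if (n > kChunkSize) {
            // Oversized strings get a dedicated chunk.
            chunks_.emplace_back(new char[n]);
            std::memcpy(chunks_.back().get(), s, n);
            used_ = kChunkSize;  // force a fresh chunk for the next string
            return chunks_.back().get();
        }
        if (used_ + n > kChunkSize) {
            chunks_.emplace_back(new char[kChunkSize]);
            used_ = 0;
        }
        char *dst = chunks_.back().get() + used_;
        std::memcpy(dst, s, n);
        used_ += n;
        return dst;
    }

    // Frees every chunk; all pointers handed out before become invalid.
    void clear() {
        chunks_.clear();
        used_ = kChunkSize;
    }

    std::size_t chunk_count() const { return chunks_.size(); }
};
```

Under this design, two short strings added in a row land in the same chunk, one right after the other, so the number of heap allocations grows with the total text size rather than with the number of strings.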
The pools are freed by using the utility functions uxsd::clear_pools() and uxsd::clear_strings(). clear_strings is provided separately since it can be useful to keep the strings around after freeing the generated structures.
You can find the generated types for your schema in the output header file foo_uxsdcxx.h. The rules for mapping XSD types to C++ types are as follows:
- <xs:complexType> definitions correspond to C++ structs t_{name}. For complexTypes in global scope, name refers to the name attribute of the type. For complexTypes defined inside elements, name refers to the name attribute of the parent element.
  - An <xs:attribute> generates a struct field with a C++ type corresponding to its <xs:simpleType> as defined below.
  - A model group such as <xs:choice>, <xs:sequence> or <xs:all> generates struct fields with C++ types corresponding to the types of the elements inside.
    - If an element can occur more than once, a collapsed_vec<T, T_pool> is generated.
    - If an element can occur zero times, another field bool has_T is generated to indicate whether the element is found.
- <xs:simpleType> can take many forms.
  - <xs:union> corresponds to a tagged union type, such as:
struct union_foo {
    type_tag tag;
    union {
        double as_double;
        int as_int;
    };
};
  - <xs:list> generates a const char *.
  - Atomic builtins, such as xs:string or xs:int, generate a field of the corresponding C++ type (const char *, int...).
  - <xs:restriction>s of simple types are not supported, except one case where an <xs:string> is restricted to <xs:enumeration> values. C++ enums are generated for such constructs. As an example, the following XSD:
<xs:simpleType name="filler">
    <xs:restriction base="xs:string">
        <xs:enumeration value="FOO"/>
        <xs:enumeration value="BAR"/>
        <xs:enumeration value="BAZ"/>
    </xs:restriction>
</xs:simpleType>
generates a C++ enum:
enum class enum_filler {UXSD_INVALID = 0, FOO, BAR, BAZ};
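Code consuming types produced by the mapping rules above might look like the following sketch. The enum_filler and union_foo definitions are copied from this document. Everything else is an assumption made for illustration: the values of type_tag (which is referenced above but not defined), the helper names lex_filler_sketch and union_foo_as_double, and the struct t_example_sketch with its has_filler field.

```cpp
#include <cassert>
#include <cstring>

// As shown above for the "filler" simple type.
enum class enum_filler {UXSD_INVALID = 0, FOO, BAR, BAZ};

// Hypothetical helper (not uxsdcxx API): maps text to the enum and falls
// back to UXSD_INVALID for values outside the enumeration.
inline enum_filler lex_filler_sketch(const char *in) {
    if (std::strcmp(in, "FOO") == 0) return enum_filler::FOO;
    if (std::strcmp(in, "BAR") == 0) return enum_filler::BAR;
    if (std::strcmp(in, "BAZ") == 0) return enum_filler::BAZ;
    return enum_filler::UXSD_INVALID;
}

// Assumed tag values; the document does not list them.
enum class type_tag { DOUBLE, INT };

// As shown above for <xs:union>.
struct union_foo {
    type_tag tag;
    union {
        double as_double;
        int as_int;
    };
};

// Reads whichever member the tag marks as active.
inline double union_foo_as_double(const union_foo &u) {
    switch (u.tag) {
    case type_tag::DOUBLE: return u.as_double;
    case type_tag::INT:    return static_cast<double>(u.as_int);
    }
    return 0.0;  // unreachable for valid tags
}

// Hypothetical struct for an element "filler" that may occur zero times:
// the has_ flag says whether the value field is meaningful.
struct t_example_sketch {
    bool has_filler;
    enum_filler filler;
};

inline enum_filler filler_or_invalid(const t_example_sketch &e) {
    return e.has_filler ? e.filler : enum_filler::UXSD_INVALID;
}
```

Checking the tag before touching a union member, and the has_ flag before reading an optional field, keeps such code from reading values that were never set during loading.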