XMLs in Dataspecer #839

Open
sstenchlak opened this issue Jan 6, 2025 · 0 comments · May be fixed by #845

The current structure of PSM in Dataspecer is too confusing and, in some cases, insufficient for generating XSD schemas. This issue aims to summarize what needs to be changed.

Currently, there is a need to use generated XSD schemas in two ways: (i) for direct validation of XML documents, which requires defining at least one root xsd:element, and (ii) for referencing complexType definitions from another XSD schema, which requires defining at least one root xsd:complexType.

The main idea of the structural model (PSM) in Dataspecer is based on a tree structure where nodes represent classes, and edges represent associations.

Classes correspond to complexType, while associations and attributes map to elements. Since the root of the PSM tree is a class (therefore a complexType), in case (i), it is necessary to specify the name of the root element that will use the given complexType. In some cases, the element name can match the type name, but, generally, we want to allow the user to specify a custom name.
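As a minimal sketch of the two use cases, assuming no target namespace and purely illustrative names (`VehicleType`, `car`, `registrationNumber`), a single schema could expose both a referencable type and a root element whose name differs from the type name:

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
    <!-- (ii) Named top-level type that another XSD schema can reference. -->
    <xs:complexType name="VehicleType">
        <xs:sequence>
            <xs:element name="registrationNumber" type="xs:string"/>
        </xs:sequence>
    </xs:complexType>

    <!-- (i) Top-level element so that documents can be validated directly.
         The name "car" is a hypothetical user-chosen root element name. -->
    <xs:element name="car" type="VehicleType"/>
</xs:schema>
```

The type is generated from the root class of the PSM tree, whereas the element name is additional information that the PSM would have to store separately.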

Additionally, the cardinality of the root element must be defined. While this can technically be achieved directly in PSM, there is a consensus that the root of the PSM should represent the class being described, not helper constructs to achieve different cardinalities and metadata objects.

In cases of 1..1 cardinality, it is sufficient to use the mentioned root element directly. For other cardinalities, an additional element needs to be specified as the actual root, containing the individual elements that can have arbitrary cardinalities (typically 0..*).
This approach is similar to JSON Schema, where the cardinality of the root element is also specified, and the options are: using a single object as the root (1..1), using an array as the root (0..*), or wrapping the array in an object with a single property, which can be customized by the user.
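For a cardinality other than 1..1, the wrapping root could look roughly like the sketch below. The names `vehicles` and `vehicle` are assumptions for illustration; in practice the wrapper name would be set by the user.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
    <xs:complexType name="VehicleType">
        <xs:sequence>
            <xs:element name="registrationNumber" type="xs:string"/>
        </xs:sequence>
    </xs:complexType>

    <!-- Hypothetical wrapping element acting as the actual document root. -->
    <xs:element name="vehicles">
        <xs:complexType>
            <xs:sequence>
                <!-- The described class, here with cardinality 0..* -->
                <xs:element name="vehicle" type="VehicleType"
                            minOccurs="0" maxOccurs="unbounded"/>
            </xs:sequence>
        </xs:complexType>
    </xs:element>
</xs:schema>
```

This corresponds to the third JSON Schema option: an object with a single property that holds the array.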

It is also clear that OR constructs, which operate at the class level, are misused in the XML context, leading to confusion. Consider the following example:

Schema describing a vehicle. A vehicle has a registration number, type, and owner. The owner is either a physical person or a legal entity. If the owner is a physical person, they have a first name and last name. If it is a legal entity, it has a company name.

In the XML context, we would expect the following schema:

<xs:complexType name="VehicleType">
    <xs:sequence>
        <xs:element name="registrationNumber" type="xs:string"/>
        <xs:element name="type" type="xs:string"/>
        <xs:element name="owner" type="OwnerType"/>
    </xs:sequence>
</xs:complexType>

<xs:complexType name="OwnerType">
    <xs:choice>
        <xs:element name="PhysicalPerson" type="PhysicalPersonType"/>
        <xs:element name="LegalPerson" type="LegalPersonType"/>
    </xs:choice>
</xs:complexType>

<xs:complexType name="PhysicalPersonType">
    <xs:sequence>
        <xs:element name="firstName" type="xs:string"/>
        <xs:element name="lastName" type="xs:string"/>
    </xs:sequence>
</xs:complexType>

<xs:complexType name="LegalPersonType">
    <xs:sequence>
        <xs:element name="companyName" type="xs:string"/>
    </xs:sequence>
</xs:complexType>
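
For illustration, an instance fragment matching these types might look as follows. The schema above declares only types, so the root element name `vehicle` and all values are assumptions.

```xml
<vehicle>
    <registrationNumber>1AB 2345</registrationNumber>
    <type>car</type>
    <owner>
        <!-- Extra level of nesting introduced by the choice between elements -->
        <PhysicalPerson>
            <firstName>Jane</firstName>
            <lastName>Doe</lastName>
        </PhysicalPerson>
    </owner>
</vehicle>
```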

Notice that, to distinguish between the types of persons, an additional element is used here: either <PhysicalPerson> or <LegalPerson>. This differs from what I would expect from our definition of OR, which should operate at the content level of a complexType. Ideally, it would resemble something like this:

<xs:complexType name="OwnerType">
    <xs:choice>
        <xs:group ref="PhysicalPersonGroup"/>
        <xs:group ref="LegalPersonGroup"/>
    </xs:choice>
</xs:complexType>

<xs:group name="PhysicalPersonGroup">
    <xs:sequence>
        <xs:element name="firstName" type="xs:string"/>
        <xs:element name="lastName" type="xs:string"/>
    </xs:sequence>
</xs:group>

<xs:group name="LegalPersonGroup">
    <xs:sequence>
        <xs:element name="companyName" type="xs:string"/>
    </xs:sequence>
</xs:group>
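
With the group-based variant, the content of `owner` would be chosen directly, without an intermediate element. The following fragments sketch the two alternatives, assuming an enclosing element named `owner` of type `OwnerType`; the values are made up.

```xml
<owners>
    <!-- Alternative 1: content of PhysicalPersonGroup -->
    <owner>
        <firstName>Jane</firstName>
        <lastName>Doe</lastName>
    </owner>

    <!-- Alternative 2: content of LegalPersonGroup -->
    <owner>
        <companyName>Example Ltd.</companyName>
    </owner>
</owners>
```

The `owners` wrapper exists only to keep the example well-formed as a single document; it is not part of the schema above.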

Containers of type choice and sequence will therefore be introduced, which can be applied at the level of attributes and associations. We would also need non-interpreted classes and associations to create wrapping elements.

In general, we need to implement the following features:

  • Set root XML element name.
  • Set root class cardinality and wrapping XML element name.
  • Use non-interpreted class as a root.
  • Add non-interpreted association to non-interpreted classes.
  • Use containers such as sequence and choice.
@sstenchlak sstenchlak added discussion xml-generators XML generators (XSD, XSLT) psm-editor UI for editing PSM schemas labels Jan 6, 2025
@sstenchlak sstenchlak linked a pull request Jan 8, 2025 that will close this issue