Private-use code list synthesis using XSLT

$Date: 2003/10/26 19:18:30 $(UTC)


Table of Contents

1. Introduction
2. Inputs to the stylesheet
3. Arbitrary regularly-formed XML expressions of coded values
3.1. Sample invocations in MSDOS
3.2. Sample cross-platform invocation using Apache Ant

1.  Introduction

Code list definitions in UBL are enumerations of coded values against which a UBL document can be validated. A number of code list schema fragments are shipped as standard, stock definitions containing collections of meaningful values. Also included are placebo definitions that permissively accept any name token as a valid value.

Implementers and users of UBL may wish to fabricate private-use code lists to use in place of any of the stock or placebo code lists shipped as part of UBL. The XSLT stylesheet extcode2ubl.xsl, supplied as part of UBL 1.0, accepts an arbitrary XML definition of a collection of enumerated values and creates a usable validation fragment that can be incorporated in UBL.

Two modes of operation of the stylesheet are available: it produces either a private-use code list definition file or a simplified, non-UBL-conformant code list definition file. The second mode is available should it ever be necessary to produce a prototypical W3C Schema fragment from an arbitrary regularly-structured list of coded values without needing to be UBL-compliant.

2.  Inputs to the stylesheet

The stylesheet does not have a main input XML source file, as all input information is communicated through command-line parameters that are passed to the stylesheet logic. Thus, any XML document can be specified during invocation, as it will be ignored. Given that the stylesheet itself is a well-formed XML document, these examples use the stylesheet file as the ignored source XML document. This design ensures that the resource URI strings recorded in the output for documentary purposes are bona fide: the strings supplied as invocation parameters are the very values used to locate the resources during transformation.

Three parameters are mandatory:

  • processDate= is specified for documentary purposes so that the date on which the output file was created can be captured in the comments of the synthesized output

  • externalURL= locates the XML expression of the set of coded values from which the enumeration is to be gleaned, and is also recorded in the output expression for documentary purposes

  • elementType= the element type (including namespace prefix, not namespace URI) of the externally-sourced XML-coded enumeration

    • for W3C Schema expressions, the element type is along the lines of xsd:enumeration, with the prefix matching the actual prefix used in the fragment being addressed (note that the otherwise optional attributeName= parameter is required in this case)

    • when the enumerated values are maintained as element text content in the enumeration expression, no attributeName= parameter is required

Two parameters are optional:

  • attributeName= the attribute name (including any namespace prefix, not the namespace URI) of the externally-sourced XML-coded enumeration found in the supplied elementType= parameter

    • for W3C Schema, the attribute name is value

  • placeboURL= is specified when the desired output is a UBL-conformant private-use code list

    • this parameter points to the placebo code list definition W3C Schema fragment used as a template after which the output private-use code list is modeled

    • when absent, the output code list is modeled after a generic non-UBL W3C Schema fragment that is defined as a top-level element in the stylesheet

Note again that the nature of the output depends on the presence or absence of the placeboURL= parameter.
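
The parameter rules in this section can be summarized in a short sketch. The Python below is not part of the UBL distribution and does not reproduce the stylesheet's internal logic; the function name check_parameters and the two mode labels are hypothetical, chosen only for illustration.

```python
# Illustrative sketch only: mirrors the parameter rules described in this
# section, not the actual logic inside extcode2ubl.xsl.

MANDATORY = ("processDate", "externalURL", "elementType")


def check_parameters(params):
    """Return the output mode implied by a dict of invocation parameters.

    Raises ValueError when any of the three mandatory parameters is
    missing or empty. The mode labels returned are hypothetical names.
    """
    missing = [name for name in MANDATORY if not params.get(name)]
    if missing:
        raise ValueError("missing mandatory parameter(s): " + ", ".join(missing))
    # The presence of placeboURL alone selects the UBL-conformant output.
    if params.get("placeboURL"):
        return "ubl-private-use"
    return "generic-schema"
```

Calling check_parameters with only the three mandatory parameters yields the generic mode; adding placeboURL switches to the private-use mode, and attributeName has no bearing on the choice.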

3.  Arbitrary regularly-formed XML expressions of coded values

Consider the following regularly-formed XML expression of a set of coded values using element text content in the testelem.xml file:

<test1>
  <elem>CODE1</elem>
  <elem>CODE2</elem>
  <elem>CODE3</elem>
  <elem>CODE4</elem>
  <elem>CODE5</elem>
</test1>

To have the stylesheet recognize the above five coded values, the parameter used would be: elementType=elem

Consider the following regularly-formed XML expression of a set of coded values using attribute content in the testattr.xml file:

<test>
  <elem attr="CODE1"/>
  <elem attr="CODE2"/>
  <elem attr="CODE3"/>
  <elem attr="CODE4"/>
  <elem attr="CODE5"/>
</test>

To have the stylesheet recognize the above five coded values, the parameters used would be: elementType=elem attributeName=attr
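
The effect of the two parameters on the two expressions above can be sketched outside XSLT. The Python below is only an illustration under stated assumptions: the function name coded_values is hypothetical, the samples are shortened to two values each, and namespaces are ignored, so it does not reflect how the stylesheet itself matches prefixed names.

```python
import xml.etree.ElementTree as ET


def coded_values(xml_text, element_type, attribute_name=None):
    """Collect coded values from every element named element_type.

    With attribute_name, each value is read from that attribute;
    without it, the element's text content is used.
    Namespaces are ignored in this sketch.
    """
    root = ET.fromstring(xml_text)
    values = []
    for element in root.iter(element_type):
        if attribute_name is None:
            values.append((element.text or "").strip())
        else:
            value = element.get(attribute_name)
            if value is not None:
                values.append(value)
    return values


# Shortened versions of the two sample expressions shown above.
TESTELEM = "<test1><elem>CODE1</elem><elem>CODE2</elem></test1>"
TESTATTR = '<test><elem attr="CODE1"/><elem attr="CODE2"/></test>'

print(coded_values(TESTELEM, "elem"))          # ['CODE1', 'CODE2']
print(coded_values(TESTATTR, "elem", "attr"))  # ['CODE1', 'CODE2']
```

Both forms yield the same list of codes; only the place where each code is stored differs, which is what the attributeName= parameter conveys.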

3.1.  Sample invocations in MSDOS

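The Saxon XSLT processor (http://saxon.sf.net) is used in the following examples, though any conforming XSLT processor may be used with the appropriate invocation parameter conventions. Each command is shown wrapped across several lines for readability and is entered as a single line.

To create a generic W3C Schema fragment from an arbitrary set of coded text values held in XML elements (first command) or in XML attributes (second command), one would use: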

saxon -o output1.xsd extcode2ubl.xsl extcode2ubl.xsl externalURL=testelem.xml 
      processDate=20031026-1910z elementType=elem

saxon -o output2.xsd extcode2ubl.xsl extcode2ubl.xsl externalURL=testattr.xml 
      processDate=20031026-1910z elementType=elem attributeName=attr
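
The two invocations above share one shape: the output file, the stylesheet named twice (once as the ignored source document and once as the stylesheet), then name=value parameters. A minimal sketch of assembling such an argument list follows. The helper name saxon_command is hypothetical, and the launcher name saxon is simply taken from the examples; the sketch only builds the list and does not run anything.

```python
def saxon_command(output, stylesheet, external_url, process_date,
                  element_type, attribute_name=None, placebo_url=None):
    """Build the argument list for one invocation, in the order shown above.

    The stylesheet file is passed twice because it doubles as the
    ignored source document.
    """
    command = ["saxon", "-o", output, stylesheet, stylesheet,
               "externalURL=" + external_url,
               "processDate=" + process_date,
               "elementType=" + element_type]
    # Optional parameters are appended only when supplied.
    if attribute_name is not None:
        command.append("attributeName=" + attribute_name)
    if placebo_url is not None:
        command.append("placeboURL=" + placebo_url)
    return command


# Joining the list reproduces the first command above as a single line.
print(" ".join(saxon_command("output1.xsd", "extcode2ubl.xsl",
                             "testelem.xml", "20031026-1910z", "elem")))
```

Keeping the arguments as a list, rather than one string, avoids quoting problems if the list is later handed to a process launcher.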

To create a private-use UBL-compliant W3C Schema fragment from an arbitrary set of coded text values in XML attributes, one would use:

saxon -o private-AccountType.xsd extcode2ubl.xsl extcode2ubl.xsl
      externalURL=testattr.xml processDate=20031026-1910z 
      elementType=elem attributeName=attr
      placeboURL=UBL-CodeList-AccountType-Placebo-Demo.xsd

To create a private-use UBL-compliant W3C Schema fragment from a publicly-available W3C Schema fragment expressing an enumeration of coded values, one would use:

saxon -o private-CurrencyCode.xsd extcode2ubl.xsl extcode2ubl.xsl
      externalURL=http://www.unece.org/etrades/unedocs/repository/codelists/xml/CurrencyCode.xsd 
      processDate=20031026-1910z elementType=enumeration attributeName=value
      placeboURL=UBL-CodeList-CurrencyCode-Placebo-Demo.xsd
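
To see why elementType=enumeration and attributeName=value pick out the codes in a W3C Schema fragment, consider the sketch below. The embedded schema is a hypothetical stand-in, not the content of CurrencyCode.xsd, and the function names are illustrative. As a simplification, the sketch compares local names only, whereas section 2 describes the stylesheet's elementType= as including any namespace prefix.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for a schema fragment; not the real CurrencyCode.xsd.
SCHEMA = """<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <xsd:simpleType name="ExampleCodeType">
    <xsd:restriction base="xsd:token">
      <xsd:enumeration value="AAA"/>
      <xsd:enumeration value="BBB"/>
    </xsd:restriction>
  </xsd:simpleType>
</xsd:schema>"""


def local_name(tag):
    """Strip the '{namespace-uri}' part ElementTree puts before a tag name."""
    return tag.rsplit("}", 1)[-1]


def enumeration_values(xml_text, element_type="enumeration",
                       attribute_name="value"):
    """Return attribute_name from each element whose local name matches."""
    root = ET.fromstring(xml_text)
    return [element.get(attribute_name)
            for element in root.iter()
            if local_name(element.tag) == element_type
            and element.get(attribute_name) is not None]


print(enumeration_values(SCHEMA))  # ['AAA', 'BBB']
```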

3.2.  Sample cross-platform invocation using Apache Ant

The following Ant build file can be used to run the four examples above in a platform-independent fashion.

<?xml version="1.0" encoding="iso-8859-1"?>
<!DOCTYPE project [
<!ENTITY dateTime "20031026-1910z">
]>
<!--
     Ant task to run test examples of UBL code list synthesis

     $Id: index.xml,v 1.6 2003/10/26 19:18:30 G. Ken Holman Exp $
-->
<project default="make">
  <target name="make">
    <xslt out="output1.xsd" in="extcode2ubl.xsl" style="extcode2ubl.xsl">
      <param name="processDate" expression="&dateTime;"/>
      <param name="externalURL" expression="testelem.xml"/>
      <param name="elementType" expression="elem"/>
    </xslt>
    <xslt out="output2.xsd" in="extcode2ubl.xsl" style="extcode2ubl.xsl">
      <param name="processDate" expression="&dateTime;"/>
      <param name="externalURL" expression="testattr.xml"/>
      <param name="elementType" expression="elem"/>
      <param name="attributeName" expression="attr"/>
    </xslt>
    <xslt out="private-AccountType.xsd" 
          in="extcode2ubl.xsl" style="extcode2ubl.xsl">
      <param name="processDate" expression="&dateTime;"/>
      <param name="externalURL" expression="testattr.xml"/>
      <param name="placeboURL" 
             expression="UBL-CodeList-AccountType-Placebo-Demo.xsd"/>
      <param name="elementType" expression="elem"/>
      <param name="attributeName" expression="attr"/>
    </xslt>
    <xslt out="private-CurrencyCode.xsd" 
          in="extcode2ubl.xsl" style="extcode2ubl.xsl">
      <param name="processDate" expression="&dateTime;"/>
      <param name="externalURL"
             expression="http://www.unece.org/etrades/unedocs/repository/codelists/xml/CurrencyCode.xsd"/>
      <param name="placeboURL" 
             expression="UBL-CodeList-CurrencyCode-Placebo-Demo.xsd"/>
      <param name="elementType" expression="enumeration"/>
      <param name="attributeName" expression="value"/>
    </xslt>
  </target>
</project>