Does .NET speak the full XML language?
Part 2, XSD schemas
Introduction
The W3C (World Wide Web Consortium, http://www.w3.org) published the XML 1.0 specification on February 10, 1998. The XML 1.1 specification followed six years later, on February 4, 2004. In those six years XML has taken the industry by storm and has become the standard way to describe and exchange data. The current development platforms, .NET and J2EE, support XML natively. Modern enterprise applications, whether a SQL Server or Oracle database, a BizTalk Server, an Office suite or any of thousands of other applications, support XML to varying degrees. You will be hard pressed to find an application that does not support or use XML.
There is more to XML than just a way of describing data. Over the years a number of XML-based standards have emerged. The most fundamental ones are XSD (XML Schema Definition), XPath (XML Path Language), XSLT (Extensible Stylesheet Language Transformations), SOAP (Simple Object Access Protocol) and WSDL (Web Services Description Language). All of them build on top of the XML syntax. This is a series of articles. The first three articles explain the fundamentals of XPath queries, XSD schemas and XSLT. The following articles look at how well the .NET framework supports these standards and which namespaces and types matter most. The series is not intended as a comprehensive description of all the .NET types around XML. The goal is rather to provide a good introduction, so that you understand the XML capabilities of the .NET framework and can start leveraging them in your current .NET projects.
The sample XML document for the series of articles
This series of articles assumes that you are familiar with XML itself. The sample XML document used throughout the articles is a list of employees. Each employee must have a first name, last name, phone number and email address, and can optionally have a web address and a job title.
<?xml version="1.0" encoding="utf-8"?>
<Employees xmlns="http://tempuri.org/MySchema.xsd">
  <Employee ID="1">
    <FirstName>Klaus</FirstName>
    <LastName>Salchner</LastName>
    <PhoneNumber>410-727-5112</PhoneNumber>
    <EmailAddress>klaus_salchner@hotmail.com</EmailAddress>
    <WebAddress>http://www.enterprise-minds.com</WebAddress>
    <JobTitle>Sr. Enterprise Architect</JobTitle>
  </Employee>
  <Employee ID="2">
    <FirstName>Peter</FirstName>
    <LastName>Pan</LastName>
    <PhoneNumber>604-111-1111</PhoneNumber>
    <EmailAddress>peter.pan@fiction.com</EmailAddress>
    <JobTitle>Sr. Developer</JobTitle>
  </Employee>
</Employees>
The fundamentals of XSD schemas
It is very easy to create XML documents, whether programmatically or manually through an XML editor like XML Spy, Stylus Studio or Visual Studio .NET 2003. But very often when processing an XML document you want to know that it conforms to a certain structure, the structure your application understands. That is where XSD schemas come into play. XSD schemas are the successor of DTDs (Document Type Definitions), the difference being that XSD itself uses XML syntax. An XSD schema lets you declare the structure of an XML document: which elements and attributes are allowed, whether an element is mandatory or optional, whether an element can occur more than once, and so on. You can then use the XSD schema to validate the XML document, meaning you check whether the document conforms to the structure described by the schema. The XML describes the data and the XSD schema describes the structure of the data. Version 1.0 of the XSD standard was released in May 2001 and can be found at http://www.w3.org/TR/xmlschema-0/, http://www.w3.org/TR/xmlschema-1/ and http://www.w3.org/TR/xmlschema-2/. The working draft of the requirements for XSD 1.1 can be found at http://www.w3.org/TR/2003/WD-xmlschema-11-req-20030121/.
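An XML document can give a validating parser a hint about where to find its schema through the standard xsi:schemaLocation attribute, which holds pairs of a namespace URI and a schema location. The following is a minimal sketch of that idea. The file name MySchema.xsd is an assumption made for illustration, and a parser may ignore the hint and use a schema supplied by the application instead.

```xml
<?xml version="1.0" encoding="utf-8"?>
<!-- xsi:schemaLocation lists pairs of "namespace URI" and "schema location".
     The location MySchema.xsd is a hypothetical file stored next to this document. -->
<Employees xmlns="http://tempuri.org/MySchema.xsd"
           xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
           xsi:schemaLocation="http://tempuri.org/MySchema.xsd MySchema.xsd">
  <Employee ID="1">
    <FirstName>Klaus</FirstName>
    <LastName>Salchner</LastName>
    <PhoneNumber>410-727-5112</PhoneNumber>
    <EmailAddress>klaus_salchner@hotmail.com</EmailAddress>
  </Employee>
</Employees>
```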
When you create an XSD schema you do two things. First, you declare elements and attributes. Declaring means you associate an element or attribute name with a set of constraints, for example that an element with the name FirstName is of the type string and that only one element of that name is allowed. Second, you define new simple or complex types. XSD has a set of standard types like string, boolean, integer, date, etc. The .NET framework maps these XSD data types to its own .NET data types. In our sample XML document the Employee is a complex type. Think in terms of data structures. In your application code you would define a new structure called Employee, and it would contain the elements FirstName, LastName, PhoneNumber, EmailAddress, WebAddress and JobTitle. In XSD schemas you do exactly the same. You define a complex type, here named EmployeeType, and then declare all the elements this type has, plus the constraints for each element, for example that the FirstName element is of the type string. Here is the XSD schema for our sample XML document:
<?xml version="1.0"?>
<xs:schema targetNamespace="http://tempuri.org/MySchema.xsd"
           xmlns="http://tempuri.org/MySchema.xsd"
           xmlns:xs="http://www.w3.org/2001/XMLSchema"
           attributeFormDefault="unqualified"
           elementFormDefault="qualified">
  <!-- elementFormDefault="qualified" places locally declared elements such as
       Employee and FirstName in the target namespace, which matches the default
       namespace used by the sample XML document. -->
  <xs:element name="Employees">
    <xs:complexType>
      <xs:choice minOccurs="1" maxOccurs="unbounded">
        <xs:element name="Employee" type="EmployeeType"/>
      </xs:choice>
    </xs:complexType>
  </xs:element>
  <xs:complexType name="EmployeeType">
    <xs:sequence>
      <xs:element name="FirstName" type="xs:string" minOccurs="1" maxOccurs="1"/>
      <xs:element name="LastName" type="xs:string" minOccurs="1" maxOccurs="1"/>
      <xs:element name="PhoneNumber" type="xs:string" minOccurs="1" maxOccurs="1"/>
      <xs:element name="EmailAddress" type="xs:string" minOccurs="1" maxOccurs="1"/>
      <xs:element name="WebAddress" type="xs:string" minOccurs="0" maxOccurs="1"/>
      <xs:element name="JobTitle" type="xs:string" minOccurs="0" maxOccurs="1"/>
    </xs:sequence>
    <xs:attribute name="ID" form="unqualified" type="xs:string"/>
  </xs:complexType>
</xs:schema>
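To see what these constraints mean in practice, consider the following hypothetical document, made up for illustration. Assuming the schema accepts the namespace-qualified elements used in the sample document shown earlier, a validating parser would be expected to reject this document for the two reasons noted in the comments.

```xml
<?xml version="1.0" encoding="utf-8"?>
<!-- Hypothetical document, made up to illustrate validation failures. -->
<Employees xmlns="http://tempuri.org/MySchema.xsd">
  <Employee ID="3">
    <FirstName>Wendy</FirstName>
    <LastName>Darling</LastName>
    <PhoneNumber>604-222-2222</PhoneNumber>
    <!-- EmailAddress is missing, but the schema declares it with minOccurs="1". -->
    <!-- JobTitle comes before WebAddress, which breaks the order required by xs:sequence. -->
    <JobTitle>Developer</JobTitle>
    <WebAddress>http://www.example.com</WebAddress>
  </Employee>
</Employees>
```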
Let's first look at the XSD elements, meaning the XML elements you use in your XSD schema, that declare an element or attribute. The W3C provides an XSD schema which describes all the valid XSD element and attribute names. It can be found at http://www.w3.org/2001/XMLSchema.xsd.
| XSD element or attribute | Description |
| --- | --- |
| element | Used to declare an element. Can have any of the attributes listed below to describe the element you are declaring. |
| attribute | Used to declare an attribute. Can have any of the attributes listed below to describe the attribute you are declaring, unless otherwise specified. |
| name (attribute) | Specifies the name of the XML element or attribute. |
| type (attribute) | Specifies the type of the XML element or attribute. XSD comes with a number of simple data types like string, integer, date, etc. Each .NET data type can be mapped to an XSD data type. Refer to your MSDN library for a complete list of the XSD types (search for "XML Data Types Reference", including the double quotes so that it searches for the whole term and not just the individual words). |
| minOccurs (attribute) | Describes the minimum number of occurrences of the element (not allowed for attributes). A value of zero means that you can omit the element. Any other value means the element must appear at least that many times, for example once. This allows you to make elements mandatory. If omitted, the value defaults to 1. |
| maxOccurs (attribute) | Describes the maximum number of occurrences of the element (not allowed for attributes). Setting this value to zero un-declares the element, meaning no element of this name is allowed. Setting it to "unbounded" means an unlimited number of elements is allowed. Any other value means the element must not appear more often than specified. If omitted, the value defaults to 1. |
| default (attribute) | Specifies the default value of the element or attribute. This can only be used for simple data types or text-only data types. The "default" and "fixed" attributes are mutually exclusive. |
| fixed (attribute) | Specifies the predetermined and unchangeable value of an element or attribute. This can only be used for simple data types or text-only data types. The "default" and "fixed" attributes are mutually exclusive. |
| ref (attribute) | References a global element or attribute declared elsewhere in this or any other referenced XSD schema. This allows you to declare another instance of that element or attribute under a complex type without repeating its constraints (the name, type, and so on). Only global elements and attributes can be referenced, not ones declared inside another complex type. |
| form (attribute) | If set to "unqualified", a locally declared element or attribute belongs to no namespace and must appear without namespace qualification in the XML document. If set to "qualified", it belongs to the target namespace of the schema and must be namespace-qualified in the XML document (for elements, a default namespace declaration also counts). If not specified, the default from the schema element applies (elementFormDefault and attributeFormDefault). |
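The default, fixed and ref attributes are easier to follow in context. The schema below is a minimal sketch written for this purpose. The namespace URI and all element and attribute names in it (Order, Currency, FormatVersion, Comment, Status) are assumptions made for illustration and are not part of the sample schema.

```xml
<?xml version="1.0"?>
<!-- Hypothetical schema; the namespace URI and all names are made up for illustration. -->
<xs:schema targetNamespace="http://tempuri.org/Orders.xsd"
           xmlns="http://tempuri.org/Orders.xsd"
           xmlns:xs="http://www.w3.org/2001/XMLSchema"
           elementFormDefault="qualified">

  <!-- Global element declaration that other declarations can reference. -->
  <xs:element name="Comment" type="xs:string"/>

  <xs:element name="Order">
    <xs:complexType>
      <xs:sequence>
        <!-- default: an empty Currency element is treated as if it contained USD. -->
        <xs:element name="Currency" type="xs:string" default="USD"/>
        <!-- fixed: if the element has content, it must be exactly 1.0. -->
        <xs:element name="FormatVersion" type="xs:string" fixed="1.0"/>
        <!-- ref: reuses the global Comment declaration; the occurrence
             constraints are set here, on the reference. -->
        <xs:element ref="Comment" minOccurs="0" maxOccurs="unbounded"/>
      </xs:sequence>
      <!-- For attributes, the default applies when the attribute is omitted. -->
      <xs:attribute name="Status" type="xs:string" default="open"/>
    </xs:complexType>
  </xs:element>
</xs:schema>
```

Note that the name and type are stated only once, on the global Comment declaration, while minOccurs and maxOccurs belong to the place where the element is used.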
This is not a complete list, but these are the main XSD elements and attributes you use to declare elements or attributes. Refer to the XSD standard for a complete reference. Now let's look at the XSD elements you use to define new types. You can define simple types and complex types. A simple type takes a base type and applies some restrictions to it.
| XSD element | Description |
| --- | --- |
| simpleType | Defines a simple type, which takes a base type and applies additional restrictions to it. A simple type cannot declare any elements or attributes. It can be named and referenced through a type attribute, or nested without a name inside the element or attribute declaration that uses it. See the examples below the table. |
| restriction | Defines a restriction within a simple type. The base attribute specifies the base type this simple type is based on, for example positiveInteger. The simple type then inherits all the restrictions of that base type by default. See the examples below the table. |
| maxInclusive | The maximum value the type allows, including the value you specify. This translates to "less than or equal to". |
| maxExclusive | The maximum value the type allows, excluding the value you specify. This translates to "less than". |
| minInclusive | The minimum value the type allows, including the value you specify. This translates to "greater than or equal to". |
| minExclusive | The minimum value the type allows, excluding the value you specify. This translates to "greater than". |
| maxLength | The maximum length of the type (less than or equal to). |
| minLength | The minimum length of the type (greater than or equal to). |

Here is an example of a named simple type:
<xs:element name="MyValue" type="MyInteger"/>
<xs:simpleType name="MyInteger">
  <xs:restriction base="xs:positiveInteger">
    <xs:minInclusive value="1"/>
    <xs:maxInclusive value="10"/>
  </xs:restriction>
</xs:simpleType>
It declares a new element named MyValue which is of the type MyInteger. It then defines the new type MyInteger, which uses positiveInteger as its base type and restricts its values to between one and ten (inclusive). If you don't want to define a new named type, you can nest the definition within the element you declare:
<xs:element name="MyValue">
  <xs:simpleType>
    <xs:restriction base="xs:positiveInteger">
      <xs:minInclusive value="1"/>
      <xs:maxInclusive value="10"/>
    </xs:restriction>
  </xs:simpleType>
</xs:element>
In this case you don't specify a type attribute. Instead, the simple type is defined nested within the element. The simple type also does not get a name attribute, so it can't be used for any other element. The same applies to attributes you declare.
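The length and exclusive-bound restrictions from the table work the same way as minInclusive and maxInclusive. The fragment below is a small sketch, assuming it is placed inside an xs:schema element that declares the xs prefix. The type names JobTitleType and PercentageType are hypothetical and chosen only for illustration.

```xml
<!-- Hypothetical types for illustration; they are not part of the sample schema. -->

<!-- A string that must contain between 1 and 50 characters. -->
<xs:simpleType name="JobTitleType">
  <xs:restriction base="xs:string">
    <xs:minLength value="1"/>
    <xs:maxLength value="50"/>
  </xs:restriction>
</xs:simpleType>

<!-- A decimal that is greater than or equal to 0 and less than 100. -->
<xs:simpleType name="PercentageType">
  <xs:restriction base="xs:decimal">
    <xs:minInclusive value="0"/>
    <xs:maxExclusive value="100"/>
  </xs:restriction>
</xs:simpleType>
```

An element declaration would then reference such a type by name, for example with type="JobTitleType", in the same way the MyValue element references MyInteger.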
The restriction element accepts a number of restricting elements. The list above shows the most common XSD elements for defining simple types. For a complete list please refer to the XSD standard. A complex type defines a new type which has elements and attributes declared in it.
| XSD element | Description |
| --- | --- |
| complexType | Defines a complex type, which can declare a number of elements and attributes. This is comparable to a data structure in a traditional programming language. A complex type can be named and referenced through a type attribute, or nested without a name inside the element declaration that uses it. Attributes can never be of a complex type. See the examples below the table. |
| sequence | Any element you declare within a complex type needs to be within a "sequence", "all" or "choice" block. The "sequence" element specifies that the elements must appear in the XML document in the specified order. It can have minOccurs and maxOccurs attributes, which define how often the sequence can be present. The attributes you define for a complex type go outside the "sequence" block. The "sequence", "all" and "choice" elements are mutually exclusive as the top-level block of a complex type. |
| all | Declares a list of elements which can appear in any order within the complex type. In XSD 1.0 the "all" element accepts only 0 or 1 for minOccurs and only 1 for maxOccurs, so the list as a whole is either optional or mandatory and cannot repeat. The attributes you define for a complex type go outside the "all" block. |
| choice | Specifies that only one of the declared elements is allowed within the complex type. The "choice" element can have minOccurs and maxOccurs attributes, which define how often the "choice" block can be present. The attributes you define for a complex type go outside the "choice" block. |

Here is an example of a named complex type that uses a sequence:
<xs:element name="Address" type="AddressType"/>
<xs:complexType name="AddressType">
  <xs:sequence>
    <xs:element name="Country" type="xs:string"/>
    <xs:element name="State" type="xs:string"/>
    <xs:element name="ZIP" type="xs:string"/>
    <xs:element name="Address1" type="xs:string"/>
  </xs:sequence>
  <xs:attribute name="ID" type="xs:positiveInteger"/>
  <xs:attribute name="Zone" type="xs:string"/>
</xs:complexType>
It declares a new element named Address of the type AddressType. It then defines the type AddressType as a type with four elements (Country, State, ZIP and Address1) and two attributes (ID and Zone). As with simple types, you can nest the type definition within the element declaration:
<xs:element name="Address">
  <xs:complexType>
    <xs:sequence>
      <xs:element name="Country" type="xs:string"/>
      <xs:element name="State" type="xs:string"/>
      <xs:element name="ZIP" type="xs:string"/>
      <xs:element name="Address1" type="xs:string"/>
    </xs:sequence>
    <xs:attribute name="ID" type="xs:positiveInteger"/>
    <xs:attribute name="Zone" type="xs:string"/>
  </xs:complexType>
</xs:element>
In this case you don't specify a type attribute. Instead, the complex type is defined nested within the element. The complex type also does not get a name attribute, so it can't be used for any other element.
Here is an example of the "all" element:
<xs:complexType name="AddressType">
  <xs:all>
    <xs:element name="Country" type="xs:string"/>
    <xs:element name="State" type="xs:string"/>
  </xs:all>
</xs:complexType>
This means that the type AddressType has a Country and a State element, which can appear in any order.
Here is an example of the "choice" element:
<xs:complexType name="StateProvinceType">
  <xs:choice>
    <xs:element name="State" type="xs:string"/>
    <xs:element name="Province" type="xs:string"/>
  </xs:choice>
</xs:complexType>
This means that the StateProvinceType is only allowed to have a State or a Province element, but not both.
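A complex type has only one top-level block, but XSD allows "sequence" and "choice" blocks to be nested inside each other, and occurrence constraints can be combined with them. The fragment below is a sketch of that combination, assuming it is placed inside an xs:schema element that declares the xs prefix. The type name MailingAddressType and the element name AddressLine are hypothetical and used only for illustration.

```xml
<!-- Hypothetical type for illustration; it is not part of the sample schema. -->
<xs:complexType name="MailingAddressType">
  <xs:sequence>
    <xs:element name="Country" type="xs:string"/>
    <!-- Exactly one of State or Province must follow Country. -->
    <xs:choice>
      <xs:element name="State" type="xs:string"/>
      <xs:element name="Province" type="xs:string"/>
    </xs:choice>
    <!-- Between one and three address lines, after the State or Province. -->
    <xs:element name="AddressLine" type="xs:string" minOccurs="1" maxOccurs="3"/>
  </xs:sequence>
  <!-- Attributes are declared outside the sequence block. -->
  <xs:attribute name="ID" type="xs:positiveInteger"/>
</xs:complexType>
```

A document using this type would list Country first, then either State or Province, then one to three AddressLine elements, always in that order.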