XML
What is XML?
XML stands for Extensible Markup Language.
XML is a markup language much like HTML.
XML was designed to describe data.
XML tags are not predefined in XML. You must define your own tags.
XML is self describing.
XML can use a DTD (Document Type Definition) to formally describe the data.
The main difference between XML and HTML
XML is not a replacement for HTML.
XML and HTML were designed with different goals:
XML was designed to describe data and to focus on what data is.
HTML was designed to display data and to focus on how data looks.
HTML is about displaying information; XML is about describing information.
XML is extensible
The tags used to mark up HTML documents and the structure of HTML documents are predefined. The author of an HTML document can only use tags that are defined in the HTML standard.
XML allows authors to define their own tags and their own document structure.
XML is a complement to HTML
It is important to understand that XML is not a replacement for HTML. In future Web development, XML will most likely be used to structure and describe Web data, while HTML will be used to format and display the same data.
XML in future Web development
We have been participating in XML development since its creation. It has been amazing to see how quickly the XML standard has been developed, and how quickly a large number of software vendors have adopted the standard.
We strongly believe that XML will be as important to the future of the Web as HTML has been to the foundation of the Web. XML is the future for all data transmission and data manipulation over the Web.
How can XML be used?
XML can keep data separated from your HTML
XML can be used to store data inside HTML documents
XML can be used as a format to exchange information
XML can be used to store data in files or in databases
XML can keep data separated from your HTML
HTML pages are used to display data. Data is often stored inside HTML pages. With XML this data can now be stored in a separate XML file. This way you can concentrate on using HTML for formatting and display, and be sure that changes in the underlying data will not force changes to any of your HTML code.
XML can also store data inside HTML documents
XML data can also be stored inside HTML pages as "Data Islands". You can still concentrate on using HTML for formatting and displaying the data.
XML can be used to exchange data
In the real world, computer systems and databases contain data in incompatible formats. One of the most time-consuming challenges for developers has been exchanging data between such systems over the Internet. Converting the data to XML can greatly reduce this complexity and create data that can be read by different types of applications.
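As a rough sketch of this kind of exchange, the Python below serializes a record to XML on one side and reads it back on the other. It uses the standard library module xml.etree.ElementTree. The function names record_to_xml and xml_to_record and the sample record are made up for illustration, and the sketch assumes a flat record whose keys are valid XML element names.

```python
import xml.etree.ElementTree as ET


def record_to_xml(record: dict) -> str:
    """Serialize a flat dict as a <note> element with one child per key."""
    note = ET.Element("note")
    for key, value in record.items():
        child = ET.SubElement(note, key)
        child.text = str(value)
    return ET.tostring(note, encoding="unicode")


def xml_to_record(text: str) -> dict:
    """Read the child elements of the root element back into a dict."""
    root = ET.fromstring(text)
    return {child.tag: child.text for child in root}


# Hypothetical sending side: turn an in-memory record into XML text.
sender_record = {"to": "Tove", "from": "Jani", "heading": "Reminder"}
wire_format = record_to_xml(sender_record)

# Hypothetical receiving side: needs only an XML parser, not the sender's code.
receiver_record = xml_to_record(wire_format)
```

The receiving system only has to agree on the element names, not on the sender's internal storage format.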
XML can be used to store data
XML can also be used to store data in files or in databases. Applications can be written to store and retrieve information from the store, and generic applications can be used to display the data.
XML Syntax
An example XML document:
<?xml version="1.0"?>
<note>
  <to>Tove</to>
  <from>Jani</from>
  <heading>Reminder</heading>
  <body>Don't forget me this weekend!</body>
</note>
The first line of the document is the XML declaration, which should always be included. It defines the XML version of the document. In this case the document conforms to the 1.0 specification of XML:
<?xml version="1.0"?>
The next line defines the first element of the document (the root element):
<note>
The next lines define four child elements of the root (to, from, heading, and body):
<to>Tove</to>
<from>Jani</from>
<heading>Reminder</heading>
<body>Don't forget me this weekend!</body>
The last line defines the end of the root element:
</note>
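The same structure can be seen from a program's point of view. The following is a minimal sketch using Python's standard xml.etree.ElementTree module. The variable names are illustrative only.

```python
import xml.etree.ElementTree as ET

NOTE = """<?xml version="1.0"?>
<note>
  <to>Tove</to>
  <from>Jani</from>
  <heading>Reminder</heading>
  <body>Don't forget me this weekend!</body>
</note>"""

# Parsing returns the root element of the document.
root = ET.fromstring(NOTE)

# Iterating over an element yields its child elements in document order.
child_tags = [child.tag for child in root]

# findtext returns the text content of the first matching child element.
body_text = root.findtext("body")

print(root.tag, child_tags)
```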
All XML elements must have a closing tag
In HTML some elements do not have to have a closing tag. The following code is legal in HTML:
<p>This is a paragraph
<p>This is another paragraph
In XML all elements must have a closing tag like this:
<p>This is a paragraph</p>
<p>This is another paragraph</p>
XML tags are case sensitive
XML tags are case sensitive. The tag <Letter> is different from the tag <letter>. Opening and closing tags must therefore be written with the same case:
<Message>This is incorrect</message>
<message>This is correct</message>
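A parser enforces this rule. The sketch below feeds both lines to Python's standard xml.etree.ElementTree parser. The helper name parse_error is made up for illustration, and the exact wording of the error message depends on the underlying parser.

```python
import xml.etree.ElementTree as ET


def parse_error(text):
    """Return the parser's error message, or None if the text parses."""
    try:
        ET.fromstring(text)
    except ET.ParseError as exc:
        return str(exc)
    return None


# The closing tag differs from the opening tag only by case, so it does not match.
mismatched = parse_error("<Message>This is incorrect</message>")
matched = parse_error("<message>This is correct</message>")

# Element names that differ only by case are different names.
same_name = ET.fromstring("<Letter/>").tag == ET.fromstring("<letter/>").tag

print(mismatched)
```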
All XML elements must be properly nested
In HTML some elements can be improperly nested within each other like this:
<b><i>This text is bold and italic</b></i>
In XML all elements must be properly nested within each other like this:
<b><i>This text is bold and italic</i></b>
All XML documents must have a root tag
All XML documents must contain a single tag pair to define the root element. All other elements must be nested within the root element. All elements can have child elements (sub elements). Child elements must be in pairs and correctly nested within their parent element:
<root>
  <child>
    <subchild>
    </subchild>
  </child>
</root>
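The closing-tag, nesting, and root-element rules above can be checked mechanically. This is a small sketch, assuming Python's standard xml.etree.ElementTree parser as the checker. The helper is_well_formed and the sample labels are illustrative names, and the extra <doc> wrapper in the first sample is added only so that the unclosed <p> tags are the sole problem.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text: str) -> bool:
    """Return True if the parser accepts the text as a complete XML document."""
    try:
        ET.fromstring(text)
    except ET.ParseError:
        return False
    return True


samples = {
    # <p> elements are never closed before </doc>.
    "unclosed": "<doc><p>This is a paragraph<p>This is another paragraph</doc>",
    # </b> arrives while <i> is still open.
    "bad nesting": "<b><i>This text is bold and italic</b></i>",
    # Two top-level elements, so there is no single root.
    "two roots": "<note></note><note></note>",
    # Every element is closed, properly nested, and inside one root.
    "ok": "<root><child><subchild></subchild></child></root>",
}

results = {name: is_well_formed(text) for name, text in samples.items()}
print(results)
```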
Attribute values must always be quoted
XML elements can have attributes in name/value pairs just like in HTML. In XML the attribute value must always be quoted. Study the two XML documents below. The first one is incorrect, the second is correct:
<?xml version="1.0"?>
<note date=12/11/99>
  <to>Tove</to>
  <from>Jani</from>
  <heading>Reminder</heading>
  <body>Don't forget me this weekend!</body>
</note>

<?xml version="1.0"?>
<note date="12/11/99">
  <to>Tove</to>
  <from>Jani</from>
  <heading>Reminder</heading>
  <body>Don't forget me this weekend!</body>
</note>
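To see the difference in practice, the sketch below parses a shortened version of each document with Python's standard xml.etree.ElementTree module. The shortened documents and variable names are for illustration only.

```python
import xml.etree.ElementTree as ET

unquoted = "<note date=12/11/99><to>Tove</to></note>"
quoted = '<note date="12/11/99"><to>Tove</to></note>'

# The unquoted attribute value makes the document not well formed.
try:
    ET.fromstring(unquoted)
    unquoted_ok = True
except ET.ParseError:
    unquoted_ok = False

# With quotes, the value is available through the element's get() method.
note = ET.fromstring(quoted)
date_value = note.get("date")

print(unquoted_ok, date_value)
```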
XML Attributes
XML attributes are normally used to describe XML elements, or to provide additional information about elements. You may remember this construct from HTML: <IMG SRC="computer.gif">. In this HTML example SRC is an attribute of the IMG element. The SRC attribute provides additional information about the element.
Attributes are always contained within the start tag of an element. Here are some examples:
HTML examples:
<img src="computer.gif">
<a href="demo.asp">
XML examples:
<file type="gif">
<person id="3344">
Usually, attributes are used to provide information that is not a part of the content of the XML document. Did you understand that? Here is another way to express it: attribute data is often more important to the XML parser than to the reader. Did you understand it now? Anyway, in the example above, the person id is a counter value that is irrelevant to the reader, but important to software that wants to manipulate the person element.
Use of Elements vs. Attributes
Take a look at these examples:
Using an Attribute for sex:
<person sex="female">
  <firstname>Anna</firstname>
  <lastname>Smith</lastname>
</person>
Using an Element for sex:
<person>
  <sex>female</sex>
  <firstname>Anna</firstname>
  <lastname>Smith</lastname>
</person>
In the first example sex is an attribute. In the last example sex is an element. Both examples provide the same information to the reader.
There are no fixed rules about when to use attributes to describe data, and when to use elements. My experience, however, is that attributes are handy in HTML, but in XML you should try to avoid them, as long as the same information can be expressed using elements.
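For software the two forms differ only in how the value is reached. This is a minimal sketch using Python's standard xml.etree.ElementTree module, with illustrative variable names.

```python
import xml.etree.ElementTree as ET

with_attribute = ET.fromstring(
    '<person sex="female">'
    "<firstname>Anna</firstname><lastname>Smith</lastname>"
    "</person>"
)
with_element = ET.fromstring(
    "<person><sex>female</sex>"
    "<firstname>Anna</firstname><lastname>Smith</lastname>"
    "</person>"
)

# An attribute is read from the element that carries it.
sex_from_attribute = with_attribute.get("sex")

# An element is read by looking up the child and taking its text.
sex_from_element = with_element.findtext("sex")

print(sex_from_attribute, sex_from_element)
```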
Here is another example, demonstrating how elements can be used instead of attributes. The following three XML documents contain exactly the same information. A date attribute is used in the first, a date element is used in the second, and an expanded date element is used in the third:
<?xml version="1.0"?>
<note date="12/11/99">
  <to>Tove</to>
  <from>Jani</from>
  <heading>Reminder</heading>
  <body>Don't forget me this weekend!</body>
</note>

<?xml version="1.0"?>
<note>
  <date>12/11/99</date>
  <to>Tove</to>
  <from>Jani</from>
  <heading>Reminder</heading>
  <body>Don't forget me this weekend!</body>
</note>

<?xml version="1.0"?>
<note>
  <date>
    <day>12</day>
    <month>11</month>
    <year>99</year>
  </date>
  <to>Tove</to>
  <from>Jani</from>
  <heading>Reminder</heading>
  <body>Don't forget me this weekend!</body>
</note>
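One practical difference between the first and third forms is how much the reading program has to know. In this sketch (Python standard library, shortened documents, illustrative names) the attribute form needs an extra string-splitting step that assumes day/month/year order, while the expanded form names each part.

```python
import xml.etree.ElementTree as ET

flat = ET.fromstring('<note date="12/11/99"/>')
expanded = ET.fromstring(
    "<note><date><day>12</day><month>11</month><year>99</year></date></note>"
)

# Attribute form: the program must split the string and assume the field order.
day_s, month_s, year_s = flat.get("date").split("/")
from_attribute = (int(day_s), int(month_s), int(year_s))

# Expanded element form: each part is looked up by name.
date = expanded.find("date")
from_elements = tuple(int(date.findtext(part)) for part in ("day", "month", "year"))

print(from_attribute, from_elements)
```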
Avoid using attributes? (I say yes!)
Why should you avoid using attributes? Should you just take my word for it? These are some of the problems with using attributes:
attributes cannot contain multiple values (elements can)
attributes are not expandable (for future changes)
attributes cannot describe structures (like child elements can)
attributes are more difficult to manipulate by program code
attribute values are not easy to test against a DTD
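The first point can be demonstrated with a parser. XML does not allow the same attribute name to appear twice on one element, whereas an element can be repeated. The sketch below uses Python's standard xml.etree.ElementTree module. The two-recipient note is a made-up example.

```python
import xml.etree.ElementTree as ET

repeated_attribute = '<note to="Tove" to="Jani"/>'
repeated_element = "<note><to>Tove</to><to>Jani</to></note>"

# A duplicated attribute name is a well-formedness error.
try:
    ET.fromstring(repeated_attribute)
    attribute_form_ok = True
except ET.ParseError:
    attribute_form_ok = False

# Repeating a child element is allowed, so several values can be listed.
note = ET.fromstring(repeated_element)
recipients = [to.text for to in note.findall("to")]

print(attribute_form_ok, recipients)
```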
If you start using attributes as containers for XML data, you might end up with documents that are both difficult to maintain and to manipulate. What I'm trying to say is that you should use elements to describe your data. Use attributes only to provide information that is not relevant to the reader. Please don't end up like this:
<?xml version="1.0"?>
<note day="12" month="11" year="99"
      to="Tove" from="Jani" heading="Reminder"
      body="Don't forget me this weekend!">
</note>
This doesn't look much like XML. Got the point?
An Exception to my Attribute rule
Rules always have exceptions. My rule about not using attributes has one too:
Sometimes I assign ID references to elements in my XML documents. These ID references can be used to access XML elements in much the same way as the NAME or ID attributes in HTML. This example demonstrates this:
<?xml version="1.0"?>
<messages>
  <note ID="501">
    <to>Tove</to>
    <from>Jani</from>
    <heading>Reminder</heading>
    <body>Don't forget me this weekend!</body>
  </note>
  <note ID="502">
    <to>Jani</to>
    <from>Tove</from>
    <heading>Re: Reminder</heading>
    <body>I will not!</body>
  </note>
</messages>
The ID in these examples is just a counter, or a unique identifier, to identify the different notes in the XML file.
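Here is a short sketch of how software might use such an ID to reach a particular note, using Python's standard xml.etree.ElementTree module and its limited XPath support. The variable names are illustrative.

```python
import xml.etree.ElementTree as ET

MESSAGES = """<?xml version="1.0"?>
<messages>
  <note ID="501">
    <to>Tove</to>
    <from>Jani</from>
    <heading>Reminder</heading>
    <body>Don't forget me this weekend!</body>
  </note>
  <note ID="502">
    <to>Jani</to>
    <from>Tove</from>
    <heading>Re: Reminder</heading>
    <body>I will not!</body>
  </note>
</messages>"""

messages = ET.fromstring(MESSAGES)

# Select the note whose ID attribute equals 502.
reply = messages.find("note[@ID='502']")
reply_heading = reply.findtext("heading")

# Alternatively, build an index from ID to element for repeated lookups.
notes_by_id = {note.get("ID"): note for note in messages.findall("note")}
first_body = notes_by_id["501"].findtext("body")

print(reply_heading, first_body)
```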
XML Validation
"Well Formed" XML documents
A "Well Formed" XML document is a document that conforms to the XML syntax rules that we described in the previous chapter.
The following is a "Well Formed" XML document:
<?xml version="1.0"?>
<note>
  <to>Tove</to>
  <from>Jani</from>
  <heading>Reminder</heading>
  <body>Don't forget me this weekend!</body>
</note>
"Valid" XML documents
A "Valid" XML document is a "Well Formed" XML document which conforms to the rules of a Document Type Definition (DTD).
The following is the same document as above but with an added reference to a DTD:
<?xml version="1.0"?>
<!DOCTYPE note SYSTEM "InternalNote.dtd">
<note>
  <to>Tove</to>
  <from>Jani</from>
  <heading>Reminder</heading>
  <body>Don't forget me this weekend!</body>
</note>
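The difference between "Well Formed" and "Valid" can be illustrated with a rough sketch. Python's standard xml.etree.ElementTree parser checks well-formedness only and does not validate against a DTD, so the second check below is a hand-written stand-in, not real DTD validation. The rule it applies (a note contains exactly to, from, heading, and body, in that order) is a hypothetical content model chosen for illustration, since the contents of InternalNote.dtd are not shown here.

```python
import xml.etree.ElementTree as ET

# Hypothetical content model for <note>; real rules would come from the DTD.
EXPECTED_CHILDREN = ["to", "from", "heading", "body"]


def is_well_formed(text: str) -> bool:
    """True if the text obeys the XML syntax rules."""
    try:
        ET.fromstring(text)
    except ET.ParseError:
        return False
    return True


def matches_note_model(text: str) -> bool:
    """Stand-in for validation: compare the children with the assumed model."""
    root = ET.fromstring(text)
    return root.tag == "note" and [child.tag for child in root] == EXPECTED_CHILDREN


complete = (
    "<note><to>Tove</to><from>Jani</from><heading>Reminder</heading>"
    "<body>Don't forget me this weekend!</body></note>"
)
missing_body = "<note><to>Tove</to><from>Jani</from><heading>Reminder</heading></note>"

# Both documents are well formed, but only the first satisfies the assumed model.
report = {
    "complete": (is_well_formed(complete), matches_note_model(complete)),
    "missing_body": (is_well_formed(missing_body), matches_note_model(missing_body)),
}
print(report)
```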
XML in Netscape and in Explorer
XML on this Web site
Many applications support XML in a number of ways. On this Web site we focus on the XML support in Internet Explorer 5.0. Some visitors have complained about this, but we do not do it because IE5 is the only performer in the XML field. We do it because it is the only practical way to demonstrate XML to a large audience over the Web.
So - while we are waiting for Netscape - most of our software examples will work only with IE5. If you want to learn XML the easy way - with lots of examples for you to try out - you will have to live with that.
XML in Netscape Navigator 5
Netscape has promised full XML support in its new Navigator 5 browser. We hope that this will include support for the W3C XML standard, just as Internet Explorer 5 does.
Based on previous experience we can only hope that Navigator and Explorer will be fully compatible in their future XML support.
Your option at the moment - if you want to work with Netscape and XML - is to work with XML on your server and transform your XML to HTML before it is sent to the browser. You can read more about transforming XML to HTML in the chapters about XSL.
XML in Internet Explorer 5
Internet Explorer 5 fully supports the international standards for both XML 1.0 and the XML Document Object Model (DOM). These standards are set by the World Wide Web Consortium (W3C).
You can download IE5 from http://www.microsoft.com/windows/ie/
Internet Explorer 5.0 has the following XML support:
Viewing of XML documents
Full support for W3C DTD standards
XML embedded in HTML as Data Islands
Binding XML data to HTML elements
Formatting XML with XSL
Formatting XML with CSS
Support for CSS Behaviors
Access to the XML DOM
Viewing XML with Internet Explorer 5
You can use IE5 to view an XML document just as you view any HTML page. There are several ways to open an XML document. You can click on a link, type the URL into the address bar, double-click on an XML document in a folder, and so on.
If you point IE5 to an XML document, IE5 will display the document with its root element and child elements expanded. A plus (+) or minus sign (-) to the left of the XML elements can be clicked to expand or collapse the element structure, and if you only want to view the raw XML source, you can select "View Source" from the browser menu.
EXAMPLE:
CD Catalog:
<?xml version="1.0" encoding="ISO-8859-1" ?>
- <CATALOG>
- <CD>
<TITLE>Empire Burlesque</TITLE>
<ARTIST>Bob Dylan</ARTIST>
<COUNTRY>USA</COUNTRY>
<COMPANY>Columbia</COMPANY>
<PRICE>10.90</PRICE>
<YEAR>1985</YEAR>
</CD>
- <CD>
<TITLE>Hide your heart</TITLE>
<ARTIST>Bonnie Tyler</ARTIST>
<COUNTRY>UK</COUNTRY>
<COMPANY>CBS Records</COMPANY>
<PRICE>9.90</PRICE>
<YEAR>1988</YEAR>
</CD>
- <CD>
<TITLE>Greatest Hits</TITLE>
<ARTIST>Dolly Parton</ARTIST>
<COUNTRY>USA</COUNTRY>
<COMPANY>RCA</COMPANY>
<PRICE>9.90</PRICE>
<YEAR>1982</YEAR>
</CD>
+ <CD>
<TITLE>Still got the blues</TITLE>
<ARTIST>Gary Moore</ARTIST>
<COUNTRY>UK</COUNTRY>
<COMPANY>Virgin Records</COMPANY>
<PRICE>10.20</PRICE>
<YEAR>1990</YEAR>
</CD>
- <CD>
<TITLE>Eros</TITLE>
<ARTIST>Eros Ramazzotti</ARTIST>
<COUNTRY>EU</COUNTRY>
<COMPANY>BMG</COMPANY>
<PRICE>9.90</PRICE>
<YEAR>1997</YEAR>
</CD>
</CATALOG>
CSS STYLE SHEET FOR THIS FILE
CATALOG
{
background-color: #ffffff;
width: 100%;
}
CD
{
display: block;
margin-bottom: 30pt;
margin-left: 0;
}
TITLE
{
color: #FF0000;
font-size: 20pt;
}
ARTIST
{
color: #0000FF;
font-size: 20pt;
}
COUNTRY,PRICE,YEAR,COMPANY
{
display: block;
color: #000000;
margin-left: 20pt;
}
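A browser is not the only consumer of the CD catalog. As a hedged sketch, the Python below reads a shortened two-entry copy of the catalog with the standard xml.etree.ElementTree module and pulls out some values. The shortened data and the variable names are for illustration only.

```python
import xml.etree.ElementTree as ET

CATALOG = """<?xml version="1.0"?>
<CATALOG>
  <CD>
    <TITLE>Empire Burlesque</TITLE>
    <ARTIST>Bob Dylan</ARTIST>
    <COUNTRY>USA</COUNTRY>
    <PRICE>10.90</PRICE>
  </CD>
  <CD>
    <TITLE>Hide your heart</TITLE>
    <ARTIST>Bonnie Tyler</ARTIST>
    <COUNTRY>UK</COUNTRY>
    <PRICE>9.90</PRICE>
  </CD>
</CATALOG>"""

catalog = ET.fromstring(CATALOG)

# Collect every title in document order.
titles = [cd.findtext("TITLE") for cd in catalog.findall("CD")]

# Keep only the titles of CDs whose COUNTRY element is UK.
uk_titles = [
    cd.findtext("TITLE")
    for cd in catalog.findall("CD")
    if cd.findtext("COUNTRY") == "UK"
]

print(titles, uk_titles)
```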