Using XML in Scripting Can Help You Make Your Dreams Come True
This article is intended to be a fast primer on XML for anyone who hasn’t had the time to investigate its capabilities. XML is a text-based file format that allows you to define your own structure within the conventions laid out by the XML format. What follows will hopefully show you the benefits of using XML and how you can use it to manipulate data with minimum effort. How you then apply this to your technical solutions inside and outside of XSI is up to you.
The first part of this article looks at the ideas and concepts behind XML data. The second half will show how easy it is to put it into practice in scripting and C++.
Table of Contents
- XML in Scripting – Making sure you have the latest version
- XML in Scripting – XML Basics
- XML in Scripting – Transforming XML with XSL Transforms
XML in Scripting – Making sure you have the latest version
Before we go on, you should make sure that you have the latest version of Microsoft’s XML support installed. Note that MSXML 4.0 sits on top of the previous versions of MSXML, and so it shouldn’t cause any conflicts with any other software.
To obtain the latest version, I’ve created a web page which should perform the installation quickly and easily through your web browser. Alternatively, if you prefer, you can download it directly from Microsoft by visiting this web page here.
XML in Scripting – XML Basics
Without further delay, let’s look at a very simple example of XML before we examine the benefits of using it:
<?xml version="1.0" encoding="utf-8" ?>
<object>
  <name>robot</name>
  <file>c:\project\models\robot.emdl</file>
</object>
The first line is the declaration of the XML file type. At this stage, it’s not worth examining any further, except to say that this is how the file identifies itself as XML.
The second line declares an opening tag with “object” as the identifier. The last line then specifies the corresponding closing tag. Every element must have both an opening and a closing tag, unless it is a special XML tag (indicated by <?, as on the first line) or it contains no child data. If it contains no child data, then as a shortcut it can be written as “<tag/>”.
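These tag rules can be checked with any XML parser. The snippet below is a minimal sketch in Python rather than the VBScript used later in this article, chosen only because its standard library includes a parser. The helper name is_well_formed is hypothetical.

```python
import xml.etree.ElementTree as ET

def is_well_formed(text):
    """Return True if the parser accepts the text as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

# Every opening tag has a matching closing tag.
complete = "<object><name>robot</name></object>"
# The closing </object> tag is missing.
unclosed = "<object><name>robot</name>"
# An element with no child data, written with the shortcut form...
shortcut = "<object><animation/></object>"
# ...and the same element written out in full.
longhand = "<object><animation></animation></object>"

print(is_well_formed(complete))   # True
print(is_well_formed(unclosed))   # False

# Both spellings of the empty element parse to the same tree.
same = ET.tostring(ET.fromstring(shortcut)) == ET.tostring(ET.fromstring(longhand))
print(same)                       # True
```

The parser rejects the document with the missing closing tag outright, while the two ways of writing an empty element are indistinguishable once parsed.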
The nice indented layout of the example is optional. It wouldn’t matter at all if there were no new lines or white space between the tags. Note, however, that this doesn’t mean that XML ignores the white space. Many XML parsers will, by default, store the white space as extra child nodes of the parent node, so be warned!
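To make the white space warning concrete, here is a small sketch using Python's standard xml.dom.minidom module, which is one parser that keeps white space as text nodes. It is shown in Python rather than VBScript purely so it can be run anywhere, and child_summary is a hypothetical helper name.

```python
from xml.dom import minidom

# The same data, once with the indented layout and once with no white space.
indented = r"""<object>
  <name>robot</name>
  <file>c:\project\models\robot.emdl</file>
</object>"""
compact = r"<object><name>robot</name><file>c:\project\models\robot.emdl</file></object>"

def child_summary(xml_text):
    """List each child of the root element by node name ('#text' for text nodes)."""
    root = minidom.parseString(xml_text).documentElement
    return [node.nodeName for node in root.childNodes]

print(child_summary(indented))  # ['#text', 'name', '#text', 'file', '#text']
print(child_summary(compact))   # ['name', 'file']
```

With this parser the indented version has five children rather than two, so code that assumes "the first child is the name" would break on it.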
The big advantage of using XML as a file format is scalability. This piece of jargon basically just means that it’s easy to expand on, without having to plan for every eventuality at the outset. For example, if we later decide that we need to store information about the animation of the object, we can just add in an extra set of nodes like this:
<?xml version="1.0" encoding="utf-8" ?>
<object>
  <name>robot</name>
  <file>c:\project\models\robot.emdl</file>
  <animation>
    <!-- …other data… -->
  </animation>
</object>
Imagine that we had written a script with a function to parse the original node tree structure. If that old function had been written properly, it will now just ignore the new nodes. If we update the function, we can tell it to read the new nodes if they exist, or to use some predefined default values if they don’t.
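As a rough illustration of that idea, here is a minimal sketch in Python (not the VBScript used later in this article) of a parsing function that skips nodes it doesn't know and falls back to a default when optional data is missing. The <length> and <colour> elements and the parse_object name are assumptions made up for this example.

```python
import xml.etree.ElementTree as ET

def parse_object(node):
    """Read the fields we know about and silently skip everything else."""
    return {
        "name": node.findtext("name", default="unnamed"),
        "file": node.findtext("file", default=""),
        # Added later: optional data, with a fallback when it is absent.
        # <length> is a hypothetical child of <animation>, for illustration.
        "anim_length": int(node.findtext("animation/length", default="0")),
    }

# The original structure, with no animation data at all.
old_doc = ET.fromstring(
    r"<object><name>robot</name><file>c:\project\models\robot.emdl</file></object>"
)
# The extended structure, plus a <colour> node the function knows nothing about.
new_doc = ET.fromstring(
    r"<object><name>robot</name><file>c:\project\models\robot.emdl</file>"
    r"<animation><length>24</length></animation><colour>red</colour></object>"
)

old_result = parse_object(old_doc)
new_result = parse_object(new_doc)
print(old_result["anim_length"])  # 0, the default
print(new_result["anim_length"])  # 24, read from the new node
```

Because the function asks for nodes by name instead of walking every child, the unknown <colour> node never causes a problem.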
We can also embed the node tree into an entirely different XML tree. For example, we might have an XML file that describes a scene:
<?xml version="1.0" encoding="utf-8" ?>
<scene>
  <numobjects>1</numobjects>
  <file>c:\project\scenes\robotfight.scn</file>
  <objectlist>
    <object>
      <name>robot</name>
      <file>c:\project\models\robot.emdl</file>
      <animation>
        <!-- …other data… -->
      </animation>
    </object>
  </objectlist>
</scene>
The point to note here is that we can still use our original function to parse the node sub-tree, even though it’s located in an entirely different file format. This is why XML is so nice to use; it saves so much unnecessary work maintaining input/output (IO) routines.
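The reuse described here can be sketched as follows. This is a Python illustration rather than the article's VBScript, and parse_object is a hypothetical function standing in for "our original function".

```python
import xml.etree.ElementTree as ET

def parse_object(node):
    """Parse a single <object> node, wherever it happens to live."""
    return (node.findtext("name", default="unnamed"),
            node.findtext("file", default=""))

# The object stored as a document of its own.
standalone = ET.fromstring(
    r"<object><name>robot</name><file>c:\project\models\robot.emdl</file></object>"
)

# The same object embedded inside a scene document.
scene = ET.fromstring(r"""<scene>
  <numobjects>1</numobjects>
  <file>c:\project\scenes\robotfight.scn</file>
  <objectlist>
    <object>
      <name>robot</name>
      <file>c:\project\models\robot.emdl</file>
    </object>
  </objectlist>
</scene>""")

from_file = parse_object(standalone)
# The scene-level code only has to locate the <object> nodes; the per-object
# parsing is handed to the unchanged function.
from_scene = [parse_object(obj) for obj in scene.findall("objectlist/object")]

print(from_file)
print(from_scene == [from_file])  # True
```

Note that the scene's own <file> node is not confused with the object's <file> node, because parse_object only looks at the children of the node it is given.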
XML in Scripting – Transforming XML with XSL Transforms
Since the format of XML is standardised, it is possible for any application to read the data. Of course, it doesn’t necessarily mean that the application will know what to do with it or what the information means.
For example, if you double click on any valid XML document in Windows, Internet Explorer will start up and display the document, allowing you to expand and collapse the tree nodes by clicking on them. Internet Explorer doesn’t know what the XML data means, but that doesn’t stop it from presenting it to you in a convenient manner.
The way that Internet Explorer does this is by using what is known as an XML style sheet. These are themselves XML documents, but they also use a convention known as XSL. An XSL document has the ability to transform an XML document into another text-based format. This new format can be anything of your choosing, but you’ll generally want it to be another XML document.
When Internet Explorer loads your XML file, it is using a default XSL document to transform it into XHTML (an HTML format that is compliant with XML standards). By writing your own XSL documents, you can easily edit and present your XML data with minimum effort.
Here’s an example based on the previous section. The XML file below has an extra XML tag at line 2 which specifies a default XSL document to transform it. This is only a recommendation to the application loading it, and does not mean that it cannot be transformed by a different XSL document.
<?xml version="1.0" encoding="utf-8" ?>
<?xml-stylesheet type="text/xsl" href="TabulateRobots.xslt"?>
<scene>
  <objectlist>
    <object>
      <name>robot</name>
      <file>c:\project\models\robot.emdl</file>
    </object>
    <object>
      <name>droid</name>
      <file>c:\project\models\droid.emdl</file>
    </object>
    <object>
      <name>mech</name>
      <file>c:\project\models\mech.emdl</file>
    </object>
  </objectlist>
</scene>
Below is an XSL transform document that will process this XML, and turn it into an XHTML document that displays the data as a table.
<?xml version="1.0" encoding="utf-8" ?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="/scene/objectlist">
    <table border="1" width="500" cellpadding="5">
      <tr>
        <td width="200" align="center"><b>Model Name</b></td>
        <td align="center"><b>File Path</b></td>
      </tr>
      <xsl:for-each select="object">
        <tr>
          <td><xsl:value-of select="name"/></td>
          <td><xsl:value-of select="file"/></td>
        </tr>
      </xsl:for-each>
    </table>
  </xsl:template>
</xsl:stylesheet>
If you copy and paste the examples into Robots.xml and TabulateRobots.xslt, then you can double click on Robots.xml to see the result, which should look something like this:
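|Model Name|File Path|
|robot|c:\project\models\robot.emdl|
|droid|c:\project\models\droid.emdl|
|mech|c:\project\models\mech.emdl|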
This hopefully illustrates how easy it is to manipulate XML using XSL. In the above example, we chose to transform the data into XHTML. In fact, it’s possible to transform it into any text based format we choose by using “CDATA” tags. We could even create a script from an XML document like this:
<?xml version="1.0" encoding="utf-8" ?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output omit-xml-declaration="yes"/>
<xsl:template match="/scene/objectlist">
<![CDATA[
Class Robot
  Public name
  Public file
End Class

Sub PrintRobots
]]>
<![CDATA[  Dim robotArray(]]><xsl:value-of select="count(object)"/><![CDATA[)
]]>
<xsl:for-each select="object">
<![CDATA[
  Set curRobot = new Robot
  curRobot.name = "]]><xsl:value-of select="name"/><![CDATA["
  curRobot.file = "]]><xsl:value-of select="file"/><![CDATA["
  Set robotArray(]]><xsl:value-of select="position()-1"/><![CDATA[) = curRobot
]]>
</xsl:for-each>
<![CDATA[  For i = 0 To ]]><xsl:value-of select="count(object)-1"/>
<![CDATA[
    Msgbox robotArray(i).name +" : "+ robotArray(i).file
  Next
End Sub
]]>
</xsl:template>
</xsl:stylesheet>
The above XSL transform will generate the following VBScript code:
Class Robot
  Public name
  Public file
End Class

Sub PrintRobots
  Dim robotArray(3)

  Set curRobot = new Robot
  curRobot.name = "robot"
  curRobot.file = "c:\project\models\robot.emdl"
  Set robotArray(0) = curRobot

  Set curRobot = new Robot
  curRobot.name = "droid"
  curRobot.file = "c:\project\models\droid.emdl"
  Set robotArray(1) = curRobot

  Set curRobot = new Robot
  curRobot.name = "mech"
  curRobot.file = "c:\project\models\mech.emdl"
  Set robotArray(2) = curRobot

  For i = 0 To 2
    Msgbox robotArray(i).name +" : "+ robotArray(i).file
  Next
End Sub
A few things to note: Firstly, I’ve manually tidied up the formatting of the code (good luck trying to get it to look that neat straight from XSL). Generally it shouldn’t be a problem, unless of course you’re trying to generate Python code.
Secondly, it’s not hard to see that it doesn’t look like a particularly efficient way of doing things. For example, what happens if I have 12000 robots? That’d be one big VBScript!
Thirdly; great, so now we’ve got this script. How can we actually use it? Well, one easy way is to embed the script into HTML and let Internet Explorer run it. A better way though is to call it from your own scripts. That’s where the second half of this article begins.
Lastly, I should point out that generating a script from an XML document is not a particularly common thing to do. Usually, you will be generating a different XML structure, such as XHTML. This example is only to illustrate the flexibility of using XML with XSL transforms.
If you didn’t before, you should now have some sort of understanding of what XML is all about, and how it can be used to store and manipulate data. In the next section, I’ll be showing you how you can easily use it from VBScript and C++ without having to write your own XML parser.
You may be wondering why I’m using VBScript in this article. The main reason is that I experienced a few problems at work with JScript returning me the wrong interface from the COM object. I’ve since tested JScript successfully at home, and have come to the conclusion that it must be a problem with the machine at work. I still don’t know what the cause of this is, but until I do, I’m sticking with VBScript for any work with XML.
With the number of people using XML in web applications, it shouldn’t come as any surprise that Microsoft support the use of XML in the form of COM objects (also known as ActiveX objects). This means that we can use these objects in scripts and let them do all the hard work. Let’s go straight to an example. Save the VBScript XSL document example above as “VBSScriptRobots.xslt”, and then copy and paste the code below into a new text document and name it “GenerateScript.vbs”.
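' Load the XML data synchronously
Set xmlDoc = CreateObject("Msxml2.DOMDocument.4.0")
xmlDoc.async = False
xmlDoc.load "Robots.xml"

' Load the XSL transform synchronously
Set xslDoc = CreateObject("Msxml2.DOMDocument.4.0")
xslDoc.async = False
xslDoc.load "VBSScriptRobots.xslt"

' Generate the script text, display it, then execute it
funcStr = xmlDoc.transformNode(xslDoc)
MsgBox funcStr
Execute funcStr

' Call the generated Sub (assumes the transform defines PrintRobots)
PrintRobots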
If you now run the script, it will display the generated code in a message box, and then it will run the function and present you with three message boxes displaying the XML data records.
I’m sure you’ll still be concerned about the fact that we generated the script with the robot names and filenames hard-coded into the generated function (if not, you should be!). A much better way of going about this task would be to use the XML document object model to iterate through the XML tree structure, like this:
Set xmlDoc = CreateObject("Msxml2.DOMDocument.4.0")
xmlDoc.async = False
xmlDoc.load "Robots.xml"

' <scene> is the document element; its first child is <objectlist>
Set sceneNode = xmlDoc.documentElement
Set objListNode = sceneNode.firstChild
Set objectListNode = objListNode.childNodes

' Each <object> holds <name> followed by <file>
For i = 0 To objectListNode.length - 1
  Set objectNode = objectListNode.item(i)
  Set nameNode = objectNode.firstChild
  Set fileNode = nameNode.nextSibling
  MsgBox nameNode.text & " : " & fileNode.text
Next
This script doesn’t even need an XSL document. It simply traverses the XML node tree and iterates through each node, displaying the name and file as it goes.
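For comparison, here is roughly the same walk sketched in Python with the standard xml.dom.minidom module, which exposes the same DOM property names (documentElement, childNodes). Unlike the script above, this parser keeps white space text nodes, so the sketch filters for element nodes explicitly. The helper names element_children, text_of and list_robots are hypothetical.

```python
from xml.dom import minidom, Node

ROBOTS_XML = r"""<scene>
  <objectlist>
    <object>
      <name>robot</name>
      <file>c:\project\models\robot.emdl</file>
    </object>
    <object>
      <name>droid</name>
      <file>c:\project\models\droid.emdl</file>
    </object>
  </objectlist>
</scene>"""

def element_children(node):
    """Return only the element children, skipping white space text nodes."""
    return [c for c in node.childNodes if c.nodeType == Node.ELEMENT_NODE]

def text_of(element):
    """Concatenate the text nodes directly inside an element."""
    return "".join(c.data for c in element.childNodes
                   if c.nodeType == Node.TEXT_NODE)

def list_robots(xml_text):
    scene = minidom.parseString(xml_text).documentElement
    object_list = element_children(scene)[0]      # <objectlist>
    rows = []
    for obj in element_children(object_list):     # each <object>
        name_node, file_node = element_children(obj)[:2]
        rows.append(text_of(name_node) + " : " + text_of(file_node))
    return rows

for row in list_robots(ROBOTS_XML):
    print(row)
```

Like the original, this still relies on <name> coming before <file>, which is one of the fragilities the guidelines that follow are meant to address.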
When writing a function to parse an XML node, we should generally make sure it does the following:
1) Ignore any extra nodes that it doesn’t expect to encounter.
2) Use default data when non-critical data is missing.
3) If critical data is missing, attempt to continue to read the rest of the node wherever possible, and at the very least, log the error and make sure it is reported to the user. (There will of course be circumstances where it cannot continue reading data, and the function will have to return an error immediately.)
4) Unless there is good reason, only parse the current level in the tree for each function. This will mean that you can chop and change your XML structure around easily and just change the order that the functions are being called. It also allows for easier code reuse.
The following code does exactly the same job as the code you’ve just seen, but it also implements the above points. While the code is much larger (and slightly slower), it is definitely more robust and is capable of withstanding most syntactical errors in the format. It also provides helpful error messages should anything go wrong by encapsulating the output data inside a reusable class that stores the error information.
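The VBScript listing itself is not included here. As a rough stand-in, the sketch below shows one way the four guidelines might be applied, written in Python rather than VBScript so that it runs without MSXML. The names ParseResult, parse_object and parse_scene, and the choice of which fields count as critical, are assumptions for illustration and not the original implementation.

```python
import xml.etree.ElementTree as ET

class ParseResult:
    """Reusable container: the parsed data plus any errors found on the way."""
    def __init__(self):
        self.data = {}
        self.errors = []

def parse_object(node):
    """Parse one <object>, looking only at its direct children (rule 4)."""
    result = ParseResult()
    # Rule 3: <name> is treated as critical here. Log the error, keep reading.
    name = node.findtext("name")
    if name is None:
        result.errors.append("object is missing <name>")
    result.data["name"] = name
    # Rule 2: <file> is treated as non-critical, so fall back to a default.
    result.data["file"] = node.findtext("file", default="")
    # Rule 1: any other child nodes are never looked at, so they are ignored.
    return result

def parse_scene(xml_text):
    """Parse the <scene> level and delegate each <object> to parse_object."""
    result = ParseResult()
    try:
        scene = ET.fromstring(xml_text)
    except ET.ParseError as exc:
        # Cannot continue at all: report the error and return immediately.
        result.errors.append("not well-formed: %s" % exc)
        return result
    object_list = scene.find("objectlist")
    if object_list is None:
        result.errors.append("scene is missing <objectlist>")
        return result
    result.data["objects"] = []
    for obj in object_list.findall("object"):
        parsed = parse_object(obj)
        result.data["objects"].append(parsed.data)
        result.errors.extend(parsed.errors)
    return result

SCENE_XML = r"""<scene>
  <objectlist>
    <object>
      <name>robot</name>
      <file>c:\project\models\robot.emdl</file>
      <colour>red</colour>
    </object>
    <object>
      <file>c:\project\models\droid.emdl</file>
    </object>
  </objectlist>
</scene>"""

scene_result = parse_scene(SCENE_XML)
print(len(scene_result.data["objects"]))  # 2
print(scene_result.errors)                # ['object is missing <name>']
```

Both objects are read even though the second one is missing its name, the unexpected <colour> node is skipped, and the caller can inspect the errors list to decide whether the result is usable.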
That should hopefully be enough to get you started with using XML in C++. I’ll save a more detailed XML C++ tutorial for a future article.