Generating Custom Reports Using XMILE
XMILE is an open standard for describing system dynamics models in XML. Version 10 of iThink and STELLA output their models in the XMILE format. One of the advantages of XML is that it is a text-based format that can be easily queried and manipulated. This post will show you how to use XMLStarlet, a free XML command line management tool available for Windows, Macintosh, and Linux, to easily extract information from a XMILE model. It will also demonstrate how to modify the XML style sheet (XSLT) generated by XMLStarlet to create custom HTML reports.
Our goal is to create a report that lists the stocks, flows, and converters in the susceptible-infected-recovered (SIR) model of infection shown below (available by clicking here). Each model variable will be listed with its own equation and sorted by name.
XMLStarlet uses the select command (sel) for making queries to an XML file and formatting the results. We will use all of the following select command options:
-t (template): define a set of rules (below) to be applied to the XML file
-m "XPath query" (match): find and select a set of nodes in the XML file
-s <options> "XPath expression" (sort): sort selected nodes by XPath expression
-v "XPath expression" (value): output value of XPath expression
-o "text" (output): output the quoted text
-n (newline): start a new line in the output
Reporting Stock Names
Let’s start by outputting the names of the stocks in the model. In a XMILE file, stocks are identified by the <stock> tag, which is nested inside the <xmile> and <model> tags:
<xmile …>
  <model>
    <stock name="Infected">
      <eqn>1</eqn>
    </stock>
  </model>
</xmile>
There is one <stock> tag for every stock in the model and each stock has, at a minimum, both a name (in the “name” attribute) and an initialization equation (in the <eqn> tag). To get the names of all stocks in the model, we can build a template using these XMLStarlet command options:
sel -t -m "_:xmile/_:model/_:stock" -v "@name" -n
The “sel” chooses the select command and the -t begins the template (the set of rules used to extract and format information from the XML file). The -n at the end puts each stock name on its own line.
The -m option defines the XML path to any stock from the root. In this case, the -m option is selecting all the XML nodes named stock (i.e., <stock> tags) that are under any <model> tags in the <xmile> tag. From the XMILE file, one might expect the XML path to be “xmile/model/stock,” but the tags in the XMILE file are in the XMILE namespace and XPath, which is being used for this query, requires namespaces to be explicitly specified. Luckily, XMLStarlet, starting in version 1.5.0, allows us to use “_” for the name of the namespace used by the XML file, in this case the XMILE namespace. Thus, every XMILE name in a query must be preceded by “_:”.
Finally, the -v option allows us to output the name of each node selected with -m (stocks, in this case). The “@” tells XPath that “name” is an attribute, not a tag, i.e., it is of the form name="…" rather than <name>…</name>.
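If you want to convince yourself why the namespace prefix matters, the same query can be sketched in Python with the standard library's xml.etree.ElementTree instead of XMLStarlet. This is only an illustration under assumptions: the sample document, the order of its stocks, and the namespace URI (http://example.com/xmile-ns) are made up here, and a real XMILE file declares its own namespace.

```python
import xml.etree.ElementTree as ET

# Made-up sample; the namespace URI is a placeholder, not the real XMILE one.
SAMPLE = """<xmile xmlns="http://example.com/xmile-ns">
  <model>
    <stock name="Susceptible"><eqn>100</eqn></stock>
    <stock name="Infected"><eqn>1</eqn></stock>
    <stock name="Recovered"><eqn>0</eqn></stock>
  </model>
</xmile>"""

root = ET.fromstring(SAMPLE)

# An unqualified path matches nothing, because every tag lives in a namespace.
unqualified = root.findall("model/stock")

# ElementTree stores the root tag as "{uri}xmile", so the URI can be read off it.
uri = root.tag[1:root.tag.index("}")]
ns = {"x": uri}

# Qualified path: roughly what -m "_:xmile/_:model/_:stock" -v "@name" does.
stock_names = [stock.get("name") for stock in root.findall("x:model/x:stock", ns)]

print(len(unqualified))
print(stock_names)
```

Here the prefix "x" plays the role that “_” plays in the XMLStarlet query: it is just a local label bound to whatever namespace the file uses.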
To build a full command, we need to add the path to XMLStarlet to the beginning and the name of the XML file being queried to the end:
XMLStarlet_path/xml <options above> SIR.stmx
The entire command without the path to XMLStarlet is:
xml sel -t -m "_:xmile/_:model/_:stock" -v "@name" -n SIR.stmx
This command produces the following output:
Sorting the Names
We can sort the nodes after we match them and before we output anything from them. The sort option (-s) takes three settings, written as single characters separated by colons (:). In order, they control whether the sort is ascending or descending, whether values are compared as text or as numbers, and whether uppercase letters come before lowercase or vice versa. Fortunately, the defaults (ascending, text, uppercase first) are exactly what we want, so we do not need to supply specific values. We still have to fill in each of the three positions, but we can use “-” to accept the default.
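To make the default ordering concrete, here is a rough Python approximation of an ascending text sort with uppercase first. It is a sketch, not a reimplementation of what XMLStarlet or an XSLT processor does internally: the function name is invented for this example, and real processors may collate unusual characters differently.

```python
def sort_ascending_text_upper_first(names):
    """Approximate an ascending text sort in which case only breaks ties."""
    # Compare case-insensitively first; for otherwise equal names, the plain
    # string comparison puts the uppercase variant ahead of the lowercase one.
    return sorted(names, key=lambda s: (s.casefold(), s))


stocks = ["Susceptible", "Infected", "Recovered"]
print(sort_ascending_text_upper_first(stocks))

# Hypothetical names that differ only by case, to show the tie-break.
print(sort_ascending_text_upper_first(["recovering", "Recovering", "infected"]))
```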
To the above XMLStarlet options, we need to add the sort option, -s -:-:- "@name", directly after the match:
sel -t -m "_:xmile/_:model/_:stock" -s -:-:- "@name" -v "@name" -n
This tells XMLStarlet to sort by the name of each stock, yielding this output:
Adding in the Equation
Next, we wish to add the equation after each name. Since the equation is in its own tag, we can just extract it using -v. However, we need to output some text between the name and the equation, which we can do with -o:
-o ": " -v "_:eqn"
The text immediately following the -o option will appear after each name. Note we have to qualify “eqn” with the namespace (_). These options belong at the end, right before the -n:
sel -t -m "_:xmile/_:model/_:stock" -s -:-:- "@name" -v "@name" -o ": " -v "_:eqn" -n
The output is now:
Inserting a Header
We eventually want to show all stocks, flows, and converters, so it is a good idea to identify the above list as stocks. We can easily do this by adding the header “Stocks:” at the top. Place it right after the -t option in the command to XMLStarlet:
sel -t -o "Stocks:" -n -m "…" -s -:-:- "@name" -v "@name" -o ": " -v "_:eqn" -n
The output from this set of options (with the correct XPath after -m) is:
Including Flows and Converters
Flows and converters are identified in the XMILE file with the <flow> and <aux> tags, respectively. Just like the stock, they each have a name attribute and an <eqn> tag. Therefore they can be included by repeating the same options we used for the stock, with minor edits. The entire set of options to the select command (with truncated paths – also note all options must appear on the same command line) is:
-t -o "Stocks:" -n -m "…/_:stock" -s -:-:- "@name" -v "@name" -o ": " -v "_:eqn" -n
-t -n -o "Flows:" -n -m "…/_:flow" -s -:-:- "@name" -v "@name" -o ": " -v "_:eqn" -n
-t -n -o "Converters:" -n -m "…/_:aux" -s -:-:- "@name" -v "@name" -o ": " -v "_:eqn" -n
The extra -n options (start a new line) before flows and converters leave a blank line between the sets. Using this set of options on SIR.stmx, we get the following output:
One Last Whistle
XSLT is very flexible, allowing many more enhancements to our report. Why don’t we include the total number of entities in the model?
The XPath function “count” returns the number of nodes that match the XPath query. Thus the following command counts the number of stock nodes in our model file:
We can include the total number of entities in the model at the end of the report by including these options at the end (before the filename and all on the same command line):
-t -n -o "Total Entities: " -v "count(_:xmile/_:model/_:stock) + count(_:xmile/_:model/_:flow) + count(_:xmile/_:model/_:aux)" -n
This adds the following output to the end of our entity list:
Total Entities: 8
The complete XMLStarlet command (without the path to XMLStarlet as it will vary from machine to machine) can be downloaded by clicking here.
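For readers who would like to cross-check the report without XMLStarlet, the whole thing can be approximated in a few lines of Python. Treat this as a sketch under assumptions: the embedded sample mirrors the names and equations of the SIR model but is not the real SIR.stmx, its namespace URI is a placeholder, and build_report is a name invented for this example.

```python
import xml.etree.ElementTree as ET

# Stand-in for SIR.stmx; the namespace URI is a placeholder for illustration.
# The raw string keeps "\n" in the flow names as a literal backslash and "n".
SAMPLE = r"""<xmile xmlns="http://example.com/xmile-ns">
  <model>
    <stock name="Susceptible"><eqn>100</eqn></stock>
    <stock name="Infected"><eqn>1</eqn></stock>
    <stock name="Recovered"><eqn>0</eqn></stock>
    <flow name="becoming\ninfected"><eqn>infection_rate*Infected*Susceptible</eqn></flow>
    <flow name="recovering"><eqn>recovery_rate*Infected</eqn></flow>
    <flow name="entering\ninfected_area"><eqn>0</eqn></flow>
    <aux name="recovery_rate"><eqn>1/4</eqn></aux>
    <aux name="infection_rate"><eqn>0.005</eqn></aux>
  </model>
</xmile>"""


def build_report(root):
    """Return the plain-text report: three sorted sections plus a total."""
    ns = {"x": root.tag[1:root.tag.index("}")]}  # namespace read from root tag
    sections = [("Stocks:", "stock"), ("Flows:", "flow"), ("Converters:", "aux")]
    lines = []
    total = 0
    for heading, tag in sections:
        nodes = root.findall(f"x:model/x:{tag}", ns)
        total += len(nodes)  # same idea as summing the three count() calls
        if lines:
            lines.append("")  # blank line between sections, like the extra -n
        lines.append(heading)
        # Ascending text sort on the name attribute, uppercase first on ties.
        for node in sorted(nodes, key=lambda n: (n.get("name").casefold(), n.get("name"))):
            eqn = node.findtext("x:eqn", default="", namespaces=ns)
            lines.append(f"{node.get('name')}: {eqn}")
    lines.append("")
    lines.append(f"Total Entities: {total}")
    return "\n".join(lines)


print(build_report(ET.fromstring(SAMPLE)))
```

To try it on a real model, replace ET.fromstring(SAMPLE) with ET.parse("SIR.stmx").getroot().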
Converting to HTML
One of the most powerful features of XML is that we can readily transform the data into an alternate format, as we did above. We can also add style information to display the data in rich text format in a browser.
The file that contains the rules to transform an XML file is called an XSLT file. XMLStarlet creates an XSLT file behind the scenes whenever we give it a command. Using the -C option, we can tell XMLStarlet to output the XSLT file that generates the above output. We can then edit the XSLT file to add style information. In particular, we would like the headings to be shown in bold and the variable names to be shown in italic.
Run the same command as before, but add -C directly after the “sel” command (before the first -t). Redirect the output to a file called SIR.xsl. The command has this form:
xml sel -C -t … -n SIR.stmx > SIR.xsl
You will also need to make a copy of SIR.stmx named SIR.xml, for two reasons. First, we have to add a processing instruction to the XML file so that it uses the XSLT. Second, the browser needs to know it is an XML file, which it does by default for files ending with .xml.
Open SIR.xml in any text editor and insert this line right after the first line (just above the <xmile> tag):
<?xml-stylesheet type="text/xsl" href="SIR.xsl"?>
This tells any program that opens the XML file to use the XSLT file named SIR.xsl. The XML and initial XSLT files are available by clicking here.
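If you would rather script the copy-and-insert step than do it by hand, something along these lines should work. It is a sketch with assumptions: add_stylesheet_instruction is a made-up helper name, and the code assumes that the XML declaration, when there is one, sits entirely on the first line of the file.

```python
def add_stylesheet_instruction(xml_text, href):
    """Return xml_text with an xml-stylesheet instruction pointing at href."""
    instruction = f'<?xml-stylesheet type="text/xsl" href="{href}"?>'
    first_line, _, rest = xml_text.partition("\n")
    # The XML declaration must stay first, so insert right after it.
    # The trailing space in "<?xml " avoids matching "<?xml-stylesheet".
    if first_line.lstrip().startswith("<?xml "):
        return first_line + "\n" + instruction + "\n" + rest
    # No declaration found: the instruction simply goes at the top.
    return instruction + "\n" + xml_text


# Typical use (file names taken from this post):
#   with open("SIR.stmx", encoding="utf-8") as src:
#       text = src.read()
#   with open("SIR.xml", "w", encoding="utf-8") as dst:
#       dst.write(add_stylesheet_instruction(text, "SIR.xsl"))
```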
If you open the XML file in your browser, you will see something like this:
Stocks: Infected: 1 Recovered: 0 Susceptible: 100 Flows: becoming\ninfected: infection_rate*Infected*Susceptible entering\ninfected_area: 0 recovering: recovery_rate*Infected Converters: infection_rate: 0.005 recovery_rate: 1/4
This is not quite what we had in mind. The main problem is that XMLStarlet outputs plain text, not HTML. One way to fix this would be to put everything in a <pre> block, but then we cannot format anything within it. So we have to edit the XSLT file slightly to make it display properly.
Depending on your browser, you will need to first remove some code from the XSLT file. Open SIR.xsl in any text editor. If you are using Firefox, there is an extra dummy namespace in the header line (line 2, starting the xsl:stylesheet tag) that makes Firefox ignore requests to interpret the result as HTML. Remove this text from that line:
If you are using Internet Explorer, it gets confused by these lines at the bottom of the file in the template named value-of-template. Nothing we are doing uses this code, so just remove these lines if you need this to work with Internet Explorer:
<xsl:for-each select="exslt:node-set($select)[position()>1]">
  <xsl:value-of select="'&#10;'"/>
  <xsl:value-of select="."/>
</xsl:for-each>
To get the browser to interpret the results as HTML, add the following attribute to the xsl:output tag (line 3) – just insert it right after xsl:output:
Now we can put the newlines back in, using HTML. Replace all occurrences of the following with <br/>:
<xsl:value-of select="'&#10;'"/>
If you save and open SIR.xml again, you should now see the same output XMLStarlet gave. To add style information, find the headers “Stocks:”, “Flows:”, “Converters:”, and “Total Entities:”. Each of these is enclosed in an <xsl:text> tag. Insert <strong> at the beginning of each of these four lines and </strong> at the end. If you open SIR.xml in your browser now, the headings should all be bold.
Italicizing the variable names is slightly more difficult because the generated XSLT uses a template to output them. The invocation of this template requires three lines instead of one, so it is harder to pick them out:
<xsl:call-template name="value-of-template">
  <xsl:with-param name="select" select="@name"/>
</xsl:call-template>
The key to finding them is to look for the use of the attribute “@name” inside an xsl:call-template. Surround all three lines of each of the three instances you find with <em> and </em> and save. The final XSLT file is available by clicking here.
Congratulations! If you open SIR.xml again, it should now be formatted as we intended:
Total Entities: 8
Advanced: But What About Those Newlines in Names?
You may have noticed that two of the flow names, rather than using the character “_” between words, use the sequence “\n”. In XMILE, this is the newline character and appears in a name when the user inserted a line break (by pressing Enter or Return). This makes it much harder to read the names in the generated report. It also does not correspond to the name that is used in the equation. Unfortunately, XMLStarlet does not provide a way to substitute these characters. In addition, while XSLT 2.0 supports the XPath function replace() to do this, browsers only support XSLT 1.0, which does not include that function. To replace “\n” in names with “_”, we have to include a template to do the replacement in our XSLT file and then we must change our XSLT code to use that template.
Thankfully, we do not have to write such a template as many are available on the Internet. The template used for this example is called string-replace-all and comes from here. Simply copy the entire template (at the top of the post) and paste it at the very bottom of your XSLT file (just above the </xsl:stylesheet> tag).
Since all names and equations are output using the template named value-of-template, the easiest way to affect all names is to modify value-of-template to call string-replace-all. This will have no effect on equations as they have no newlines in them. After the changes recommended above for Internet Explorer, value-of-template looks like this:
<xsl:template name="value-of-template">
  <xsl:param name="select"/>
  <xsl:value-of select="$select"/>
</xsl:template>
To replace the newlines, the xsl:value-of line will be changed to call the string-replace-all template to replace “\n” with “_”:
<xsl:template name="value-of-template">
  <xsl:param name="select"/>
  <xsl:call-template name="string-replace-all">
    <xsl:with-param name="text" select="$select"/>
    <xsl:with-param name="replace" select="'\n'"/>
    <xsl:with-param name="by" select="'_'"/>
  </xsl:call-template>
</xsl:template>
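If the recursion is hard to picture, here is the same idea sketched in Python. This is an assumption about how such a template usually works (split at the first match, emit the replacement, recurse on the remainder), not a transcription of the template referenced above. The parameter names are borrowed from the call shown here.

```python
def string_replace_all(text, replace, by):
    """Replace every occurrence of `replace` in `text` with `by`, recursively."""
    # Assumes `replace` is a non-empty string (an empty one raises ValueError).
    before, found, after = text.partition(replace)
    if not found:
        return text  # nothing left to replace
    # Keep the part before the match, add the replacement, handle the rest.
    return before + by + string_replace_all(after, replace, by)


# The names contain a literal backslash followed by "n", hence the raw strings.
print(string_replace_all(r"becoming\ninfected", r"\n", "_"))
print(string_replace_all(r"entering\ninfected_area", r"\n", "_"))
```

Note that the sequence being replaced is the two characters backslash and n, not an actual line break, which is why an equation such as recovery_rate*Infected passes through unchanged.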
The output generated using this new XSLT file (available by clicking here) is:
Total Entities: 8
XMLStarlet is a useful tool to extract information from a XMILE model file and to generate an initial XSLT file. To create HTML reports from a XMILE model requires some editing of the generated XSLT file. What you can accomplish in this way is limited only by your ability to find or write the required XSLT 1.0. The effort, however, is well worth it; once you create an XSLT file that generates the report you desire, that same XSLT file can be used to generate a report in exactly the same format from any XMILE model. Try it out by applying the SIR.xsl file created in this post to any other model!
Note: This post uses the latest version of XMLStarlet, version 1.5.0. For Windows, it can be downloaded directly from the official XMLStarlet website, http://xmlstar.sourceforge.net/. For Macintosh, it is available through Fink, a program used to distribute Unix-family open source software for Macintosh (http://fink.sourceforge.net/). Since XMLStarlet is a command line utility, it must be run from the command prompt in Windows or the Terminal program on Macintosh.
Disclaimer: This method will not find non-apply-to-all arrays as they use the <array> tag no matter the entity type.