Log parser includes the ability to output data results in a number of different formats. One of the more interesting for portability is the XML output format.
logparser -o:xml
An optional parameter to this is the structure parameter, which can be one of 1, 2, 3, or 4, with 1 being the default, and is defined in the help as “Type of XML structure.” Without testing it’s not very clear what exactly that means, so in the following article I’ll be covering that using a very simple query:
logparser -i:w3c -o:xml "SELECT date, COUNT(*) AS Requests INTO _test-1.xml FROM *ex1106*.log GROUP BY date" -compact:on
One important note before we continue is the compact
parameter, which removes all spacing (including new lines) from individual rows of data. By default this is off, and as far as I know there is no way to remove spaces within the lower elements, but keep new lines.
(If you wish to skip the examples the conclusion of the article wraps up the differences between the standard output and the other three structures.)
XML structure
Running the above query results in results similar to the following:
<?xml version="1.0" encoding="ISO-10646-UCS-2" standalone="yes" ?>
<!DOCTYPE ROOT[
<!ATTLIST ROOT DATE_CREATED CDATA #REQUIRED>
<!ATTLIST ROOT CREATED_BY CDATA #REQUIRED>
<!ELEMENT date (#PCDATA)>
<!ELEMENT Requests (#PCDATA)>
<!ELEMENT ROW (date, Requests)>
<!ELEMENT ROOT (ROW*)>
]>
<ROOT DATE_CREATED="2011-07-17 20:20:41" CREATED_BY="Microsoft Log Parser V2.2">
<ROW><date>2011-06-01</date><Requests>484</Requests></ROW>
<ROW><date>2011-06-02</date><Requests>529</Requests></ROW>
<ROW><date>2011-06-03</date><Requests>400</Requests></ROW>
<ROW><date>2011-06-04</date><Requests>113</Requests></ROW>
<ROW><date>2011-06-05</date><Requests>178</Requests></ROW>
<ROW><date>2011-06-06</date><Requests>422</Requests></ROW>
<ROW><date>2011-06-07</date><Requests>504</Requests></ROW>
<ROW><date>2011-06-08</date><Requests>417</Requests></ROW>
<ROW><date>2011-06-09</date><Requests>490</Requests></ROW>
<ROW><date>2011-06-10</date><Requests>429</Requests></ROW>
<ROW><date>2011-06-11</date><Requests>130</Requests></ROW>
<ROW><date>2011-06-12</date><Requests>160</Requests></ROW>
<ROW><date>2011-06-13</date><Requests>280</Requests></ROW>
<ROW><date>2011-06-14</date><Requests>407</Requests></ROW>
<ROW><date>2011-06-15</date><Requests>474</Requests></ROW>
<ROW><date>2011-06-16</date><Requests>463</Requests></ROW>
<ROW><date>2011-06-17</date><Requests>424</Requests></ROW>
<ROW><date>2011-06-18</date><Requests>256</Requests></ROW>
<ROW><date>2011-06-19</date><Requests>242</Requests></ROW>
<ROW><date>2011-06-20</date><Requests>442</Requests></ROW>
<ROW><date>2011-06-21</date><Requests>568</Requests></ROW>
<ROW><date>2011-06-22</date><Requests>431</Requests></ROW>
<ROW><date>2011-06-23</date><Requests>419</Requests></ROW>
<ROW><date>2011-06-24</date><Requests>436</Requests></ROW>
<ROW><date>2011-06-25</date><Requests>1566</Requests></ROW>
<ROW><date>2011-06-26</date><Requests>293</Requests></ROW>
<ROW><date>2011-06-27</date><Requests>395</Requests></ROW>
<ROW><date>2011-06-28</date><Requests>554</Requests></ROW>
<ROW><date>2011-06-29</date><Requests>601</Requests></ROW>
<ROW><date>2011-06-30</date><Requests>526</Requests></ROW>
</ROOT>
(ROOT and ROW can both be customized by the rootName and rowName parameters. For all of the below I’m keeping the defaults.)
Structure 2
Next we have -structure:2
, as in the following:
logparser -i:w3c -o:xml "SELECT date, COUNT(*) AS Requests INTO _test-2.xml FROM *ex1106*.log GROUP BY date" -structure:2 -compact:on
This results in the following output:
<?xml version="1.0" encoding="ISO-10646-UCS-2" standalone="yes" ?>
<!DOCTYPE ROOT[
<!ATTLIST ROOT DATE_CREATED CDATA #REQUIRED>
<!ATTLIST ROOT CREATED_BY CDATA #REQUIRED>
<!ELEMENT date (#PCDATA)>
<!ATTLIST date TYPE CDATA #REQUIRED>
<!ELEMENT Requests (#PCDATA)>
<!ATTLIST Requests TYPE CDATA #REQUIRED>
<!ELEMENT ROW (date, Requests)>
<!ELEMENT ROOT (ROW*)>
]>
<ROOT DATE_CREATED="2011-07-17 20:20:47" CREATED_BY="Microsoft Log Parser V2.2">
<ROW><date TYPE="TIMESTAMP">2011-06-01</date><Requests TYPE="INTEGER">484</Requests></ROW>
<ROW><date TYPE="TIMESTAMP">2011-06-02</date><Requests TYPE="INTEGER">529</Requests></ROW>
<ROW><date TYPE="TIMESTAMP">2011-06-03</date><Requests TYPE="INTEGER">400</Requests></ROW>
<ROW><date TYPE="TIMESTAMP">2011-06-04</date><Requests TYPE="INTEGER">113</Requests></ROW>
<ROW><date TYPE="TIMESTAMP">2011-06-05</date><Requests TYPE="INTEGER">178</Requests></ROW>
<ROW><date TYPE="TIMESTAMP">2011-06-06</date><Requests TYPE="INTEGER">422</Requests></ROW>
<ROW><date TYPE="TIMESTAMP">2011-06-07</date><Requests TYPE="INTEGER">504</Requests></ROW>
<ROW><date TYPE="TIMESTAMP">2011-06-08</date><Requests TYPE="INTEGER">417</Requests></ROW>
<ROW><date TYPE="TIMESTAMP">2011-06-09</date><Requests TYPE="INTEGER">490</Requests></ROW>
<ROW><date TYPE="TIMESTAMP">2011-06-10</date><Requests TYPE="INTEGER">429</Requests></ROW>
<ROW><date TYPE="TIMESTAMP">2011-06-11</date><Requests TYPE="INTEGER">130</Requests></ROW>
<ROW><date TYPE="TIMESTAMP">2011-06-12</date><Requests TYPE="INTEGER">160</Requests></ROW>
<ROW><date TYPE="TIMESTAMP">2011-06-13</date><Requests TYPE="INTEGER">280</Requests></ROW>
<ROW><date TYPE="TIMESTAMP">2011-06-14</date><Requests TYPE="INTEGER">407</Requests></ROW>
<ROW><date TYPE="TIMESTAMP">2011-06-15</date><Requests TYPE="INTEGER">474</Requests></ROW>
<ROW><date TYPE="TIMESTAMP">2011-06-16</date><Requests TYPE="INTEGER">463</Requests></ROW>
<ROW><date TYPE="TIMESTAMP">2011-06-17</date><Requests TYPE="INTEGER">424</Requests></ROW>
<ROW><date TYPE="TIMESTAMP">2011-06-18</date><Requests TYPE="INTEGER">256</Requests></ROW>
<ROW><date TYPE="TIMESTAMP">2011-06-19</date><Requests TYPE="INTEGER">242</Requests></ROW>
<ROW><date TYPE="TIMESTAMP">2011-06-20</date><Requests TYPE="INTEGER">442</Requests></ROW>
<ROW><date TYPE="TIMESTAMP">2011-06-21</date><Requests TYPE="INTEGER">568</Requests></ROW>
<ROW><date TYPE="TIMESTAMP">2011-06-22</date><Requests TYPE="INTEGER">431</Requests></ROW>
<ROW><date TYPE="TIMESTAMP">2011-06-23</date><Requests TYPE="INTEGER">419</Requests></ROW>
<ROW><date TYPE="TIMESTAMP">2011-06-24</date><Requests TYPE="INTEGER">436</Requests></ROW>
<ROW><date TYPE="TIMESTAMP">2011-06-25</date><Requests TYPE="INTEGER">1566</Requests></ROW>
<ROW><date TYPE="TIMESTAMP">2011-06-26</date><Requests TYPE="INTEGER">293</Requests></ROW>
<ROW><date TYPE="TIMESTAMP">2011-06-27</date><Requests TYPE="INTEGER">395</Requests></ROW>
<ROW><date TYPE="TIMESTAMP">2011-06-28</date><Requests TYPE="INTEGER">554</Requests></ROW>
<ROW><date TYPE="TIMESTAMP">2011-06-29</date><Requests TYPE="INTEGER">601</Requests></ROW>
<ROW><date TYPE="TIMESTAMP">2011-06-30</date><Requests TYPE="INTEGER">526</Requests></ROW>
</ROOT>
The only real difference between structure 1 and structure 2 is that a type attribute has been added to each inner element.
Structure 3
logparser -i:w3c -o:xml "SELECT date, COUNT(*) AS Requests INTO _test-3.xml FROM *ex1106*.log GROUP BY date" -structure:3 -compact:on
The above query results in output like the following:
<?xml version="1.0" encoding="ISO-10646-UCS-2" standalone="yes" ?>
<!DOCTYPE ROOT[
<!ATTLIST ROOT DATE_CREATED CDATA #REQUIRED>
<!ATTLIST ROOT CREATED_BY CDATA #REQUIRED>
<!ELEMENT FIELD (#PCDATA)>
<!ATTLIST FIELD NAME CDATA #REQUIRED>
<!ELEMENT ROW (FIELD, FIELD)>
<!ELEMENT ROOT (ROW*)>
]>
<ROOT DATE_CREATED="2011-07-17 20:20:53" CREATED_BY="Microsoft Log Parser V2.2">
<ROW><FIELD NAME="date">2011-06-01</FIELD><FIELD NAME="Requests">484</FIELD></ROW>
<ROW><FIELD NAME="date">2011-06-02</FIELD><FIELD NAME="Requests">529</FIELD></ROW>
<ROW><FIELD NAME="date">2011-06-03</FIELD><FIELD NAME="Requests">400</FIELD></ROW>
<ROW><FIELD NAME="date">2011-06-04</FIELD><FIELD NAME="Requests">113</FIELD></ROW>
<ROW><FIELD NAME="date">2011-06-05</FIELD><FIELD NAME="Requests">178</FIELD></ROW>
<ROW><FIELD NAME="date">2011-06-06</FIELD><FIELD NAME="Requests">422</FIELD></ROW>
<ROW><FIELD NAME="date">2011-06-07</FIELD><FIELD NAME="Requests">504</FIELD></ROW>
<ROW><FIELD NAME="date">2011-06-08</FIELD><FIELD NAME="Requests">417</FIELD></ROW>
<ROW><FIELD NAME="date">2011-06-09</FIELD><FIELD NAME="Requests">490</FIELD></ROW>
<ROW><FIELD NAME="date">2011-06-10</FIELD><FIELD NAME="Requests">429</FIELD></ROW>
<ROW><FIELD NAME="date">2011-06-11</FIELD><FIELD NAME="Requests">130</FIELD></ROW>
<ROW><FIELD NAME="date">2011-06-12</FIELD><FIELD NAME="Requests">160</FIELD></ROW>
<ROW><FIELD NAME="date">2011-06-13</FIELD><FIELD NAME="Requests">280</FIELD></ROW>
<ROW><FIELD NAME="date">2011-06-14</FIELD><FIELD NAME="Requests">407</FIELD></ROW>
<ROW><FIELD NAME="date">2011-06-15</FIELD><FIELD NAME="Requests">474</FIELD></ROW>
<ROW><FIELD NAME="date">2011-06-16</FIELD><FIELD NAME="Requests">463</FIELD></ROW>
<ROW><FIELD NAME="date">2011-06-17</FIELD><FIELD NAME="Requests">424</FIELD></ROW>
<ROW><FIELD NAME="date">2011-06-18</FIELD><FIELD NAME="Requests">256</FIELD></ROW>
<ROW><FIELD NAME="date">2011-06-19</FIELD><FIELD NAME="Requests">242</FIELD></ROW>
<ROW><FIELD NAME="date">2011-06-20</FIELD><FIELD NAME="Requests">442</FIELD></ROW>
<ROW><FIELD NAME="date">2011-06-21</FIELD><FIELD NAME="Requests">568</FIELD></ROW>
<ROW><FIELD NAME="date">2011-06-22</FIELD><FIELD NAME="Requests">431</FIELD></ROW>
<ROW><FIELD NAME="date">2011-06-23</FIELD><FIELD NAME="Requests">419</FIELD></ROW>
<ROW><FIELD NAME="date">2011-06-24</FIELD><FIELD NAME="Requests">436</FIELD></ROW>
<ROW><FIELD NAME="date">2011-06-25</FIELD><FIELD NAME="Requests">1566</FIELD></ROW>
<ROW><FIELD NAME="date">2011-06-26</FIELD><FIELD NAME="Requests">293</FIELD></ROW>
<ROW><FIELD NAME="date">2011-06-27</FIELD><FIELD NAME="Requests">395</FIELD></ROW>
<ROW><FIELD NAME="date">2011-06-28</FIELD><FIELD NAME="Requests">554</FIELD></ROW>
<ROW><FIELD NAME="date">2011-06-29</FIELD><FIELD NAME="Requests">601</FIELD></ROW>
<ROW><FIELD NAME="date">2011-06-30</FIELD><FIELD NAME="Requests">526</FIELD></ROW>
</ROOT>
Now instead of the element’s within the row having the name of the column they have a generic FIELD value, with name attributes containing the actual column name.
(As with ROW and ROOT, FIELD can be changed by passing a fieldName
parameter.)
Structure 4
Finally we have structure 4:
logparser -i:w3c -o:xml "SELECT date, COUNT(*) AS Requests INTO _test-4.xml FROM *ex1106*.log GROUP BY date" -structure:4 -compact:on
With sample output like the following:
<?xml version="1.0" encoding="ISO-10646-UCS-2" standalone="yes" ?>
<!DOCTYPE ROOT[
<!ATTLIST ROOT DATE_CREATED CDATA #REQUIRED>
<!ATTLIST ROOT CREATED_BY CDATA #REQUIRED>
<!ELEMENT FIELD (#PCDATA)>
<!ATTLIST FIELD NAME CDATA #REQUIRED>
<!ATTLIST FIELD TYPE CDATA #REQUIRED>
<!ELEMENT ROW (FIELD, FIELD)>
<!ELEMENT ROOT (ROW*)>
]>
<ROOT DATE_CREATED="2011-07-17 20:20:59" CREATED_BY="Microsoft Log Parser V2.2">
<ROW><FIELD NAME="date" TYPE="TIMESTAMP">2011-06-01</FIELD><FIELD NAME="Requests" TYPE="INTEGER">484</FIELD></ROW>
<ROW><FIELD NAME="date" TYPE="TIMESTAMP">2011-06-02</FIELD><FIELD NAME="Requests" TYPE="INTEGER">529</FIELD></ROW>
<ROW><FIELD NAME="date" TYPE="TIMESTAMP">2011-06-03</FIELD><FIELD NAME="Requests" TYPE="INTEGER">400</FIELD></ROW>
<ROW><FIELD NAME="date" TYPE="TIMESTAMP">2011-06-04</FIELD><FIELD NAME="Requests" TYPE="INTEGER">113</FIELD></ROW>
<ROW><FIELD NAME="date" TYPE="TIMESTAMP">2011-06-05</FIELD><FIELD NAME="Requests" TYPE="INTEGER">178</FIELD></ROW>
<ROW><FIELD NAME="date" TYPE="TIMESTAMP">2011-06-06</FIELD><FIELD NAME="Requests" TYPE="INTEGER">422</FIELD></ROW>
<ROW><FIELD NAME="date" TYPE="TIMESTAMP">2011-06-07</FIELD><FIELD NAME="Requests" TYPE="INTEGER">504</FIELD></ROW>
<ROW><FIELD NAME="date" TYPE="TIMESTAMP">2011-06-08</FIELD><FIELD NAME="Requests" TYPE="INTEGER">417</FIELD></ROW>
<ROW><FIELD NAME="date" TYPE="TIMESTAMP">2011-06-09</FIELD><FIELD NAME="Requests" TYPE="INTEGER">490</FIELD></ROW>
<ROW><FIELD NAME="date" TYPE="TIMESTAMP">2011-06-10</FIELD><FIELD NAME="Requests" TYPE="INTEGER">429</FIELD></ROW>
<ROW><FIELD NAME="date" TYPE="TIMESTAMP">2011-06-11</FIELD><FIELD NAME="Requests" TYPE="INTEGER">130</FIELD></ROW>
<ROW><FIELD NAME="date" TYPE="TIMESTAMP">2011-06-12</FIELD><FIELD NAME="Requests" TYPE="INTEGER">160</FIELD></ROW>
<ROW><FIELD NAME="date" TYPE="TIMESTAMP">2011-06-13</FIELD><FIELD NAME="Requests" TYPE="INTEGER">280</FIELD></ROW>
<ROW><FIELD NAME="date" TYPE="TIMESTAMP">2011-06-14</FIELD><FIELD NAME="Requests" TYPE="INTEGER">407</FIELD></ROW>
<ROW><FIELD NAME="date" TYPE="TIMESTAMP">2011-06-15</FIELD><FIELD NAME="Requests" TYPE="INTEGER">474</FIELD></ROW>
<ROW><FIELD NAME="date" TYPE="TIMESTAMP">2011-06-16</FIELD><FIELD NAME="Requests" TYPE="INTEGER">463</FIELD></ROW>
<ROW><FIELD NAME="date" TYPE="TIMESTAMP">2011-06-17</FIELD><FIELD NAME="Requests" TYPE="INTEGER">424</FIELD></ROW>
<ROW><FIELD NAME="date" TYPE="TIMESTAMP">2011-06-18</FIELD><FIELD NAME="Requests" TYPE="INTEGER">256</FIELD></ROW>
<ROW><FIELD NAME="date" TYPE="TIMESTAMP">2011-06-19</FIELD><FIELD NAME="Requests" TYPE="INTEGER">242</FIELD></ROW>
<ROW><FIELD NAME="date" TYPE="TIMESTAMP">2011-06-20</FIELD><FIELD NAME="Requests" TYPE="INTEGER">442</FIELD></ROW>
<ROW><FIELD NAME="date" TYPE="TIMESTAMP">2011-06-21</FIELD><FIELD NAME="Requests" TYPE="INTEGER">568</FIELD></ROW>
<ROW><FIELD NAME="date" TYPE="TIMESTAMP">2011-06-22</FIELD><FIELD NAME="Requests" TYPE="INTEGER">431</FIELD></ROW>
<ROW><FIELD NAME="date" TYPE="TIMESTAMP">2011-06-23</FIELD><FIELD NAME="Requests" TYPE="INTEGER">419</FIELD></ROW>
<ROW><FIELD NAME="date" TYPE="TIMESTAMP">2011-06-24</FIELD><FIELD NAME="Requests" TYPE="INTEGER">436</FIELD></ROW>
<ROW><FIELD NAME="date" TYPE="TIMESTAMP">2011-06-25</FIELD><FIELD NAME="Requests" TYPE="INTEGER">1566</FIELD></ROW>
<ROW><FIELD NAME="date" TYPE="TIMESTAMP">2011-06-26</FIELD><FIELD NAME="Requests" TYPE="INTEGER">293</FIELD></ROW>
<ROW><FIELD NAME="date" TYPE="TIMESTAMP">2011-06-27</FIELD><FIELD NAME="Requests" TYPE="INTEGER">395</FIELD></ROW>
<ROW><FIELD NAME="date" TYPE="TIMESTAMP">2011-06-28</FIELD><FIELD NAME="Requests" TYPE="INTEGER">554</FIELD></ROW>
<ROW><FIELD NAME="date" TYPE="TIMESTAMP">2011-06-29</FIELD><FIELD NAME="Requests" TYPE="INTEGER">601</FIELD></ROW>
<ROW><FIELD NAME="date" TYPE="TIMESTAMP">2011-06-30</FIELD><FIELD NAME="Requests" TYPE="INTEGER">526</FIELD></ROW>
</ROOT>
Structure 4 is basically a combination of structures 2 and 3, where the type of the value and name are moved into attributes.
Choosing a structure
Generally I find the default structure, 1, to be the best for most cases. However, if the file that will be parsing the XML will be changing the output based upon the type of data, structure 2 or 4 is a good choice, as it is far better to parse a defined TYPE attribute then trying to parse the actual value to guess the type.
I personally find structure 3 to be more difficult to parse, so I’d stick with either the default or type 2.
Differences wrap up
To conclude, the difference between the default structure and the other three are as follows:
Structure 2: A TYPE attribute is added to each element within the ROW element.
Structure 3: Each ROW element has a generic FIELD element with the name of the data stored in a new NAME attribute.
Structure 4: The combination of structures 2 and 3.
Article provided by James Skemp.