Parsing Nmap's XML output with XMLStarlet
Nmap (Network Mapper) is a free and open source utility for network exploration or security auditing (http://nmap.org). Nmap can report very detailed scan results in XML. Unfortunately, XML does not play nicely with line-based tools, which makes it difficult to get at Nmap's details from a command line or shell script.
After trying nmap's -oG option you may have read the nmap manual and wondered what a "nearly as convenient" method of using XML might look like. The following is as near as I've come to convenience.
Start by saving scan results in XML:
nmap -PN -n -A -sU -sS -sV -oX scan_results.xml 192.168.1.1-100
We're going to use XPath to define and query elements of interest, so we'll need to be comfortable with the structure in scan_results.xml. I use the Firebug and FirePath extensions to Firefox to help expose the XML structure, identify XPaths, and verify the sanity of the XPaths I'm interested in. But as long as you can see a hierarchy and distinguish attribute names from values, you're good to go. Let's go.
Install xmlstarlet (http://xmlstar.sourceforge.net). xmlstarlet is a general purpose, pipe-friendly command line tool for XML. We're going to focus on its selection capabilities to query our scan's output. Using XPaths, we'll specify our elements of interest and report the values of attributes relative to those elements.
Given this simplified (but valid enough) result from scan_results.xml:
    <nmaprun>
      <host>
        <address addr="192.168.1.1" addrtype="ipv4"/>
        <os>
          <osmatch name="Microsoft Windows" />
        </os>
      </host>
      <host>
        <address addr="192.168.1.2" addrtype="ipv4"/>
        <os>
          <osmatch name="Linux 2.4" />
          <osmatch name="Linux 2.6" />
        </os>
      </host>
    </nmaprun>
..let's request "host addresses and matching OS names":
    xmlstarlet sel -t -m "//host/os/osmatch" \
      -v "concat(ancestor::host/address[@addrtype='ipv4']/@addr,' ',@name)" \
      -n scan_results.xml
    192.168.1.1 Microsoft Windows
    192.168.1.2 Linux 2.4
    192.168.1.2 Linux 2.6
xmlstarlet sel -t is the prelude to a selection: sel picks the select subcommand and -t starts a template. -m (match) specifies the XPath to what I consider "pivot points" or focal elements. -v (value-of) "concat(..)" provides XPaths from each pivot point to the values that make up a record. -n appends a newline after each record.
I'm using //host/os/osmatch as the pivot from which to query higher-level ancestor elements (the corresponding host in ancestor::host/address[@addrtype='ipv4']/@addr) since I haven't found a nice way to combine values of multiple children in a single line record. The result is a map that can be managed using grep, awk, and friends.
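As a small illustration of the "grep, awk, and friends" step, here is a minimal sketch that collapses the records into one line per host. The records are pasted from the output above so the snippet runs without a scan file; group_by_host is a hypothetical helper name, and the sketch assumes the address is always the first space-separated field.

```shell
#!/bin/sh
# Sample records copied from the xmlstarlet output above: "<address> <OS name>".
records='192.168.1.1 Microsoft Windows
192.168.1.2 Linux 2.4
192.168.1.2 Linux 2.6'

# group_by_host (hypothetical helper): print one line per address, joining
# that host's OS names with "; " and keeping first-seen order.
group_by_host() {
  awk '{
    addr = $1
    name = $0
    sub(/^[^ ]+ /, "", name)            # drop the address, keep the OS name
    if (addr in names) {
      names[addr] = names[addr] "; " name
    } else {
      order[++n] = addr
      names[addr] = name
    }
  }
  END {
    for (i = 1; i <= n; i++) print order[i] ": " names[order[i]]
  }'
}

printf '%s\n' "$records" | group_by_host
```

In practice the printf would be replaced by the xmlstarlet command itself, piped straight into the helper.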
xmlstarlet supports a number of EXSLT extension functions (http://exslt.org), for example the string functions in the str namespace (xmlns:str="http://exslt.org/strings"):
    xmlstarlet sel -t -m "//host/ports/port/script" \
      -v "concat(ancestor::host/address[@addrtype='ipv4']/@addr,' ', \
      ../@protocol,' ',../@portid,' ',../service/@name,' ', \
      @id,' ',str:replace(@output,'&#10;',' '))" \
      -n scan_results.xml
    192.168.1.5 tcp 5900 vnc realvnc-auth-bypass Vulnerable
    192.168.1.6 tcp 22 ssh sshv1 Server supports SSHv1
    192.168.1.7 tcp 7778 http html-title Oracle HTTP Server Index
Here, str:replace() swaps the newlines in NSE script output for spaces, so each record stays on a single line.
xmlstarlet does not support regular expressions out of the box, but since the point was to make XML grep-able, we can leave the regexing to someone else.
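For instance, once the records are flat lines, grep -E can do the pattern matching. The following is only a sketch: the rows are pasted from the sample output above, hosts_matching is a hypothetical helper name, and it assumes the address is the first space-separated field.

```shell
#!/bin/sh
# Rows copied from the sample output above:
# address, protocol, port, service, script id, script output.
rows='192.168.1.5 tcp 5900 vnc realvnc-auth-bypass Vulnerable
192.168.1.6 tcp 22 ssh sshv1 Server supports SSHv1
192.168.1.7 tcp 7778 http html-title Oracle HTTP Server Index'

# hosts_matching (hypothetical helper): print the address of every row that
# matches the extended regular expression given as the first argument.
hosts_matching() {
  grep -E -- "$1" | cut -d' ' -f1
}

# Hosts whose script output mentions "Vulnerable" or ends in "SSHv1".
printf '%s\n' "$rows" | hosts_matching 'Vulnerable|SSHv1$'
```

With the sample rows this prints 192.168.1.5 and 192.168.1.6.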
A note on xml2..
xml2 (http://ofb.net/~egnor/xml2/) is my go-to tool for XML orientation on the command line. xml2 decomposes a tree into a list of node paths; the result of xml2 < scan_results.xml is:
    /nmaprun/host/address/@addr=192.168.1.1
    /nmaprun/host/address/@addrtype=ipv4
    /nmaprun/host/os/osmatch/@name=Microsoft Windows
    /nmaprun/host
    /nmaprun/host/address/@addr=192.168.1.2
    /nmaprun/host/address/@addrtype=ipv4
    /nmaprun/host/os/osmatch/@name=Linux 2.4
    /nmaprun/host/os/osmatch
    /nmaprun/host/os/osmatch/@name=Linux 2.6
Host addresses and matching OS names can't be mapped using a straightforward grep because the address and os nodes are siblings. This is what prompted the search for XPath assistance that led to xmlstarlet.
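To make the sibling problem concrete: grep sees each line in isolation, so pairing an OS name with its address means carrying state from one line to the next. A rough awk sketch of that idea follows. It is not something xml2 provides; join_addr_os is a hypothetical helper name, and the sketch assumes that each host has a single address, that the address line precedes the osmatch lines, and that no value contains an "=".

```shell
#!/bin/sh
# xml2 output copied from the listing above.
xml2_lines='/nmaprun/host/address/@addr=192.168.1.1
/nmaprun/host/address/@addrtype=ipv4
/nmaprun/host/os/osmatch/@name=Microsoft Windows
/nmaprun/host
/nmaprun/host/address/@addr=192.168.1.2
/nmaprun/host/address/@addrtype=ipv4
/nmaprun/host/os/osmatch/@name=Linux 2.4
/nmaprun/host/os/osmatch
/nmaprun/host/os/osmatch/@name=Linux 2.6'

# join_addr_os (hypothetical helper): remember the most recent address and
# print it next to every OS name that follows it.
join_addr_os() {
  awk -F= '
    $1 == "/nmaprun/host/address/@addr"    { addr = $2 }
    $1 == "/nmaprun/host/os/osmatch/@name" { print addr, $2 }
  '
}

printf '%s\n' "$xml2_lines" | join_addr_os
```

This works on the simplified sample, but filtering on addrtype is already awkward because that line arrives after the address it describes. The XPath approach avoids these ordering assumptions.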