Wifi Network Scanning Results Analysis

Analyzing cumulative Nmap network scanning results, as documented in the previous post, reminded me that I've collected a sizable amount of data on 802.11 wireless networks over the past six months. I've used a free Android app, Wigle Wifi, to enumerate WiFi networks using my phone as I travel around Austin - by foot, bike, car, bus, and train.

I classify Wigle Wifi as an 'active stumbler' tool - i.e., the program sends 802.11 broadcast probe requests, and listens for the responses; it also processes broadcast beacons. This differentiates it from a 'passive scanner'. I came to this conclusion by capturing and analyzing Wigle's active network traffic using tcpdump, and cross-referencing it with the tool's output.

Helpfully, Wigle logs cumulative results directly to a SQLite database, in addition to having the capability to export individual scan results to KML output files, which can be imported into Google Earth or similar applications for viewing.

The database tables and the schema for the relevant 'network' table are shown below:

Again, a few simple queries present a good summary of the data I've collected -- the total number of networks discovered (over 20,000), the number of unique network names (13,282), the most common network names (mostly default manufacturer names, but also several referring to nearby educational institutions), and the most common radio frequency utilized (2437 MHz - aka channel 6).
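
For readers who want to reproduce this kind of summary, here is a minimal sketch in Python using the standard sqlite3 module. The table and column names (network, bssid, ssid, frequency, capabilities) are my assumption about the layout, and the rows are made-up placeholders rather than my data; adjust both to match the actual database.

```python
import sqlite3

def channel_from_frequency(mhz):
    """Map a 2.4 GHz centre frequency in MHz to its 802.11 channel number."""
    if mhz == 2484:
        return 14
    if 2412 <= mhz <= 2472 and (mhz - 2407) % 5 == 0:
        return (mhz - 2407) // 5
    return None  # frequencies outside the 2.4 GHz band are not handled here

def summarize(conn):
    """Return total networks, unique names, top names, and the top frequency."""
    cur = conn.cursor()
    total = cur.execute("SELECT COUNT(*) FROM network").fetchone()[0]
    unique_names = cur.execute(
        "SELECT COUNT(DISTINCT ssid) FROM network").fetchone()[0]
    top_names = cur.execute(
        "SELECT ssid, COUNT(*) AS n FROM network "
        "GROUP BY ssid ORDER BY n DESC, ssid LIMIT 5").fetchall()
    top_freq = cur.execute(
        "SELECT frequency, COUNT(*) AS n FROM network "
        "GROUP BY frequency ORDER BY n DESC LIMIT 1").fetchone()
    return total, unique_names, top_names, top_freq

# Placeholder rows in an assumed schema, only to make the sketch runnable.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE network (bssid TEXT PRIMARY KEY, ssid TEXT, "
             "frequency INTEGER, capabilities TEXT)")
conn.executemany("INSERT INTO network VALUES (?, ?, ?, ?)", [
    ("00:00:00:00:00:01", "linksys", 2437, "[WEP]"),
    ("00:00:00:00:00:02", "linksys", 2437, ""),
    ("00:00:00:00:00:03", "example-net", 2412, "[WPA2-PSK-CCMP]"),
    ("00:00:00:00:00:04", "NETGEAR", 2462, "[WPA-PSK-TKIP]"),
])
total, unique_names, top_names, top_freq = summarize(conn)
print(total, unique_names, top_names[0], channel_from_frequency(top_freq[0]))
```

To run it against a real database, replace ":memory:" with the path to the SQLite file and drop the placeholder inserts.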

I was more interested to learn about the security protections implemented across this wide base of networks, so I enumerated all of the 'capabilities' advertised by at least a single network. The results indicate a wide variety of formats for specifying security implementations, but generally they can be broken into four categories: no security, WEP encryption, WPA pre-shared key-based encryption, and WPA enterprise encryption.

I ran an additional query to identify the most commonly advertised security capabilities. I was disappointed to see that the top two results were WEP and '' (none). However, the following six entries were variations of WPA encryption, so I decided to do a more general comparison: a count of WPA-enabled networks versus non-WPA-enabled (WEP or no security) networks.

These results more closely reflect the reality of the situation - a slight preference for WPA-based encryption in all forms (WPA, WPA2, PSK-based, EAP-based, etc.), with 56% of the total networks implementing some variation. These results are slightly more positive than those of a WiFi security survey of financial districts conducted by AirTight Networks in 2009, which identified 43% of networks utilizing some form of WPA.
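
The grouping into four categories, and the WPA versus non-WPA comparison, can be sketched roughly as follows. The capability strings are illustrative examples of the bracketed format, not entries copied from my database, and the substring matching is a simplifying assumption that may misfile unusual strings.

```python
from collections import Counter

def classify(capabilities):
    """Bucket an advertised capabilities string into one of four categories."""
    caps = capabilities.upper()
    if "WPA" in caps:
        # EAP indicates enterprise (802.1X) authentication; otherwise assume PSK.
        return "WPA-Enterprise" if "EAP" in caps else "WPA-PSK"
    if "WEP" in caps:
        return "WEP"
    return "none"

# Illustrative capability strings (hypothetical, for demonstration only).
samples = [
    "[WPA2-PSK-CCMP][ESS]",
    "[WPA-PSK-TKIP]",
    "[WPA2-EAP-CCMP]",
    "[WEP]",
    "",
]

counts = Counter(classify(s) for s in samples)
wpa_total = counts["WPA-PSK"] + counts["WPA-Enterprise"]
wpa_share = wpa_total / len(samples)
print(counts, f"WPA share: {wpa_share:.0%}")
```

Feeding the function every value of the capabilities column, instead of the sample list, would give the same kind of WPA versus non-WPA split described above.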

Granted, in many instances WPA-protected networks are still vulnerable to a variety of attacks. However, properly implemented WPA-based encryption significantly raises the bar for attacks against WiFi networks, so this is a positive trend to observe.


Network Scanning Results Analysis

I recently decided to take a closer look at the moderately large volume of network scanning results that I've collected over the past few years of penetration testing work.

The impetus for this research was recalling Fyodor's undertaking in 2008 to scan 25 million IP addresses using Nmap, and analyze the results (which he presented at Black Hat (pdf)). The purpose of his scanning activity was to identify the most commonly open TCP and UDP ports on the Internet, in order to improve the efficiency of Nmap's default port scanning configuration. Fyodor's findings on the most frequently open TCP ports are shown below:

Since then, the 'nmap-services' file included in the Nmap distribution has included frequency data for each port - indicating the percentage of the scans in which the port was found to be open.
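
As a rough illustration of how that frequency data can be read, the sketch below parses lines in the nmap-services layout (service name, port/protocol, open frequency, optional comment) and ranks ports by frequency. The sample lines and their frequency values are made up for the example; point the parser at the real file to get the actual figures.

```python
def parse_nmap_services(lines):
    """Parse nmap-services style lines into (name, port, proto, frequency)."""
    entries = []
    for line in lines:
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        fields = line.split()
        if len(fields) < 3:
            continue  # no frequency column on this line
        name, portproto, freq = fields[0], fields[1], fields[2]
        port, proto = portproto.split("/")
        entries.append((name, int(port), proto, float(freq)))
    return entries

def top_ports(entries, proto, n):
    """Return the n most frequently open ports for one transport protocol."""
    matching = [e for e in entries if e[2] == proto]
    return sorted(matching, key=lambda e: e[3], reverse=True)[:n]

# Illustrative lines only; the frequencies here are placeholders.
SAMPLE = [
    "# service name, port/protocol, open-frequency, optional comments",
    "http\t80/tcp\t0.48\t# placeholder frequency",
    "telnet\t23/tcp\t0.22",
    "snmp\t161/udp\t0.43",
    "https\t443/tcp\t0.21",
]

entries = parse_nmap_services(SAMPLE)
for name, port, proto, freq in top_ports(entries, "tcp", 2):
    print(f"{port}/{proto} {name} {freq}")
```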

I thought it would be interesting to similarly identify the most commonly open ports in the Nmap scanning results I'd generated over the past few years, for two reasons: to identify unique characteristics of the customer base for which I provided services; and to identify possible trends in the availability of ports since Fyodor's research in 2008. Note that these scanning results were generated from entirely external (outsider perspective) assessments.

Since Nmap does not include built-in integration with databases, I began by investigating available tools to import Nmap results data into a database. I learned about Nmap XML2SQL, from Redspin, which uses Perl's Database Interface and Nmap Parser modules, and stores data in SQLite. I also tried out PBNJ, but encountered errors in the data import process (likely due to variations in Nmap result files produced by different versions). I considered writing a custom script using Perl's Nmap Parser module, but could not invest enough time in the project. Then I discovered rubynmapsqlite, which builds on the XML2SQL database schema, uses the rubynmap parsing library, and stores data in SQLite. A few test runs confirmed that this tool suited my purposes well, so I created a very simple Bash shell script to locate every Nmap XML data file within the appropriate directory, and iteratively import them into a single database. (Fortunately, I'd always used the '-oA' option with Nmap to output results in all available formats, so each scan produced an XML output file.)
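
For anyone without rubynmapsqlite at hand, the same locate-and-import loop can be approximated with Python's standard library. To be clear, this is a stand-in sketch, not my Bash script and not the rubynmapsqlite schema: the ports table layout and the function names are hypothetical. It relies only on the host, address, port, and state elements of Nmap's XML output.

```python
import os
import sqlite3
import xml.etree.ElementTree as ET

def find_xml_files(root_dir):
    """Yield the path of every .xml file beneath root_dir."""
    for dirpath, _dirs, files in os.walk(root_dir):
        for name in sorted(files):
            if name.endswith(".xml"):
                yield os.path.join(dirpath, name)

def import_scan(conn, root):
    """Insert one row per <port> element of a parsed <nmaprun> tree."""
    version = root.get("version")  # Nmap version that produced the file
    rows = []
    for host in root.iter("host"):
        addr = host.find("address")  # first address; usually the IP
        if addr is None:
            continue
        for port in host.iter("port"):
            state = port.find("state")
            rows.append((addr.get("addr"), int(port.get("portid")),
                         port.get("protocol"),
                         state.get("state") if state is not None else None,
                         version))
    conn.executemany("INSERT INTO ports VALUES (?, ?, ?, ?, ?)", rows)
    return len(rows)

def import_tree(conn, root_dir):
    """Import every Nmap XML file under root_dir into one database."""
    return sum(import_scan(conn, ET.parse(path).getroot())
               for path in find_xml_files(root_dir))

# Small inline document so the sketch runs without any scan files.
SAMPLE_XML = """<nmaprun scanner="nmap" version="5.00">
  <host><address addr="192.0.2.10" addrtype="ipv4"/>
    <ports>
      <port protocol="tcp" portid="80"><state state="open"/></port>
      <port protocol="tcp" portid="22"><state state="closed"/></port>
    </ports>
  </host>
</nmaprun>"""

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ports (ip TEXT, port INTEGER, protocol TEXT, "
             "state TEXT, nmap_version TEXT)")
imported = import_scan(conn, ET.fromstring(SAMPLE_XML))
open_rows = conn.execute(
    "SELECT ip, port, protocol FROM ports WHERE state = 'open'").fetchall()
print(imported, open_rows)
```

Calling import_tree(conn, "/path/to/scans") with a real directory would walk it the way the shell script did. Storing the protocol attribute alongside each port keeps TCP and UDP results distinguishable.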

The resulting database tables and the schema for the relevant 'ports' table are shown below:

A few simple queries helped paint a picture of the versions of Nmap I'd used over the past three years (quite mixed, 5.00 most commonly), the total number of unique IP addresses scanned (over 470,000), the total number of open ports located (over 48,000), the number of unique open ports found (12,216), and the number of unique hosts with open ports found (7,352).

These results indicate that roughly 1.5% of scanned IP addresses belonged to a live host with at least one externally accessible service. On average, each of these live hosts had approximately six services accessible (though it later became apparent that this figure was significantly skewed by several proxy servers, with several thousand open ports each).
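
The arithmetic behind those two figures is simple enough to check directly. Note that the inputs below are the rounded "over N" totals quoted above, so the outputs are approximations rather than exact values.

```python
# Rounded totals quoted in the text ("over 470,000", "over 48,000").
scanned_ips = 470_000
open_port_records = 48_000
hosts_with_open_ports = 7_352

# Share of scanned addresses that exposed at least one service.
live_fraction = hosts_with_open_ports / scanned_ips

# Mean number of open ports per host that had any open port.
services_per_host = open_port_records / hosts_with_open_ports

print(f"live hosts: {live_fraction:.2%}")
print(f"open ports per live host: {services_per_host:.1f}")
```

Because a handful of proxy servers contributed thousands of open ports each, the median would be a better measure of a typical host than this mean.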

Next I turned my attention to identifying the most commonly open ports. Before I post the results, a few caveats:
  • Services protected with TCP wrappers are generally not publicly accessible, but are identified as 'open' in default Nmap scans using the '-sS' (TCP SYN scan) option. I use this option in the vast majority of scans, mainly for performance reasons. Additionally, when the '-sV' (service version detection) option was enabled, TCP-wrapped services were inserted twice into the SQLite database. I attempted to remove all duplicate entries, but retained the TCP-wrapped service entries, choosing to annotate the affected ports in the results list.
  • Transport layer protocols were not inserted into the SQLite database, for reasons unclear. Therefore, port numbers are not differentiated between TCP and UDP. In most cases, the correct transport protocol will be apparent (e.g., TCP for port 80, UDP for port 161), but in some cases it is indeterminate (e.g., port 53).
  • I did not run UDP port scans as frequently or consistently as the default TCP port scans over the past several years, partly due to time constraints and partly because UDP port scanning with Nmap prior to version 5.20 had severely limited accuracy. Alternative, service-specific tools were generally used to test individual UDP services, as needed. Therefore, UDP port counts are likely underrepresented in the results.
  • All 65,536 TCP ports were scanned in each assessment.
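
One possible way to strip the duplicate entries mentioned in the first caveat is to keep only the earliest row for each host and port pair. This is a sketch against a hypothetical ports table (ip, port, service), not necessarily the cleanup I performed. Since the protocol was not stored, it would also merge a TCP and a UDP entry for the same port number on the same host.

```python
import sqlite3

def remove_duplicates(conn):
    """Delete all but the first row for each (ip, port); return rows removed."""
    before = conn.execute("SELECT COUNT(*) FROM ports").fetchone()[0]
    conn.execute(
        "DELETE FROM ports WHERE rowid NOT IN "
        "(SELECT MIN(rowid) FROM ports GROUP BY ip, port)")
    conn.commit()
    after = conn.execute("SELECT COUNT(*) FROM ports").fetchone()[0]
    return before - after

# Placeholder rows; the tcpwrapped entry appears twice, as described above.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ports (ip TEXT, port INTEGER, service TEXT)")
conn.executemany("INSERT INTO ports VALUES (?, ?, ?)", [
    ("192.0.2.1", 80, "http"),
    ("192.0.2.1", 111, "tcpwrapped"),
    ("192.0.2.1", 111, "tcpwrapped"),
    ("192.0.2.2", 80, "http"),
])
removed = remove_duplicates(conn)
remaining = conn.execute("SELECT COUNT(*) FROM ports").fetchone()[0]
print(removed, remaining)
```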

Without further ado, the top 20 most commonly open ports from my dataset:
(21 ports are listed, since one of the entries is a TCP-wrapped service)
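
A ranking like this can be produced with a single grouped query. The sketch below counts distinct hosts per port and flags ports where any entry was reported as tcpwrapped. The table layout (ip, port, service) and the rows are hypothetical placeholders, not my dataset.

```python
import sqlite3

def top_open_ports(conn, n=20):
    """Return (port, host_count, wrapped_flag) for the n most common ports."""
    return conn.execute(
        "SELECT port, COUNT(DISTINCT ip) AS hosts, "
        "       MAX(service = 'tcpwrapped') AS wrapped "
        "FROM ports GROUP BY port "
        "ORDER BY hosts DESC, port ASC LIMIT ?", (n,)).fetchall()

# Placeholder rows, only to make the query runnable.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ports (ip TEXT, port INTEGER, service TEXT)")
conn.executemany("INSERT INTO ports VALUES (?, ?, ?)", [
    ("192.0.2.1", 80, "http"),
    ("192.0.2.2", 80, "http"),
    ("192.0.2.3", 80, "http"),
    ("192.0.2.1", 443, "https"),
    ("192.0.2.2", 443, "https"),
    ("192.0.2.3", 111, "tcpwrapped"),
])
ranking = top_open_ports(conn, n=3)
for port, hosts, wrapped in ranking:
    note = " (tcpwrapped)" if wrapped else ""
    print(f"{port}: {hosts} hosts{note}")
```

Counting distinct hosts rather than rows keeps any leftover duplicate entries from inflating a port's rank.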

A comparison of my results (on the right) to Fyodor's 2008 results (left):
(Column E in my result set on the right indicates the "ranking" of the associated port in Fyodor's 2008 results - I included this information for ports that made a significant jump)

Many of the top ports are closely aligned between the lists - HTTP, HTTPS, Telnet, DNS, SMTP. More interesting may be the differences. Firstly, UDP ports are much more prominent in Fyodor's results. I believe this is due in part to less frequently performed UDP scans on my part, but also to the limited accuracy in detecting open UDP ports in older versions of Nmap (i.e., I believe a portion of Fyodor's results may be false positives).

Other differences that stood out: Microsoft RPC- and SMB-related TCP ports 135, 139, and 445 are much less prevalent in my results. I believe this is likely due to improved perimeter filtering for these historically insecure services. HP printer-related ports are much more common in my results, likely due to increased deployment of multi-function printing devices with embedded web servers. Similarly, TCP port 1720, associated with H.323 conferencing services, and TCP port 62078, iTunes sync, are much more frequently identified in my results. Overall, however, HTTP and HTTPS are by far the most common services, and historically unencrypted services such as Telnet and FTP remain widely used.