Wordpress Versioning: Part 1

Posted by Bagpuss on February 28, 2011
Tags: honeynet, digital forensics, data visualisation, wordpress

During a recent attempt at answering the Honeynet Log Mysteries Challenge, I wrote a series of reasoned analyses for the supplied Honeynet logging data. Unfortunately, teaching workloads stopped me from submitting any realistic challenge answer.

Inspired by the idea of applying the Scientific Method to Digital Forensics (see Casey2009 and Carrier2006) and using data visualisation (see Conti2007 and Marty2008), I set about attempting to apply the same principles to analysing the Log Mysteries data sets.

In this article, we shall only consider logging events present within sanitized_log/apache2/www-*.log. Extracting all the URLs from our Apache2 logging events allows us to build the following path-prefix tree:

From this URL path-prefix tree, we may make the following observations:

URL Path                Volume of all Traffic   Observation
/feed/                  58.0%                   RSS news feed(?)
/wp-content/            32.2%                   Wordpress(?)
/wp-content/plugins/    10.2%                   Wordpress plugins installed(?): Contact Form 7; Google Analyticator; and Google Syntax Highlighter
/wp-includes/           1.1%                    Wordpress(?)
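
The per-prefix traffic volumes in the table above can be tallied in a few lines. The following is a minimal Python sketch rather than the Rails application used for the analysis; the function names prefix_counts and prefix_shares are hypothetical, and it assumes the request URLs have already been extracted from the log lines. A final path segment without a trailing slash is assumed to be a file rather than a directory.

```python
from collections import Counter
from urllib.parse import urlsplit


def prefix_counts(urls):
    """Count how many requests fall under each directory prefix of the URL path."""
    counts = Counter()
    for url in urls:
        path = urlsplit(url).path
        segments = [s for s in path.split("/") if s]
        # Assumption: a final segment with no trailing slash names a file.
        if not path.endswith("/"):
            segments = segments[:-1]
        prefix = "/"
        counts[prefix] += 1  # every request counts towards the root
        for segment in segments:
            prefix += segment + "/"
            counts[prefix] += 1
    return counts


def prefix_shares(urls):
    """Express each prefix count as a percentage of all requests."""
    counts = prefix_counts(urls)
    total = counts["/"]
    return {prefix: 100.0 * n / total for prefix, n in counts.items()}
```

Because each request is counted once for every prefix it passes through, a nested prefix such as /wp-content/plugins/ is also included in the figure for /wp-content/, so the percentages are not expected to sum to 100.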

In reviewing the Wordpress tagged URLs, we notice that a number of them have what looks like a version parameter (i.e. ver). It is tempting to think that this observation might reveal the version of Wordpress and its associated plugins. By tagging all URLs containing a ver parameter, we can analyse the association between each URL and the assumed version number with the following sunburst graph:

From this graph, we can make the following observations:

Wordpress Component     Version
JQuery                  1.3.2(?)
JQuery Form             2.02(?)
Google Analyticator     6.0.2(?)
Contact Form 7          2.1.1(?)
Wordpress               2.9.2(?)
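
The tagging step behind the table above amounts to grouping ver values by URL path. Here is a minimal Python sketch of that idea, again not the original Rails code; versions_by_path is a hypothetical name, and the URLs in the usage example are illustrative rather than taken from the logs.

```python
from collections import Counter, defaultdict
from urllib.parse import parse_qs, urlsplit


def versions_by_path(urls):
    """Tally the ver query values seen for each URL path."""
    tally = defaultdict(Counter)
    for url in urls:
        parts = urlsplit(url)
        # parse_qs returns a list of values for each parameter name.
        for value in parse_qs(parts.query).get("ver", []):
            tally[parts.path][value] += 1
    return tally


# Illustrative input only; these URLs are assumptions, not log extracts.
example = [
    "/wp-includes/js/jquery/jquery.js?ver=1.3.2",
    "/wp-includes/js/jquery/jquery.js?ver=1.3.2",
    "/feed/",
]
for path, versions in versions_by_path(example).items():
    print(path, dict(versions))
```

A path that shows more than one ver value would be an early hint that the parameter alone cannot settle which version was installed.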

Beyond demonstrating an association between the version parameter and a number of URLs, we have yet to prove anything here (i.e. we have no rigorous arguments to back our suspicion that the above version numbers are correct).

When searching the Wordpress web site, we can often derive unique-looking download candidates for each Wordpress plugin. However, for the Google Syntax Highlighter plugin, we discover that there are multiple candidates available for download (e.g. Google Syntax Highlighter and Easy Google Syntax Highlighter). How can we determine which one was installed?

Our Apache2 logging events reveal that for each relevant URL, we have both the URL path and the response body size available for matching purposes. By downloading each candidate plugin archive, we should then be able to determine which plugin is the better match for the available matching data. For version 1.5.1 of the Google Syntax Highlighter plugin we can generate the following match comparison graph:

And for version 1.2.1 of the Easy Google Syntax Highlighter plugin we can generate the following match comparison graph:

Inspection of these signature graphs clearly demonstrates that version 1.5.1 of the Google Syntax Highlighter plugin (released: 14/08/2007) is a better match than version 1.2.1 of the Easy Google Syntax Highlighter plugin (released: 22/10/2009). However, we have yet to demonstrate rigorously that there are no other equally consistent alternatives!
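
The comparison behind these signature graphs can be approximated by checking each logged (path, response size) pair against the file listing of a candidate archive. The Python sketch below is one way this might be done and is not the method used to draw the graphs; manifest_from_zip and match_score are hypothetical names. It assumes the logged URL paths have already been rewritten to be relative to the archive root, and that the logged size is the full, uncompressed response body (which would not hold for 304 or compressed responses).

```python
import zipfile


def manifest_from_zip(archive):
    """Map each file in a plugin archive to its uncompressed size in bytes."""
    with zipfile.ZipFile(archive) as zf:
        return {
            info.filename: info.file_size
            for info in zf.infolist()
            if not info.is_dir()
        }


def match_score(observed, candidate_files):
    """Compare logged (path, size) pairs with a candidate's file manifest.

    observed: iterable of (relative_path, response_size) pairs from the logs.
    candidate_files: dict of relative_path -> file size, e.g. from manifest_from_zip.
    """
    seen = set(observed)  # ignore repeated requests for the same pair
    path_hits = size_hits = 0
    for path, size in seen:
        if path in candidate_files:
            path_hits += 1
            if candidate_files[path] == size:
                size_hits += 1
    return {"observed": len(seen), "path_hits": path_hits, "size_hits": size_hits}
```

Running match_score once per candidate archive gives a rough ranking: the candidate with more path and size hits is the better match, although a high score still does not rule out other equally consistent candidates.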

In our next blog article we shall look at how we can accurately estimate a version for Wordpress and its plugins using statistics.

Tools Used

Rails 3 used to model our data (see GitHub project for Rails application used in analysis)
JGR to initially explore and visualise data
Protovis 3.2 used to plot graphs in Rails application.

Comments


Bagpuss said on Wednesday, March 23, 2011:

It is also worth noting that we cannot trust the contents of the ver parameter, as these values are set by the client and so can easily be forged.