There's a simple contributed tool available that parses the Shibboleth 2.x IdP's audit log files and outputs a few statistics. While future releases of the IdP might offer this functionality out of the box, currently you'd either have to write such a tool yourself or use this script.
- Processes one or several audit log files, combining the input
- Transparently works with compressed log files (gzip and bzip2 supported) when used with Python 2.5 or higher
- Works as a filter on the command line
- Generates the following statistics from the log files it processes:
- List of unique Relying Parties (Service Provider EntityIds)
- Number of unique Relying Parties
- Number of unique UserIDs (Principals)
- Number of logins
- Number of events per Relying Party
- Number of events per Relying Party (sorted by number of events)
- Usage of SAML message profiles per Relying Party
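To illustrate how such statistics can be derived, here is a minimal sketch. It assumes the default pipe-delimited Shibboleth 2.x audit log layout (relying party, message profile and principal at fixed field positions) and counts each record carrying a principal name as a login; both the field indices and that counting rule are assumptions, not necessarily what the contributed script does internally.

```python
# Field positions in the Shibboleth 2.x audit log (pipe-delimited).
# These indices are an assumption based on the default audit log layout.
RELYING_PARTY, MESSAGE_PROFILE, PRINCIPAL = 3, 4, 8

def audit_stats(lines):
    """Count unique relying parties, unique principals and logins."""
    relying_parties, principals, logins = set(), set(), 0
    for line in lines:
        fields = line.rstrip("\n").split("|")
        if len(fields) <= PRINCIPAL:
            continue  # not an audit record, skip it
        relying_parties.add(fields[RELYING_PARTY])
        if fields[PRINCIPAL]:
            principals.add(fields[PRINCIPAL])
            logins += 1  # assumption: one record with a principal = one login
    return len(relying_parties), len(principals), logins
```

Feeding the same audit record in twice would report one relying party, one principal and two logins.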
There really is no installation (other than downloading the tool itself), since the tool is written in Python and can be used with both Python (sometimes also called CPython) and Jython, the Python implementation for the Java VM.
Note that running this under Python is approximately 20 times faster than under Jython, because of the startup overhead of the Java VM (but YMMV, and startup speed may not matter for generating stats).
If you already have Python 2.4 or greater installed (as most GNU/Linux distributions will have), you don't need to do anything special: just download the tool, name it any way you like, and run it.
When used with Python 2.5 or greater, support for parsing compressed logs is activated automatically. (To use compressed log files with older Python versions, see the filter examples in the "Usage" section below.) Since RHEL5/CentOS5 still only ships with Python 2.4.3, to be able to use this feature on these systems add the Extra Packages for Enterprise Linux (EPEL) repository, install the Python 2.6 interpreter (yum install python26) and change the first line of loganalysis.py so that it uses the newer Python interpreter instead:
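For example, the first line of the script could then look like this (assuming the interpreter path as provided by the EPEL python26 package):

```python
#!/usr/bin/python26
# ...rest of loganalysis.py unchanged
```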
If your Python interpreter is in your $PATH (i.e. it can be found by just typing python on the command line), you can make the script executable and skip calling the interpreter explicitly (you may need to change the first line to #!/usr/bin/env python, though). Changing the first line to point directly at the interpreter obviously also works in cases where your python executable is not in your $PATH, or where it's not the version you want to use with this tool (see above):
If you also put the script (or a symlink pointing to it) in your $PATH (or set up an alias in your $SHELL), you can call the script just by name:
To alternatively run it within the Java VM (which is guaranteed to be present on a machine running the Shibboleth IdP), you first need to install Jython 2.5 or higher (which is just a JAR file and a wrapper shell script). The Jython installer is pretty user friendly and works fine both with a graphical user interface and in console mode (no GUI). It is suggested to simply perform a "Standard install" and use the provided scripts to start the Jython interpreter.
Compressed log files
You can mix compressed and uncompressed log files with the log analysis tool (when using Python 2.5 or higher). Only audit log files ending in .gz or .bz2 (note that logback does not support automatic bzip2 compression) will be treated as compressed files; all other files are assumed to be uncompressed.
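The extension-based handling described above can be sketched like this (a simplified illustration of the behaviour, not necessarily the script's actual code):

```python
import bz2
import gzip

def open_log(filename):
    # Transparently decompress by file extension, as described above;
    # anything not ending in .gz or .bz2 is assumed to be plain text.
    if filename.endswith(".gz"):
        return gzip.open(filename, "rt")
    if filename.endswith(".bz2"):
        return bz2.open(filename, "rt")
    return open(filename)
```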
The tool accepts several options; just call it without any command line options, or with the --help option, to get a usage summary. It expects all log file names as arguments on the command line. E.g.
The order of options, as well as the order of options vs. arguments, does not matter, so you can supply the file name(s) first. You can also supply several options at once, either separately (as in -c -l -u) or all thrown together:
If you specify - (a single dash) instead of a file name, the log file's content is read from STDIN, so you can use the tool as a filter. E.g. in case you're using compressed audit log files (see IdPProdLogging) and don't have Python 2.5 or greater installed, you could uncompress them to STDOUT and do the analysis in a filter:
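The dash convention itself is a one-liner to implement; this sketch shows the idea (a common Unix filter idiom, assumed to be roughly what the script does):

```python
import sys

def input_stream(name):
    # A single dash means "read the log from STDIN", so the tool can sit
    # at the end of a pipeline (e.g. behind gunzip -c); any other name is
    # opened as a regular file.
    return sys.stdin if name == "-" else open(name)
```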
(N.B. This does not actually uncompress your log files on disk; it only feeds them to the filter uncompressed.)
- The option -m and the per-Relying-Party event counts are probably most interesting, as both show which Relying Parties (Service Providers) are used most. The -m option additionally sorts this by SAML message profile usage, so you can easily see which Relying Parties are using SAML 1 vs. SAML 2, and how often.
- The option --quiet does not do anything by itself, but modifies the other options' behaviour: when used, it strips away all explanatory strings and decorations from the output. So if you know exactly what you're looking for, this option makes it easier to further process the results.
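The effect of such a quiet mode can be illustrated with a hypothetical helper (the function name and signature are made up for illustration; only the behaviour is taken from the description above):

```python
def report(label, value, quiet=False):
    # Hypothetical helper: with --quiet only the bare value is emitted,
    # which makes the output easy to pipe into cut, sort, wc and friends;
    # otherwise the explanatory label is included.
    return str(value) if quiet else "%s: %s" % (label, value)
```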