
There's a simple contributed tool available to parse the Shibboleth 2.x IdP's audit log files and output a few statistics. While future releases of the IdP might come with that functionality out of the box, currently you'd either have to write such a thing yourself or use this script.

Features

  • Processes one or several audit log files, combining the input
  • Transparently works with compressed logfiles (gzip, bzip2 supported) when used with Python 2.5 or higher
  • Works as a filter on the command line
  • Generates the following statistics from the log files it processes:
    • List of unique Relying Parties (Service Provider EntityIds)
    • Number of unique Relying Parties
    • Number of unique UserIDs (Principals)
    • Number of logins
    • Number of events per Relying Party
    • Number of events per Relying Party (sorted by number of events)
    • Usage of SAML message profiles per Relying Party
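
To illustrate how statistics like these can be derived, here is a minimal sketch. It is not the tool's actual code. It assumes a pipe-delimited audit log line with the relying party in the fourth field and the principal name in the ninth; the `summarize` helper, the sample lines, and the entity IDs are made up for illustration, so check the field positions against your own logs.

```python
from collections import Counter

# Assumed field positions in a pipe-delimited audit log line
# (an assumption for this sketch, not taken from the tool itself).
RELYING_PARTY = 3
PRINCIPAL = 8


def summarize(lines):
    """Return (events per relying party, set of principals) for audit lines."""
    events = Counter()
    principals = set()
    for line in lines:
        fields = line.rstrip("\n").split("|")
        if len(fields) <= PRINCIPAL:
            continue  # skip blank or truncated lines
        events[fields[RELYING_PARTY]] += 1
        if fields[PRINCIPAL]:
            principals.add(fields[PRINCIPAL])
    return events, principals


# Hypothetical sample lines, shortened to the fields used here.
sample = [
    "20090501T120000Z|binding|id1|https://sp1.example.org/shibboleth|profile|idp|binding|id2|alice|",
    "20090501T120500Z|binding|id3|https://sp2.example.org/shibboleth|profile|idp|binding|id4|bob|",
    "20090501T121000Z|binding|id5|https://sp1.example.org/shibboleth|profile|idp|binding|id6|alice|",
]

events, principals = summarize(sample)
print("Unique relying parties:", len(events))
print("Unique principals:", len(principals))
for rp, count in events.most_common():  # sorted by number of events
    print(count, rp)
```

How the tool itself decides what counts as a "login" is not covered here, so the sketch leaves that statistic out.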

Setup

No installation is needed beyond downloading the tool itself: it is written in Python and runs under either Python (sometimes also called CPython) or Jython, the Python implementation for the Java VM.
Note that running it under Python is approximately 20 times faster than under Jython because of the startup overhead of the Java VM (but YMMV, and startup speed may not matter for generating stats).

Python

If you already have Python 2.4 or greater installed (as many GNU/Linux distributions do), you don't need to do anything special: just download the tool, name it any way you like, and run it.

$ python /path/to/loganalysis.py

When used with Python 2.5 or greater, support for parsing compressed logs is activated automatically. (To use compressed log files with older Python versions, see the filter examples in the "Usage" section below.) RHEL5/CentOS5 still ships only Python 2.4.3. To use this feature on these systems, add the Extra Packages for Enterprise Linux (EPEL) repository, install the Python 2.6 interpreter (yum install python26), and change the first line of loganalysis.py so that it uses the newer Python interpreter instead:

-#!/usr/bin/python
+#!/usr/bin/python26

If your Python interpreter is in your $PATH (i.e. it can be found by just typing python on the command line), you can make the script executable and skip calling the interpreter explicitly. You may need to change the first line to #!/usr/bin/env python, though. Changing the first line to point directly to the interpreter also works in cases where your python executable is not in your $PATH or where it's not the version you want to use with this tool (see above):

$ chmod +x /path/to/loganalysis.py
$ /path/to/loganalysis.py

If you also put the script (or a symlink pointing to it) in a directory listed in your $PATH (or set up an alias in your shell), you can call the script just by name:

$ loganalysis.py

Java VM

Alternatively, to run it within the Java VM (which is guaranteed to be there on a machine running the Shibboleth IdP), you first need to install Jython 2.5 or higher (which is just a JAR file and a wrapper shell script). The Jython installer is pretty user-friendly and works fine both with a graphical user interface and in console mode (no GUI).
It is suggested to just perform a "Standard install" and use the provided scripts to start the Jython interpreter.

$ /path/to/jython/jython /path/to/loganalysis.py

Compressed log files

The logging system used by the IdP can automatically rotate log files and compress the rotated ones; see this example.

You can mix compressed and uncompressed log files with the log analysis tool (when using Python 2.5 or higher). Only audit log files ending in .gz (or .bz2, but logback does not support automatic bzip2 compression) are treated as compressed files; all other files are assumed to be uncompressed.
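
The extension-based rule described above can be sketched as follows. This is only an illustration of the idea, not the tool's own implementation: the `open_log` helper and the file names are hypothetical, and the sketch uses the Python 3 standard library (`gzip.open` and `bz2.open` in text mode) rather than the Python 2.x versions the tool targets.

```python
import bz2
import gzip
import os
import tempfile


def open_log(path):
    """Open an audit log as text, picking a decompressor by file extension."""
    if path.endswith(".gz"):
        return gzip.open(path, "rt")
    if path.endswith(".bz2"):
        return bz2.open(path, "rt")
    return open(path, "r")  # anything else is assumed to be uncompressed


# Demo: write the same line plain and gzip-compressed, then read both back.
tmpdir = tempfile.mkdtemp()
plain_path = os.path.join(tmpdir, "idp-audit.log")
gz_path = os.path.join(tmpdir, "idp-audit-2009-05-01.log.gz")
with open(plain_path, "w") as f:
    f.write("example line\n")
with gzip.open(gz_path, "wt") as f:
    f.write("example line\n")

for path in (plain_path, gz_path):
    with open_log(path) as log:
        print(os.path.basename(path), "->", log.read().strip())
```

Deciding by extension alone keeps the logic simple, but it also means a compressed file with a different suffix would be read as if it were plain text.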

Usage

The tool accepts several options. To list them, call it without any command line options or with --help (or just -h):

Usage: loganalysis.py [options] [files ...]

Options:
  -h, --help            show this help message and exit
  -r, --relyingparties  list of unique relying parties, sorted by name
  -c, --rpcount         number of unique relying parties
  -u, --users           number of unique userids
  -l, --logins          number of logins
  -p, --rplogins        number of events per relying party, by name
  -n, --rploginssort    number of events per relying party, sorted numerically
  -m, --msgprofiles     usage of SAML message profiles per relying party
  -q, --quiet           suppress all descriptive or decorative output

It expects all log file names as arguments on the command line, e.g.:

$ loganalysis.py -n /opt/shibboleth-idp/logs/idp-audit.log

The order of options, as well as the order of options vs. arguments, does not matter, so you can supply the file name(s) first. You can also supply several options at once, either separately (as in -c -l -u) or all thrown together:

$ loganalysis.py /opt/shibboleth-idp/logs/idp-audit.log -cul
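
For readers curious how this kind of option handling behaves, here is a small sketch using Python's `argparse` module. It is an assumption that this resembles what the tool does internally; only three of the options are reproduced, and the file name is a placeholder. `parse_intermixed_args` requires Python 3.7 or later.

```python
import argparse

# A cut-down stand-in for the tool's option parser (illustrative only).
parser = argparse.ArgumentParser(prog="loganalysis.py")
parser.add_argument("-c", "--rpcount", action="store_true")
parser.add_argument("-u", "--users", action="store_true")
parser.add_argument("-l", "--logins", action="store_true")
parser.add_argument("files", nargs="*")

# File name first, short options combined into a single argument.
combined = parser.parse_intermixed_args(["idp-audit.log", "-cul"])
# Options given separately, before the file name.
separate = parser.parse_intermixed_args(["-c", "-l", "-u", "idp-audit.log"])

# Both invocations produce the same set of parsed values.
print(combined == separate)
```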

If you specify - (a single dash) instead of a file name, the log file's content is read from STDIN, so you can use the tool as a filter. For example, if you're using compressed audit log files (see IdPProdLogging) and don't have Python 2.5 or greater installed, you can uncompress them to STDOUT and do the analysis in a filter:

$ zcat /opt/shibboleth-idp/logs/idp-audit-2009-05*.gz | loganalysis.py -lr -

(N.B. This does not actually uncompress your log files on disk, it only feeds them to the filter uncompressed.)
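
The single-dash convention is common among command line filters. A minimal sketch of how a script might honour it is shown below; the `read_lines` helper and its `stdin` parameter are hypothetical and exist only so the example can be run without a real pipe.

```python
import io
import sys


def read_lines(names, stdin=None):
    """Yield lines from each named file; the name "-" means standard input."""
    for name in names:
        if name == "-":
            # Fall back to the real standard input unless a stand-in is given.
            yield from (stdin if stdin is not None else sys.stdin)
        else:
            with open(name) as handle:
                yield from handle


# Demo with an in-memory stand-in for piped input (e.g. the output of zcat).
fake_stdin = io.StringIO("first line\nsecond line\n")
lines = list(read_lines(["-"], stdin=fake_stdin))
print(len(lines), "lines read")
```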

Notes

  • The options -n and -m are probably the most interesting, as they both show which Relying Parties (Service Providers) are used most. The latter also sorts this by SAML message profile usage, so you can easily see which Relying Parties are using SAML1 vs. SAML2 and how often.
  • The option -q (or --quiet) does not do anything by itself, but modifies the other options' behaviour: when used, it strips away all explanatory strings and decorations from the output. So if you know exactly what you're looking for, this option makes it easier to further process the results.
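
As a rough illustration of the kind of per-profile breakdown that -m reports, the sketch below counts message profiles per relying party. It is not the tool's code: the field positions are an assumption about the pipe-delimited audit log layout, and the `profiles_per_rp` helper, sample lines, entity IDs, and profile identifiers are illustrative values only.

```python
from collections import Counter, defaultdict

# Assumed field positions in a pipe-delimited audit log line.
RELYING_PARTY = 3
MESSAGE_PROFILE = 4


def profiles_per_rp(lines):
    """Map each relying party to a Counter of the message profiles it used."""
    usage = defaultdict(Counter)
    for line in lines:
        fields = line.rstrip("\n").split("|")
        if len(fields) > MESSAGE_PROFILE:
            usage[fields[RELYING_PARTY]][fields[MESSAGE_PROFILE]] += 1
    return usage


# Hypothetical sample lines, shortened to the fields used here.
sample = [
    "t1|b|id1|https://sp1.example.org/shibboleth|urn:mace:shibboleth:2.0:profiles:saml2:sso|",
    "t2|b|id2|https://sp1.example.org/shibboleth|urn:mace:shibboleth:2.0:profiles:saml2:sso|",
    "t3|b|id3|https://sp2.example.org/shibboleth|urn:mace:shibboleth:2.0:profiles:saml1:sso|",
]

usage = profiles_per_rp(sample)
for rp in sorted(usage):
    for profile, count in usage[rp].most_common():
        print(rp, profile, count)
```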