Stone Steps Inc.

Article ID: Q20041208-01

Q: Can Stone Steps Webalizer analyze custom Apache log files?

A: Yes. Starting with v2.1.10.8, Stone Steps Webalizer can be configured to work with Apache custom log files that follow these guidelines:

  • Individual log record values must be separated by a single space
  • If a log record value may contain spaces or may be empty, it must be enclosed in quotes
  • The log record timestamp must not use a format specification (e.g. %{format}t is not supported)

In the following example, the user name field (%u) in the first custom log format specification is enclosed in quotes because user names may contain spaces. In the second specification, the URL stem field (%U) is quoted because Apache logs URL file paths in decoded form, so they may contain spaces, and the query string field (%q) is quoted because it may be reported as an empty string. Numeric fields, such as request processing time (%D), do not need to be quoted.

LogFormat "%a \"%u\" %t \"%r\" %>s %b %D \"%{Referer}i\" \"%{User-Agent}i\"" custom_1
LogFormat "%a %l \"%u\" %t %m \"%U\" \"%q\" %p %>s %b %D \"%{Referer}i\" \"%{User-Agent}i\"" custom_2
CustomLog logs/access_log custom_2
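
To illustrate why the guidelines above matter, the following Python sketch splits a log record into values using only those rules: values are separated by spaces, quoted values may contain spaces or be empty, and the timestamp is enclosed in square brackets. This is not the parser Stone Steps Webalizer uses; the split_record function and the sample record are hypothetical and exist only for illustration.

```python
import re

# One token is a bracketed timestamp, a quoted value (which may contain
# backslash-escaped characters) or a run of non-space characters.
TOKEN = re.compile(r'\[([^\]]*)\]|"((?:[^"\\]|\\.)*)"|(\S+)')

def split_record(line):
    """Split a log record into values, dropping brackets and outer quotes."""
    fields = []
    for match in TOKEN.finditer(line):
        if match.group(1) is not None:      # [timestamp]
            fields.append(match.group(1))
        elif match.group(2) is not None:    # "quoted value", possibly empty
            fields.append(match.group(2))
        else:                               # unquoted value
            fields.append(match.group(3))
    return fields

# Hypothetical record in the custom_2 format (made up for this example).
sample = ('127.0.0.1 - "john smith" [08/Dec/2004:10:15:32 -0500] GET '
          '"/docs/my file.html" "" 80 200 5120 1532 "-" '
          '"Mozilla/5.0 (X11; Linux)"')

fields = split_record(sample)
print(len(fields))   # one value per field in the custom_2 format
print(fields[2])     # user name containing a space
print(fields[6])     # empty query string
```

Without the quotes around %u, %U and %q, the user name and URL path in this record would each split into two values and the empty query string would disappear, shifting every field that follows.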

To configure Stone Steps Webalizer to work with custom log files, add the ApacheLogFormat configuration variable to the webalizer.conf file and set it to the content of the active LogFormat directive, without the outer quotes. For example, if the CustomLog directive in httpd.conf is configured as shown in the example above, ApacheLogFormat would be configured as follows:

ApacheLogFormat %a %l \"%u\" %t %m \"%U\" \"%q\" %p %>s %b %D \"%{Referer}i\" \"%{User-Agent}i\"
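
Copying the format by hand makes it easy to drop a backslash or a quote. As a minimal sketch, assuming the LogFormat directive fits on one line and its nickname contains no quotes, the following Python snippet derives the ApacheLogFormat line by removing only the outer quotes and leaving the escaped inner quotes intact. The apache_log_format function is a hypothetical helper, not part of Stone Steps Webalizer.

```python
def apache_log_format(directive):
    """Build an ApacheLogFormat line from a single-line LogFormat directive."""
    start = directive.index('"')    # opening outer quote
    end = directive.rindex('"')     # closing outer quote (nickname has no quotes)
    return "ApacheLogFormat " + directive[start + 1:end]

# The custom_2 directive from the example above; a raw string keeps the
# backslashes exactly as they appear in httpd.conf.
directive = (r'LogFormat "%a %l \"%u\" %t %m \"%U\" \"%q\" %p %>s %b %D '
             r'\"%{Referer}i\" \"%{User-Agent}i\"" custom_2')

print(apache_log_format(directive))
```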

It is important to understand that, unlike log files in the W3C extended format, Apache log files do not contain log format information. Switching the log format without renaming the current log file will produce a log file that contains records in mixed formats. Such log files cannot be analyzed unless they are split into multiple consistently-formatted log files.
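
One rough way to find where a mixed-format log file should be split is to watch for a change in the number of values per record. The Python sketch below does this under the guidelines listed at the top of this article; it is only a heuristic, because two formats with the same number of fields will not be told apart. The format_change_points function and the sample records are hypothetical.

```python
import re

# A token is a bracketed timestamp, a quoted value or a run of non-space characters.
TOKEN = re.compile(r'\[[^\]]*\]|"(?:[^"\\]|\\.)*"|\S+')

def format_change_points(lines):
    """Return (line number, value count) wherever the value count changes."""
    changes = []
    previous = None
    for number, line in enumerate(lines, start=1):
        if not line.strip():
            continue                      # ignore blank lines
        count = len(TOKEN.findall(line))
        if count != previous:
            changes.append((number, count))
            previous = count
    return changes

# Made-up records: two with six values, then one with ten.
records = [
    '127.0.0.1 "-" [08/Dec/2004:10:15:32 -0500] "GET / HTTP/1.1" 200 512',
    '127.0.0.1 "-" [08/Dec/2004:10:15:40 -0500] "GET /a HTTP/1.1" 200 128',
    '127.0.0.1 - "-" [08/Dec/2004:10:16:01 -0500] GET "/" "" 80 200 512',
]

print(format_change_points(records))
```

The first entry always describes the format the file starts with; every later entry marks a line where the file could be split.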

Apache understands three custom log options for logging the amount of data sent to HTTP clients: %b, %B and %O. The first two log the size of the response excluding HTTP headers. The last one (%O) instructs Apache to include the size of HTTP response headers when reporting response sizes and, consequently, produces more accurate results. Note, however, that you need to enable mod_logio if you would like to use %O.

If your Apache server is configured to serve SSL/TLS requests, make sure that the log formats specified in the ssl.conf file are exactly the same as the matching log formats specified in the httpd.conf file. Check that all TransferLog and CustomLog directives in httpd.conf and ssl.conf that share the same log file (e.g. CustomLog in httpd.conf and TransferLog in ssl.conf both, by default, point to the access_log file) use the same log format.

If log formats specified in httpd.conf and ssl.conf for any shared log file are not the same, the resulting log file will contain log information in mixed formats and cannot be analyzed. We also recommend that you use the %p field (port number), as shown in the example above, to make it possible to distinguish HTTP and HTTPS requests.

Note that Stone Steps Webalizer will process the ApacheLogFormat configuration variable only when launched with the -F option as shown in this example:

$ webalizer -N 20 -p -F apache -n 127.0.0.1 -o reports logs/access_log