Some hints on automating Apache log analysis

Jul 15, '02 08:59:59AM

Contributed by: Anonymous

If you would like to know more about Apache server logs and setting up log analysis software, read the rest of the article.

Here are some things to consider as you look at Apache, logging, and analysis.

  1. If you are running OS X Server, turn off the performance cache if you want correct logging of the client IPs accessing your site.

  2. Decide whether you want Apache to resolve client IP addresses into host names (the default is off). On low-volume web sites this may be OK, but with a lot of traffic, having it on adds a lot of DNS overhead. Most log analysis programs or add-ons seem to be able to do the DNS lookups themselves afterwards. The setting is in httpd.conf:
    HostnameLookups Off
  3. You may want to change the Apache log generation to output the "combined" format, which adds the referring page and the browser type for each hit. To do this, change the following in your Apache config file:
    CustomLog "/private/var/log/httpd/access_log" "%h %l %u %t \"%r\" %>s %b"
    to
    CustomLog "/private/var/log/httpd/access_log" combined
    I'm running OS X Server, but I've moved most of my virtual host definitions out of the GUI-generated httpd_macosxserver.conf so that I can add things like:
    ##Separate cgi directories for each virtual host:
    ScriptAlias /cgi-bin /Library/WebServer/domain/cgi

    ##Capture all *.domain.com inside one virtualhost definition:
    ServerAlias *.domain.com domain.com
  4. Get a log analysis package. Two that I found have been ported to OS X are Analog and Webalizer.

    I chose Analog because it handles multiple log files in one pass, which works well with the log rotation system that OS X Server uses. BTW, in setting Analog up, I found that it has a bug with long log filenames and wildcards (the work-around is to make sure the wildcard pattern is less than 15 or so characters). The author said he would try to fix this in the next version of Analog he releases. He also ported Report Magic, though it is still a Classic app.

  5. Set up an Analog config file for each of your web sites, pointing Analog at that site's log files and telling it where to write the HTML report and the chart images it creates.

  6. Decide if you want your log reports generated manually or automatically. If manually, just drag and drop the appropriate config file onto the Analog executable (this version is a Carbon application rather than a command line program).

  7. If you want automatic, periodic generation of the reports, you can set up a cron job to do this. Since I'm lazy, I got CronniX, which puts a GUI on top of the crontab files.

  8. But to make this work well, I wrote a little AppleScript to do the drag and drop of the config file onto the Analog application. If you have the command line version of Analog, you don't need this AppleScript.
    tell application "Finder"
        open the file "HardDrive:var:log:httpd:analog_domain.cfg" ¬
            using "HardDrive:Applications:Analog 5.24"
    end tell
    That's a continuation character ¬ at the end of the "open" line. Save this from the Script Editor as an Application, with Never Show Startup Screen checked. Test this script by double-clicking on it.

  9. Drop the AppleScript application into CronniX, which will add the necessary code for launching an application from the command line. Or, if you are using the command-line version of Analog, enter the appropriate command line, such as:
    /usr/local/pathToAnalog/analog -g/var/log/httpd/analog_domain.cfg
    Then modify the time parameters of the cron job to make it run when you want.
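
Going back to step 3: once the log is in combined format, the referrer and browser fields can be pulled out with standard tools, which is a quick way to check that the new format is really being written. The following is a minimal shell sketch and not part of the original setup. The function name top_referrers is made up for illustration, and it assumes that no logged field contains an embedded double quote.

```shell
#!/bin/sh
# Sketch: list the most frequent referrers in a combined-format access log.
# When a line is split on double quotes, field 2 is the request line,
# field 4 is the referrer and field 6 is the user agent (browser type).
top_referrers() {
    # $1 = path to an access log assumed to be in combined format
    awk -F'"' '$4 != "" && $4 != "-" { print $4 }' "$1" |
        sort | uniq -c | sort -rn | head -n 10
}
```

Calling top_referrers /private/var/log/httpd/access_log should print up to ten referrers with their hit counts. If nothing is printed, the log is probably still being written in the old format without the referrer field.
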
Since I have multiple sites, I wrote an AppleScript that checks a directory for Analog config files, and automatically loads Analog with each config file it finds. But that's another post.
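
For the command-line version of Analog, the same idea (one run per config file found in a directory) could be sketched in shell rather than AppleScript. This is only an illustration under stated assumptions, not the AppleScript mentioned above: the function name run_all_reports, the ANALOG variable, and the analog_*.cfg naming pattern are all hypothetical, and the -g option is used exactly as in step 9.

```shell
#!/bin/sh
# Sketch: run command-line Analog once for every config file in a directory.
# ANALOG is a placeholder; point it at wherever your analog binary lives.
ANALOG="${ANALOG:-analog}"

run_all_reports() {
    # $1 = directory holding one analog_*.cfg file per web site (assumed naming)
    for cfg in "$1"/analog_*.cfg; do
        [ -f "$cfg" ] || continue   # skip the unexpanded pattern if nothing matches
        "$ANALOG" -g"$cfg" || echo "analog failed for $cfg" >&2
    done
}
```

Saved as a script that ends with a call such as run_all_reports /var/log/httpd, this could replace the single command line in the cron job from step 9, so that a newly added site only needs a new config file.
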

Mac OS X Hints
http://hints.macworld.com/article.php?story=20020715085959338