Currently, I do web analytics via awstats, a Perl application that includes a CGI script. The CGI script has had multiple vulnerabilities, so I protect it behind Apache auth.
I run awstats every 2 hours in cron with the following entry (see also: ScheduledTasks):
05 0,2,4,6,8,10,12,14,16,18,20,22 * * * /root/cronjobs/awstats.sh > /dev/null 2>&1
This runs awstats 5 minutes past the hour every two hours. The script it runs looks like this:
/usr/lib/cgi-bin/awstats.pl -update -config=dev.jmoiron.net
/usr/lib/cgi-bin/awstats.pl -update -config=django.jmoiron.net
/usr/lib/cgi-bin/awstats.pl -update -config=jmoiron.net
/usr/lib/cgi-bin/awstats.pl -update -config=arsjerm.net
#/usr/lib/cgi-bin/awstats.pl -update -config=ladykan.com
/usr/lib/cgi-bin/awstats.pl -update -config=arlongpark.net
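As the list of sites grows, the same updates could be driven by a loop instead of one line per site. This is only a sketch: the `AWSTATS` variable and the `update_sites` function are names made up for illustration, not part of awstats, and the only awstats.pl options used are the `-update` and `-config=` ones already shown above.

```shell
#!/bin/sh
# Sketch: update several awstats configs in a loop.
# AWSTATS and update_sites are illustrative names, not part of awstats.
AWSTATS="${AWSTATS:-/usr/lib/cgi-bin/awstats.pl}"

# Run an update for every config name given as an argument.
# Keeps going if one site fails, and returns non-zero if any update failed.
update_sites() {
    status=0
    for site in "$@"; do
        "$AWSTATS" -update -config="$site" || status=1
    done
    return "$status"
}
```

Ending the script with `update_sites dev.jmoiron.net django.jmoiron.net jmoiron.net arsjerm.net arlongpark.net` would run the same updates as the list above; leaving a name out of the arguments disables that site.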
The config argument maps to configuration found in /etc/awstats/awstats.(confname).conf. Here's some relevant customization found in the /etc/awstats/awstats.dev.jmoiron.net.conf file:
LogFile="/var/log/apache2/jmoiron.net/dev.access.log"
LogType=W
LogFormat=1
SiteDomain="dev.jmoiron.net"
HostAliases="localhost 127.0.0.1"
DNSLookup=1
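Since every config points at its own log file, a quick sanity check before the cron job runs can catch a mistyped LogFile path. The sketch below assumes the LogFile value is a plain double-quoted path, as in the excerpt above; `awstats_logfile` and `check_logfile` are hypothetical helper names, not awstats commands.

```shell
# Print the LogFile path from an awstats config file.
# Assumes the value is a plain double-quoted path at the start of a line.
awstats_logfile() {
    sed -n 's/^LogFile="\([^"]*\)".*/\1/p' "$1" | head -n 1
}

# Report whether the log named in a config exists and is readable.
# Returns non-zero when the path is missing, unreadable, or not found.
check_logfile() {
    log="$(awstats_logfile "$1")"
    if [ -n "$log" ] && [ -r "$log" ]; then
        echo "ok: $log"
    else
        echo "missing or unreadable: ${log:-no LogFile found in $1}"
        return 1
    fi
}
```

For example, `check_logfile /etc/awstats/awstats.dev.jmoiron.net.conf` would print the log path if it is readable and return a failure status otherwise.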
Finally, to get the web view working, I disable my default (mod_python) handler on /cgi-bin/ and /awstats-icon/, and set the following aliases in the Apache site conf for the site where I'm running it:
ScriptAlias /cgi-bin/ /usr/lib/cgi-bin/
<Directory "/usr/lib/cgi-bin">
AllowOverride None
Options ExecCGI -MultiViews +SymLinksIfOwnerMatch
Order allow,deny
Allow from all
AuthType basic
AuthName "cgi-bin restricted"
AuthUserFile %%HTPASSWD_FILE%%
<Files awstats.pl>
Require valid-user
</Files>
</Directory>
Alias /awstats-icon/ /usr/share/awstats/icon/
<Directory /usr/share/awstats/icon>
Options None
AllowOverride None
Order allow,deny
Allow from all
</Directory>
The awstats Debian package places the awstats.pl CGI script in /usr/lib/cgi-bin/ for you.
Awstats keeps its data in flat text files in /var/lib/awstats/, named awstatsMMYYYY.(confname).txt. If you want to keep your (old!) stats around, you can archive these. Although I am paranoid about keeping the full logs around so I can re-run any future superior analytics program on them, realistically anything that overtakes awstats will probably be able to import from awstats data files as well.
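Archiving the data files can be as simple as a dated tarball. The following is a minimal sketch, not something from my actual setup: the `archive_awstats_data` function, its arguments, the archive name, and the default destination /root/awstats-archive are all assumptions chosen for illustration. It copies the files into the archive and leaves the originals in place.

```shell
# Archive awstats data files into a dated gzip tarball and print its path.
# archive_awstats_data, the archive name, and the default destination
# directory are illustrative choices, not awstats conventions.
# dest_dir must be an absolute path, because tar runs from inside data_dir.
archive_awstats_data() {
    data_dir="${1:-/var/lib/awstats}"
    dest_dir="${2:-/root/awstats-archive}"
    archive="$dest_dir/awstats-data-$(date +%Y%m%d).tar.gz"

    mkdir -p "$dest_dir" || return 1
    # Store names relative to data_dir; the glob matches the data file
    # naming described above. Fails if no data files are present.
    ( cd "$data_dir" && tar -czf "$archive" awstats*.txt ) || return 1
    echo "$archive"
}
```

Calling `archive_awstats_data` with no arguments uses the defaults; passing two directories, as in `archive_awstats_data /var/lib/awstats /srv/backup`, overrides them.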