Summary: Notes on using AWStats, with some improved profiles and configuration instructions. Good news in AWStats 6.3: Chinese users basically only need to enable LoadPlugin="decodeutfkeys" in the configuration file. The stock distribution has basically no statistics for Chinese search engines, so definitions for several major domestic search engines have been added here. The package contains a patch that defines the major domestic search engines and spiders (after unpacking, overwrite the lib\ directory under the original program directory), and it also contains the sample configuration file used for this site.

Log statistics systems play an important role in analyzing user behavior on a site, especially statistics on the keywords that bring visitors in from search engines: they are a valid data source for user behavior analysis. As the Internet has developed over the years, web log statistics tools have become increasingly mature and feature-rich. Many of them are open source, and AWStats is a very good one.

AWStats: Advanced Web Statistics

AWStats is a Perl-based web log analysis tool that is developing rapidly on SourceForge. Compared with Webalizer, another very good open source log analysis tool, AWStats has the following advantages:

  1. Friendly interface: it automatically picks the interface language that matches the browser (a Simplified Chinese version is available)
    Reference sample output: ****chedong**/cgi-bin/awstats/
  2. Based on Perl, which solves the cross-platform problem well: the system itself can run on GNU/Linux or on Windows (after installing ActivePerl), and it directly supports the Apache log format (combined) and the IIS format (which needs some modification). Webalizer also has a Windows version, but it is no longer maintained;
    With AWStats you can therefore produce uniform statistics for web servers on different systems within your own site: GNU/Linux/Apache and Windows/IIS.
  3. More efficient: AWStats outputs far more statistical items than Webalizer, yet its speed still reaches about 1/3 of Webalizer's; for a site with visits on the order of one million, this speed is sufficient;
  4. Easy configuration and customization: the system is flexible enough but also ships with very reasonable defaults. You need to change no more than 3 or 4 entries of the default configuration before it can run, and there are plenty of plug-ins for modifying and extending it;
  5. AWStats is designed to count "human visits" precisely, so many visits by search engine robots are filtered out. Its figures are therefore likely to be lower than those of other log statistics tools. Visits from inside the company can also be filtered out with an IP filter setting.
  6. Many extended parameter statistics functions: the ExtraXXXX series of configuration options generates statistics on application-specific parameters, which is very useful for product analysis.

For more comparisons with other tools such as Webalizer and Analog, please refer to:
**awstats.sourceforge**/#COMPARISON

AWStats Installation notes

AWStats works like this:

  1. Log analysis: after each run, the analyzed log data is archived into an AWStats statistics database (plain text);
  2. Output, in one of two forms:
    • A CGI program reads the statistics database and generates the output;
    • A background script exports the output as static files;

Here are two examples of log statistics for individual sites:
One runs on GNU/Linux and is output through CGI,
One runs on Windows 2000 and exports static pages.

Download / Installation

Download the installation package from **sourceforge**/projects/awstats/ and install it:

GNU/Linux: tar zxf awstats-version.tgz
The awstats scripts and static files are in the wwwroot directory by default. Deploy the programs in its cgi-bin directory to /home/apache/cgi-bin/awstats/:
mv awstats-version/wwwroot/cgi-bin /path/to/apache/cgi-bin/awstats
# Copy the icon directory and other static files to the web publishing directory: /home/apache/htdocs/. The batch update scripts are in the tools directory and can also be placed in the cgi-bin/awstats/ directory.

Windows 2000: when running in background script mode, unpack the package directly and move it to D:\AWStats, then copy the icon directory to the IIS publishing directory: inetpub/icon

Data source: log format and daily log rotation

  1. For Apache: the log format is easy to set, just use the combined format. Log rotation is a little more trouble: install the cronolog tool and have it split the log by day:
    CustomLog "|/usr/local/sbin/cronolog /path/to/apache/logs/access_log.%Y%m%d" combined
    For example: logs/access_log.20030326
    If the log is compressed, you can use gzip -d </home/apache/logs/access_log.%YYYY-24%MM-24%DD-24.gz | as the LogFile value to decompress it on the fly for statistics.
  2. For IIS: daily log rotation is already the default, but the default IIS log format is not well suited to AWStats.
    It is best to clear all the log fields first and then select exactly the following fields:
    • Date: date
    • Time: time
    • Client IP address: c-ip
    • User name: cs-username
    • Method: cs-method
    • URI stem: cs-uri-stem
    • Protocol status: sc-status
    • Bytes sent: sc-bytes
    • Protocol version: cs-version
    • User agent: cs(User-Agent)
    • Referrer: cs(Referer)
    Compared with the IIS default settings:
    Removed:
    • Server IP address
    • Server port
    • URI query
    Added:
    • Bytes sent
    • Protocol version
    • Referrer
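
To make the field order concrete, here is a minimal Python sketch that splits one IIS log line into the fields listed above. It is only an illustration of the layout, not how AWStats itself parses logs; the function name `parse_iis_line`, the sample IP address and the sample referrer URL are assumptions made up for the example.

```python
# Field order taken from the list above (assumed to be the order written to the log).
FIELDS = [
    "date", "time", "c-ip", "cs-username", "cs-method", "cs-uri-stem",
    "sc-status", "sc-bytes", "cs-version", "cs(User-Agent)", "cs(Referer)",
]

def parse_iis_line(line):
    """Split one space-separated log line into a dict keyed by field name."""
    if line.startswith("#"):
        return None  # directive lines such as "#Fields: ..." carry no request
    values = line.split()
    if len(values) != len(FIELDS):
        return None  # field count differs from the list: log settings do not match
    return dict(zip(FIELDS, values))

# Hypothetical sample line; the IP and referrer are placeholders.
sample = ("2003-03-26 15:43:58 192.0.2.10 - GET /index.html 200 192 HTTP/1.1 "
          "Mozilla/4.0+(compatible;+MSIE+5.01) http://search.example/search?q=awstats")
record = parse_iis_line(sample)
print(record["cs-uri-stem"], record["sc-status"])
```

If a line comes back as None, the field selection in IIS probably differs from the list, which is the usual reason statistics stay empty.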

Configuration file naming: awstats.sitename.conf

The AWStats main program automatically loads the configuration file that matches the site name: awstats.sitename.conf
For example, running ./awstats.pl -config=chedong loads the awstats.chedong.conf configuration file in the same directory;
If -config is not specified, awstats.conf in the current directory or /etc/awstats.conf is used as the default configuration file.
It is best to rename the default awstats.model.conf to awstats.yoursite.conf, for example: awstats.chedong.conf.

For statistics on multiple sites, the Include feature of the AWStats configuration file (supported since version 5.4) is very useful. Put the shared settings in a common configuration file, include it at the head of each site-specific configuration file, and then override the corresponding settings of the common configuration with the site's own values, for example:
Include "common.conf"
LogFile="/path/to/bbs/access_log"
SiteDomain="bbs.chedong**"
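
The "include first, override afterwards" rule can be sketched in a few lines of Python. This is a simplified model under stated assumptions, not the AWStats parser: `load_config` is a hypothetical helper, the `files` dictionary stands in for files on disk, and the setting values are invented for the example.

```python
import re

def load_config(name, files, chain=()):
    """Return the settings of files[name]; Include lines are resolved in order,
    so a setting that appears after an Include overrides the included value."""
    if name in chain:
        raise ValueError("circular Include: " + name)
    settings = {}
    for raw in files[name].splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        include = re.match(r'Include\s*=?\s*"([^"]+)"', line)
        if include:
            settings.update(load_config(include.group(1), files, chain + (name,)))
            continue
        key, _, value = line.partition("=")
        settings[key.strip()] = value.strip().strip('"')
    return settings

# Hypothetical file contents for illustration.
files = {
    "common.conf": 'LogFormat=1\nLang="cn"\n',
    "awstats.bbs.conf": 'Include "common.conf"\n'
                        'LogFile="/path/to/bbs/access_log"\n'
                        'Lang="en"\n',
}
config = load_config("awstats.bbs.conf", files)
print(config)
```

Here LogFormat comes from the common file, while Lang is overridden by the site file because it appears after the Include line.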

Settings that must be modified at a minimum: LogFile, SiteDomain, LogFormat

For Apache log statistics on GNU/Linux, only two options need to be changed: LogFile and SiteDomain

  1. GNU/Linux: LogFile="/path/to/apache/logs/access_log.%YYYY-24%MM-24%DD-24"
    Windows 2000: LogFile="d:\iis_logs\W3SV3\ex%YY-24%MM-24%DD-24.log"
    This means the log file name is assembled from the year, month and day as they were 24 hours ago;
  2. SiteDomain="**chedong**"
    The site name. It is empty by default, and AWStats refuses to run while it is empty;
  3. For IIS log statistics, one more setting needs to be changed:
    LogFormat=2
    The default value is 1 (Apache log); 2 means IIS log.
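
As a rough illustration of how the -24 tags resolve, the following Python sketch expands the date tags in a LogFile pattern. It mimics the behaviour described above and is not AWStats code; `expand_logfile` and the run time are assumptions chosen for the example.

```python
import re
from datetime import datetime, timedelta

# %YYYY-n, %YY-n, %MM-n, %DD-n: date part taken n hours before the run time.
TAG = re.compile(r"%(YYYY|YY|MM|DD)-(\d+)")

def expand_logfile(pattern, now):
    """Replace each date tag with the matching part of (now - n hours)."""
    def substitute(match):
        name, hours = match.group(1), int(match.group(2))
        moment = now - timedelta(hours=hours)
        return {
            "YYYY": f"{moment.year:04d}",
            "YY": f"{moment.year % 100:02d}",
            "MM": f"{moment.month:02d}",
            "DD": f"{moment.day:02d}",
        }[name]
    return TAG.sub(substitute, pattern)

# Assumed run time: 8:10 on 2003-03-27, so the tags point at the previous day.
run_time = datetime(2003, 3, 27, 8, 10)
apache_log = expand_logfile(
    "/path/to/apache/logs/access_log.%YYYY-24%MM-24%DD-24", run_time)
iis_log = expand_logfile(r"d:\iis_logs\W3SV3\ex%YY-24%MM-24%DD-24.log", run_time)
print(apache_log)
print(iis_log)
```

Because the subtraction is done on the full timestamp, a run on the first day of a month correctly resolves to the last day of the previous month.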

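Other points to note:
By default AWStats does not filter out swf files, so every .swf request is counted as a page view. If the site serves advertisements mainly as swf files, it is best to filter them out, for example by adding swf to the extensions listed in the NotPageList setting.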
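
The effect of such a filter on the page count can be sketched in Python. This is a simplified model of "count as a page unless the extension is excluded"; the extension set and the request list are hypothetical, and the real list lives in the AWStats configuration.

```python
import posixpath
from urllib.parse import urlsplit

# Hypothetical extension list for illustration only.
NOT_PAGE_EXTENSIONS = {"css", "js", "class", "gif", "jpg", "jpeg", "png", "bmp", "swf"}

def is_page(url, not_page=NOT_PAGE_EXTENSIONS):
    """Return True when the requested file should be counted as a page view."""
    path = urlsplit(url).path
    extension = posixpath.splitext(path)[1].lstrip(".").lower()
    return extension not in not_page

# Every request is a hit; only some of them are pages.
requests = ["/index.html", "/ads/banner.swf", "/img/logo.gif", "/news/"]
pages = [url for url in requests if is_page(url)]
print(len(pages), "pages out of", len(requests), "hits")
```

Removing "swf" from the set makes the banner count as a page again, which is the inflation the note above warns about.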

Log Analysis

./awstats.pl -config=sitename -lang=cn
For example: ./awstats.pl -config=chedong
This automatically loads the awstats.chedong.conf configuration file.

Statistics Output

GNU/Linux: **localhost/cgi-bin/awstats/
Windows 2000: **localhost/awstats/awstats.chedong.html

Running log statistics automatically

On GNU/Linux: use crontab -e to run the update every day at 8:10
# Update awstats
10 8 * * * (cd /path/to/apache/cgi-bin/awstats/; ./awstats.pl -config=chedong)

Windows 2000: create a scheduled task that runs every day at 8:10
D:\Perl\bin\perl.exe d:\AWStats\tools\awstats_buildstaticpages.pl -config=chedong -lang=cn -dir=c:\inetpub\awstats\ -awstatsprog=d:\awstats\wwwroot\cgi-bin\awstats.pl

Multi-site log statistics

AWStats comes with a batch tool under tools/ that can traverse a directory and run the statistics for every configuration file in it. The main remaining problem is therefore synchronizing the logs.

With multiple sites, many configuration options are repeated, and changing every configuration file separately is troublesome to maintain. Since version 5.4 AWStats can include one configuration file from another, so we can keep a general configuration, such as common.conf.

The other sites then include it; options set after the Include line override the inconsistent values of the default configuration.
Include "common.conf"
LogFile="/path/to/bbs_log"
SiteDomain="bbs.chedong**"

Include "common.conf"
LogFile="/path/to/www_log"
SiteDomain="**chedong**"
HostAliases="chedong**"

Description of statistical indicators

  • Visitors: unique visitors are counted by IP address; one IP stands for one visitor;
  • Number of visits: a visitor may visit several times in one day (for example, once in the morning and once in the afternoon), so the number of visits is the number of distinct IPs counted within a given period (for example, 1 hour);
  • Pages: the total number of pure page views, not including images, CSS, JavaScript files and so on. If a page uses multiple frames, each frame counts as one page request;
  • Number of files (hits): the total number of file requests from browser clients, including images, CSS, JavaScript and so on. When a user requests a page that contains pictures, the browser sends multiple file requests to the server, so the number of files is generally much larger than the number of pages;
  • Bytes: the total amount of data transferred to clients;
  • Data from REFERER: the referrer (REFERER) field of the log records the address of the page visited just before the requested page. If the user reached the site by clicking a search engine result, the log contains the user's query address at that search engine, and the keywords the user searched for can be extracted from that address.
    For example:
    2003-03-26 15:43:58 - GET /index.html 200 192 HTTP/1.1 Mozilla/4.0+(compatible;+MSIE+5.01;+Windows+NT+5.0) ****google**/search?q=chedong
    The search engine key phrase and keyword statistics in AWStats are fairly complete: it can identify more than 300 kinds of crawlers worldwide, as well as most of the major international search engines and many regional, local-language search engines.
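
The keyword extraction described for the REFERER field can be sketched as follows. This is a minimal Python illustration, not the AWStats implementation: the engine table, the function name `extract_keyphrase` and the referrer URLs are assumptions made up for the example.

```python
from urllib.parse import parse_qs, urlsplit

# Hypothetical table: host fragment -> query parameter that holds the search terms.
SEARCH_ENGINES = {"google.": "q", "search.example": "wd"}

def extract_keyphrase(referer):
    """Return the search phrase carried by a search engine referrer, else None."""
    parts = urlsplit(referer)
    host = parts.hostname or ""
    for fragment, parameter in SEARCH_ENGINES.items():
        if fragment in host:
            values = parse_qs(parts.query).get(parameter)
            return values[0] if values else None
    return None  # not a known search engine

phrase = extract_keyphrase("http://www.google.com/search?q=chedong+awstats")
print(phrase)
```

Supporting another search engine in this model only means adding one entry to the table, which is roughly what a search engine definition patch has to provide.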

Hacking AWStats

Installing the geographic (GeoIP) plug-ins:

GeoIP and Geo::IPfree (AWStats 5.5+)
GeoIP and Geo::IPfree both provide free country/IP mapping tables, which give more accurate and faster statistics than resolving domains through reverse DNS. The GeoIP API is free and its default database is free; the paid part is the data update service. For Geo::IPfree, not only the code is open but the data is public as well.

GeoIP installation:
First download the GeoIP C library and unpack it:
% ./configure; make
# make install

Then download the GeoIP Perl library and unpack it:
% perl Makefile.PL; make
# make install

Geo::IPfree installation:
Download Geo::IPfree and unpack it:
% perl Makefile.PL
% make
# make install

Configuration: enable the GeoIP-related plug-ins in the configuration file:

LoadPlugin="geoip GEOIP_STANDARD /home/apache/chedong**/cgi-bin/awstats/GeoIP.dat"
LoadPlugin="geoip_city_maxmind GEOIP_STANDARD /home/apache/chedong**/cgi-bin/awstats/GeoLiteCity.dat"

MaxMind currently also provides a free GeoLite City data package, which can be downloaded regularly each month from the following addresses:

wget **geolite.maxmind**/download/geoip/database/GeoLiteCity.dat.gz
wget **geolite.maxmind**/download/geoip/database/GeoLiteCountry/GeoIP.dat.gz

Judging from the statistics, the data has generally been updated within the last 3 months. In addition, CSV source files are also offered under **geolite.maxmind**/download/geoip/database/. Also, using the QQ "pure" (Chunzhen) IP library gives more detailed statistics on geographic distribution.

Author: Che Dong. Published: 2003-04-09 16:04. Last updated: 2009-07-10 11:07
Copyright: this article may be reproduced; when reprinting, please be sure to indicate the original source, the author information and this statement in the form of a hyperlink.
