ServerLogParser
ServerLogParser is a high-level Ruby library for parsing server log files in the Common Log Format (with or without virtual hosts) and the Combined Log Format, as used by Apache, Nginx and others.
It's a fork of ApacheLogRegex, which was in turn a port of the Apache::LogRegex 1.4 Perl module, where much of the regex logic comes from.
Installation
gem install server_log_parser
Usage
Initialization
require 'server_log_parser'
parser = ServerLogParser::Parser.new(ServerLogParser::COMBINED_VIRTUAL_HOST)
# or:
# parser = ServerLogParser::Parser.new('%v %h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-agent}i\"')
Parsing
File.foreach('/var/log/apache/access.log') do |line|
  parsed = parser.parse(line)
  # {
  #   '%h' => '212.74.15.68',
  #   '%l' => '-',
  #   '%u' => '-',
  #   '%t' => '[23/Jan/2004:11:36:20 +0000]',
  #   '%r' => 'GET /images/previous.png HTTP/1.1',
  #   '%>s' => '200',
  #   '%b' => '2607',
  #   '%{Referer}i' => 'http://peterhi.dyndns.org/bandwidth/index.html',
  #   '%{User-Agent}i' => 'Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.2) Gecko/20021202'
  # }
end
ServerLogParser#parse will silently ignore errors, but if you'd prefer,
ServerLogParser#parse! will raise a ParseError exception.
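To give a feel for how a format string maps onto the keys of the parsed hash, here is a minimal sketch. It is not the gem's implementation: the helper names `sketch_regex_for` and `sketch_parse` are hypothetical, and it assumes a space-separated format written with plain quotes rather than Apache's `\"`.

```ruby
# Illustrative only: NOT the gem's implementation.
# Turns a space-separated format string into a regex plus a list of field names.
def sketch_regex_for(format)
  names = []
  parts = format.split(' ').map do |token|
    quoted = token.start_with?('"') && token.end_with?('"')
    name = quoted ? token[1..-2] : token
    names << name
    if quoted
      '"([^"]*)"'      # quoted field: everything up to the closing quote
    elsif name == '%t'
      '(\[[^\]]+\])'   # timestamp: bracketed and contains a space
    else
      '(\S*)'          # plain field: a run of non-space characters
    end
  end
  [Regexp.new("\\A#{parts.join(' ')}\\s*\\z"), names]
end

# Returns a hash keyed by directive, or nil when the line does not match.
def sketch_parse(format, line)
  regex, names = sketch_regex_for(format)
  match = regex.match(line)
  return nil unless match

  names.zip(match.captures).to_h
end

format = '%h %l %u %t "%r" %>s %b'
line = '212.74.15.68 - - [23/Jan/2004:11:36:20 +0000] "GET /images/previous.png HTTP/1.1" 200 2607'
p sketch_parse(format, line)['%h'] # => "212.74.15.68"
```

Returning nil for a non-matching line is a choice made for this sketch; it mirrors the "silently ignore" behaviour described above only loosely.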
Handling
File.foreach('/var/log/apache/access.log') do |line|
  parsed = parser.handle(line)
  # {
  #   '%h' => '212.74.15.68',
  #   '%l' => nil,
  #   '%u' => nil,
  #   '%t' => DateTime.new(2004, 1, 23, 11, 36, 20, '+0'),
  #   '%r' => {"method" => "GET", "resource" => "/images/previous.png", "protocol" => "HTTP/1.1"},
  #   '%>s' => 200,
  #   '%b' => 2607,
  #   '%{Referer}i' => 'http://peterhi.dyndns.org/bandwidth/index.html',
  #   '%{User-Agent}i' => 'Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.2) Gecko/20021202'
  # }
end
Apache log files use - to mean that no data is present; these values are replaced with nil,
like the %l and %u values above. The request (%r) is split into a nested hash.
The following fields are stored as Integer: %B, %b, %k, %p, %{format}p,
%P, %{format}P, %s, %>s, %I, %O.
The following fields are stored as Float: %D, %T.
The following fields are stored as DateTime: %t.
Note: %{format}t is currently stored as String.
The field %r is special, see above.
All other fields are stored as String.
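The conversion rules above can be sketched in plain Ruby as follows. This is not the gem's implementation: `sketch_convert` and `sketch_handle` are hypothetical names, the timestamp pattern is an assumption based on the sample value shown earlier, and the parameterized `%{format}p` and `%{format}P` variants are left out for brevity.

```ruby
require 'date'

# Illustrative only: NOT the gem's implementation.
INTEGER_FIELDS = %w[%B %b %k %p %P %s %>s %I %O].freeze
FLOAT_FIELDS   = %w[%D %T].freeze

def sketch_convert(field, value)
  return nil if value == '-' # '-' means "no data"

  case field
  when *INTEGER_FIELDS then Integer(value, 10)
  when *FLOAT_FIELDS   then Float(value)
  when '%t'            then DateTime.strptime(value, '[%d/%b/%Y:%H:%M:%S %z]')
  when '%r'
    # Split "GET /path HTTP/1.1" into its three parts.
    verb, resource, protocol = value.split(' ', 3)
    { 'method' => verb, 'resource' => resource, 'protocol' => protocol }
  else value # everything else stays a String
  end
end

# Applies the per-field conversion to a hash of raw strings.
def sketch_handle(raw)
  raw.map { |field, value| [field, sketch_convert(field, value)] }.to_h
end

raw = { '%l' => '-', '%>s' => '200', '%r' => 'GET /images/previous.png HTTP/1.1' }
p sketch_handle(raw)['%>s'] # => 200
```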
ServerLogParser#handle will silently ignore errors, but if you'd prefer,
ServerLogParser#handle! will raise a ParseError exception.
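Because handled values are typed, they can be aggregated directly. The sketch below works on hashes shaped like the handled output above; `summarize` is a hypothetical helper, the second and third sample records are made up, and skipping nil entries rests on the assumption that a line which fails to parse yields nil.

```ruby
# Counts requests per status code and sums response sizes.
def summarize(records)
  by_status = Hash.new(0)
  total_bytes = 0
  records.each do |record|
    next if record.nil? # assumption: a failed line yields nil

    by_status[record['%>s']] += 1
    total_bytes += record['%b'] || 0 # '%b' is nil when the log had '-'
  end
  { 'by_status' => by_status, 'total_bytes' => total_bytes }
end

records = [
  { '%>s' => 200, '%b' => 2607 },
  { '%>s' => 404, '%b' => nil }, # made-up sample
  { '%>s' => 200, '%b' => 100 }  # made-up sample
]
p summarize(records)['total_bytes'] # => 2707
```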
Log Formats
The log format is specified using one of the following (rather verbose) constants, or an equivalent Apache format string:
| Name | Constant | Apache Format |
|---|---|---|
| Common Log Format | ServerLogParser::COMMON_LOG_FORMAT | %h %l %u %t \"%r\" %>s %b |
| Common Log Format with virtual hosts | ServerLogParser::COMMON_LOG_FORMAT_VIRTUAL_HOST | %v %h %l %u %t \"%r\" %>s %b |
| Combined | ServerLogParser::COMBINED | %h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-agent}i\" |
| Combined with virtual hosts | ServerLogParser::COMBINED_VIRTUAL_HOST | %v %h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-agent}i\" |
Author
Alexander Kurakin <kuraga333@mail.ru>
Feedback and contributions
https://github.com/kuraga/server_log_parser/issues
License
MIT