No commit activity in last 3 years
No release in over 3 years
ServerLogParser provides a high-level Ruby library for parsing server log files (common log format, with or without virtual hosts, and combined log format) as used by Apache, Nginx and others.
 Project Readme

ServerLogParser

ServerLogParser provides a high-level Ruby library for parsing server log files (common log format, with or without virtual hosts, and combined log format) as used by Apache, Nginx and others.

It's a fork of ApacheLogRegex, which was in turn a port of the Apache::LogRegex 1.4 Perl module, where many of the regex parts come from.

Installation

gem install server_log_parser

Usage

Initialization

require 'server_log_parser'

parser = ServerLogParser::Parser.new(ServerLogParser::COMBINED_VIRTUAL_HOST)
# or:
# parser = ServerLogParser::Parser.new('%v %h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-agent}i\"')

Parsing

File.foreach('/var/log/apache/access.log') do |line|
  parsed = parser.parse(line)
  # {
  #   '%h'  => '212.74.15.68',
  #   '%l'  => '-',
  #   '%u'  => '-',
  #   '%t'  => '[23/Jan/2004:11:36:20 +0000]',
  #   '%r'  => 'GET /images/previous.png HTTP/1.1',
  #   '%>s' => '200',
  #   '%b'  => '2607',
  #   '%{Referer}i'     => 'http://peterhi.dyndns.org/bandwidth/index.html',
  #   '%{User-Agent}i'  => 'Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.2) Gecko/20021202'
  # }
end

ServerLogParser::Parser#parse silently ignores errors; if you'd prefer an exception, ServerLogParser::Parser#parse! raises a ParseError.
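
To make the idea behind parsing more concrete, here is a minimal, self-contained sketch of how a format string could be compiled into a regular expression and applied to a log line. It is not the gem's actual implementation: the helper names compile_format and parse_line are hypothetical, and the patterns are deliberately simplified (for example, escaped quotes inside quoted fields are not handled).

```ruby
# Illustrative sketch only; this is not the gem's actual implementation.
# Format string copied from the Combined format in this README.
COMBINED_FORMAT = '%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-agent}i\"'

# Hypothetical helper: turn a format string into [Regexp, field names].
def compile_format(format)
  names = []
  patterns = format.split(' ').map do |element|
    quoted = element.start_with?('\"')   # directive wrapped in \"...\"
    name = element.gsub('\"', '')        # strip the escaped quotes
    names << name
    if quoted
      '"([^"]*)"'                        # quoted field (escaped quotes not handled)
    elsif name == '%t'
      '(\[[^\]]+\])'                     # timestamp in square brackets
    else
      '(\S+)'                            # any run of non-space characters
    end
  end
  [Regexp.new("\\A#{patterns.join(' ')}\\s*\\z"), names]
end

# Hypothetical helper: return a Hash of field name => raw string,
# or nil when the line does not match the format.
def parse_line(line, format = COMBINED_FORMAT)
  regex, names = compile_format(format)
  match = regex.match(line)
  match && names.zip(match.captures).to_h
end

line = '212.74.15.68 - - [23/Jan/2004:11:36:20 +0000] "GET /images/previous.png HTTP/1.1" ' \
       '200 2607 "http://peterhi.dyndns.org/bandwidth/index.html" ' \
       '"Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.2) Gecko/20021202"'
p parse_line(line)
```

In this sketch every value stays a String, and the keys are the directives exactly as they are written in the format string.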

Handling

File.foreach('/var/log/apache/access.log') do |line|
  parsed = parser.handle(line)
  # {
  #   '%h'  => '212.74.15.68',
  #   '%l'  => nil,
  #   '%u'  => nil,
  #   '%t'  => DateTime.new(2004, 1, 23, 11, 36, 20, '+0'),
  #   '%r'  => {"method" => "GET", "resource" => "/images/previous.png", "protocol" => "HTTP/1.1"},
  #   '%>s' => 200,
  #   '%b'  => 2607,
  #   '%{Referer}i'     => 'http://peterhi.dyndns.org/bandwidth/index.html',
  #   '%{User-Agent}i'  => 'Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.2) Gecko/20021202'
  # }
end

Apache log files use - to mean that no data is present; such values are replaced with nil, like the %l and %u values above. The request line (%r) is split into a nested hash with the keys "method", "resource" and "protocol".

The following fields are stored as Integer: %B, %b, %k, %p, %{format}p, %P, %{format}P, %s, %>s, %I, %O.

The following fields are stored as Float: %D, %T.

The following fields are stored as DateTime: %t. Note: %{format}t is currently stored as String.

The field %r is special, see above.

All other fields are stored as String.

ServerLogParser::Parser#handle silently ignores errors; if you'd prefer an exception, ServerLogParser::Parser#handle! raises a ParseError.
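
The conversion rules described in this section can be sketched in plain Ruby as follows. This is an illustration of the documented behaviour, not the gem's own code: the names convert_field and convert_record are hypothetical, and the timestamp pattern assumes the default Apache %t layout shown in the examples above.

```ruby
require 'date'

# Illustrative sketch only; helper names are hypothetical, not the gem's API.
INTEGER_FIELDS = %w[%B %b %k %p %P %s %>s %I %O].freeze
FLOAT_FIELDS = %w[%D %T].freeze
PORT_OR_PID = /\A%\{[^}]*\}[pP]\z/ # matches %{format}p and %{format}P

# Convert one raw string value according to its field name.
def convert_field(name, value)
  return nil if value == '-' # "-" means no data is present

  case name
  when *INTEGER_FIELDS, PORT_OR_PID then Integer(value, 10)
  when *FLOAT_FIELDS then Float(value)
  when '%t' then DateTime.strptime(value, '[%d/%b/%Y:%H:%M:%S %z]')
  when '%r'
    method, resource, protocol = value.split(' ', 3)
    { 'method' => method, 'resource' => resource, 'protocol' => protocol }
  else value # everything else stays a String
  end
end

# Convert a whole hash of raw values, such as the one shown under Parsing.
def convert_record(raw)
  raw.each_with_object({}) do |(name, value), typed|
    typed[name] = convert_field(name, value)
  end
end

raw = {
  '%h' => '212.74.15.68',
  '%l' => '-',
  '%t' => '[23/Jan/2004:11:36:20 +0000]',
  '%r' => 'GET /images/previous.png HTTP/1.1',
  '%>s' => '200',
  '%b' => '2607'
}
p convert_record(raw)
```

Because Integer and Float raise ArgumentError on malformed input, a wrapper would have to decide whether to rescue (in the spirit of handle) or let the error propagate (in the spirit of handle!).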

Log Formats

The log format is specified using one of several rather verbose constants, which map to Apache formats as follows:

Name | Constant | Apache Format
Common Log Format | ServerLogParser::COMMON_LOG_FORMAT | %h %l %u %t \"%r\" %>s %b
Common Log Format with virtual hosts | ServerLogParser::COMMON_LOG_FORMAT_VIRTUAL_HOST | %v %h %l %u %t \"%r\" %>s %b
Combined | ServerLogParser::COMBINED | %h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-agent}i\"
Combined with virtual hosts | ServerLogParser::COMBINED_VIRTUAL_HOST | %v %h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-agent}i\"
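
The four formats are closely related: the combined format is the common format plus the Referer and User-agent headers, and each virtual-host variant only adds a leading %v. The short sketch below rebuilds the strings from the table to show this relationship. The constant and method names here are local to the example and are not the gem's own constants.

```ruby
# Illustrative sketch; these names are local to the example, not the gem's constants.
COMMON_SKETCH = '%h %l %u %t \"%r\" %>s %b'
COMBINED_SKETCH = COMMON_SKETCH + ' \"%{Referer}i\" \"%{User-agent}i\"'

# The virtual-host variants simply prepend the %v directive.
def with_virtual_host(format)
  "%v #{format}"
end

puts with_virtual_host(COMMON_SKETCH)
puts with_virtual_host(COMBINED_SKETCH)
```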

Author

Alexander Kurakin <kuraga333@mail.ru>

Feedback and contributions

https://github.com/kuraga/server_log_parser/issues

License

MIT