Fluentd output filter plugin. It is designed to rewrite tags like mod_rewrite. It re-emits a record with a rewritten tag when a value matches or does not match a regular expression. You can also change the tag of an Apache log by domain, status code (e.g. 500 error), user-agent, request-uri, regex backreference, and so on, using regular expressions.

fluent-plugin-rewrite-tag-filter

Overview

Rewrite Tag Filter for Fluentd. It is designed to rewrite tags like mod_rewrite.
It re-emits a record with a rewritten tag when a value matches or does not match a regular expression.
You can also change the tag of an Apache log by domain, status code (e.g. 500 error),
user-agent, request-uri, regex backreference, and so on, using regular expressions.

This is an output plugin because Fluentd's filter plugins cannot rewrite tags.

Requirements

fluent-plugin-rewrite-tag-filter | Fluentd    | Ruby
>= 2.0.0                         | >= v0.14.2 | >= 2.1
< 2.0.0                          | >= v0.12.0 | >= 1.9

Installation

Install with the gem or td-agent-gem command:

# for system installed fluentd
$ gem install fluent-plugin-rewrite-tag-filter

# for td-agent2 (with fluentd v0.12)
$ sudo td-agent-gem install fluent-plugin-rewrite-tag-filter -v 1.6.0

# for td-agent3 (with fluentd v0.14)
$ sudo td-agent-gem install fluent-plugin-rewrite-tag-filter

For more details, see Plugin Management in the Fluentd documentation.

Configuration

  • rewriterule<num> (string) (optional): <attribute> <regex_pattern> <new_tag>
    • Obsolete: use the <rule> section instead
  • capitalize_regex_backreference (bool) (optional): Capitalize the first letter of every matched regex backreference (e.g. maps -> Maps). For more details, see Usage.
    • Default value: no
  • remove_tag_prefix (string) (optional): Remove this prefix from the tag used for the tag placeholder (see the "Tag placeholder" section).
  • remove_tag_regexp (regexp) (optional): Remove the part of the tag that matches this regexp for the tag placeholder (see the "Tag placeholder" section).
  • hostname_command (string) (optional): Override the command used to obtain the hostname for the placeholder (see the "Tag placeholder" section).
    • Default value: hostname
  • emit_mode (enum) (optional): Either batch or record. batch emits events once per rewritten tag, which reduces IO; record emits events once per record.
    • Default value: batch
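
The difference between the two emit_mode values can be pictured with a small Ruby sketch. This is not the plugin's code: emit_batch and emit_record are hypothetical helpers, and the input is assumed to be a list of [rewritten_tag, record] pairs whose tags have already been computed.

```ruby
# Hypothetical illustration of emit_mode; not the plugin's implementation.
# events: array of [rewritten_tag, record] pairs.

# batch: one emit per distinct rewritten tag, carrying all records for that tag.
def emit_batch(events)
  events.group_by { |tag, _record| tag }
        .map { |tag, pairs| [tag, pairs.map { |_tag, record| record }] }
end

# record: one emit per record, even when several records share a tag.
def emit_record(events)
  events.map { |tag, record| [tag, [record]] }
end

events = [
  ["site.ExampleMaps", { "id" => 1 }],
  ["site.ExampleNews", { "id" => 2 }],
  ["site.ExampleMaps", { "id" => 3 }],
]

puts emit_batch(events).size   # 2 emits (one per tag)
puts emit_record(events).size  # 3 emits (one per record)
```

Fewer, larger emits are why batch is described as reducing IO.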

<rule> section (optional) (multiple)

  • key (string) (required): The field name to which the regular expression is applied
  • pattern (regexp) (required): The regular expression. The /regexp/ form is preferred because it supports patterns containing character classes such as /[a-z]/. A pattern written without slashes causes errors if it starts with a character class.
  • tag (string) (required): New tag
  • label (string) (optional): New label. If specified, label can be changed per-rule.
  • invert (bool) (optional): If true, rewrite the tag when the pattern does not match
    • Default value: false
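
As a rough mental model of how key, pattern, tag and invert interact, here is a minimal Ruby sketch. It assumes rules are tried in order and the first applicable rule wins (which is why a catch-all rule belongs last). The Rule struct and rewrite_tag method are hypothetical names for illustration, and the handling of missing keys is simplified.

```ruby
# Hypothetical stand-in for one <rule> section; not the plugin's internal class.
Rule = Struct.new(:key, :pattern, :tag, :invert)

# Returns the tag of the first rule that applies, or nil when no rule applies.
def rewrite_tag(rules, record)
  rules.each do |rule|
    value = record[rule.key].to_s
    matched = rule.pattern.match?(value)
    # invert flips the decision: the rule applies when the pattern does NOT match.
    return rule.tag if matched != rule.invert
  end
  nil
end

rules = [
  Rule.new("path",   /\.(gif|jpe?g|png|pdf|zip)$/, "clear",          false),
  Rule.new("status", /^200$/,                      "clear",          true),
  Rule.new("domain", /.+/,                         "site.unmatched", false),
]

puts rewrite_tag(rules, { "path" => "/logo.png",   "status" => "200", "domain" => "a.example.com" }) # clear
puts rewrite_tag(rules, { "path" => "/index.html", "status" => "500", "domain" => "a.example.com" }) # clear
puts rewrite_tag(rules, { "path" => "/index.html", "status" => "200", "domain" => "a.example.com" }) # site.unmatched
```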

Usage

This sample excludes logs for some static files before splitting the tag by domain.

<source>
  @type tail
  path /var/log/httpd/access_log
  format apache2
  time_format %d/%b/%Y:%H:%M:%S %z
  tag td.apache.access
  pos_file /var/log/td-agent/apache_access.pos
</source>

# "capitalize_regex_backreference yes" converts the first letter of every matched backreference to upper case. ex: maps -> Maps
# The 2nd <rule> sends records whose status code is not 200 to the tag "clear".
# The 3rd <rule> sends records whose domain does not end with ".com" to the tag "clear".
# In the 6th <rule>, "site.$2$1" becomes "site.ExampleMail" because of the capitalize_regex_backreference option.
<match td.apache.access>
  @type rewrite_tag_filter
  capitalize_regex_backreference yes
  <rule>
    key     path
    pattern /\.(gif|jpe?g|png|pdf|zip)$/
    tag clear
  </rule>
  <rule>
    key     status
    pattern /^200$/
    tag     clear
    invert  true
  </rule>
  <rule>
    key     domain
    pattern /^.+\.com$/
    tag     clear
    invert  true
  </rule>
  <rule>
    key     domain
    pattern /^maps\.example\.com$/
    tag     site.ExampleMaps
  </rule>
  <rule>
    key     domain
    pattern /^news\.example\.com$/
    tag     site.ExampleNews
  </rule>
  <rule>
    key     domain
    pattern /^(mail)\.(example)\.com$/
    tag     site.$2$1
  </rule>
  # Note: Put a catch-all rule in the last block so that unmatched records are not lost
  <rule>
    key     domain
    pattern /.+/
    tag     site.unmatched
  </rule>
</match>

<match site.*>
  @type mongo
  host localhost
  database apache_access
  remove_tag_prefix site
  tag_mapped
  capped
  capped_size 100m
</match>

<match clear>
  @type null
</match>

Result

$ mongo
MongoDB shell version: 2.2.0
> use apache_access
switched to db apache_access
> show collections
ExampleMaps
ExampleNews
ExampleMail
unmatched
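
The ExampleMail collection comes from the rule whose tag is site.$2$1. A minimal Ruby sketch of that substitution follows. The helper name expand_backreferences is hypothetical, and the use of String#capitalize is an assumption about how the capitalization is performed.

```ruby
# Hypothetical helper: replaces $1, $2, ... in a tag template with the
# corresponding capture groups, optionally capitalizing each one.
def expand_backreferences(tag_template, captures, capitalize: false)
  tag_template.gsub(/\$(\d+)/) do
    group = captures[Regexp.last_match(1).to_i].to_s
    capitalize ? group.capitalize : group
  end
end

captures = /^(mail)\.(example)\.com$/.match("mail.example.com")

puts expand_backreferences("site.$2$1", captures, capitalize: true) # site.ExampleMail
puts expand_backreferences("site.$2$1", captures)                   # site.examplemail
```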

Debug

When td-agent starts, it logs each configured rule as shown below.

$ tailf /var/log/td-agent/td-agent.log
2012-09-16 18:10:51 +0900: adding match pattern="td.apache.access" type="rewrite_tag_filter"
2012-09-16 18:10:51 +0900: adding rewrite_tag_filter rule: [1, "path", /\.(gif|jpe?g|png|pdf|zip)$/, "clear"]
2012-09-16 18:10:51 +0900: adding rewrite_tag_filter rule: [2, "domain", /^maps\.example\.com$/, "site.ExampleMaps"]
2012-09-16 18:10:51 +0900: adding rewrite_tag_filter rule: [3, "domain", /^news\.example\.com$/, "site.ExampleNews"]
2012-09-16 18:10:51 +0900: adding rewrite_tag_filter rule: [4, "domain", /^(mail)\.(example)\.com$/, "site.$2$1"]
2012-09-16 18:10:51 +0900: adding rewrite_tag_filter rule: [5, "domain", /.+/, "site.unmatched"]

Nested attributes

Dot notation:

<match kubernetes.**>
  @type rewrite_tag_filter
  <rule>
    key $.kubernetes.namespace_name
    pattern ^(.+)$
    tag $1.${tag}
  </rule>
</match>

Bracket notation:

<match kubernetes.**>
  @type rewrite_tag_filter
  <rule>
    key $['kubernetes']['namespace_name']
    pattern ^(.+)$
    tag $1.${tag}
  </rule>
</match>

These example configurations can process nested attributes such as the following:

{
  "kubernetes": {
    "namespace_name": "default"
  }
}

When the original tag is kubernetes.var.log, it is rewritten to default.kubernetes.var.log.
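
The same lookup and rewrite can be traced in plain Ruby. This is only a sketch of the behavior described above: nested_value and namespaced_tag are hypothetical helpers, and Hash#dig stands in for the plugin's nested-key resolution.

```ruby
# Hypothetical helper: resolves a nested key given as a list of path segments.
def nested_value(record, *path)
  record.dig(*path)
end

# Mirrors the rule above: pattern ^(.+)$ captures the whole value as $1,
# and the new tag is "$1.${tag}".
def namespaced_tag(original_tag, record)
  namespace = nested_value(record, "kubernetes", "namespace_name").to_s
  match = /^(.+)$/.match(namespace)
  return nil unless match # empty or missing value: the rule does not apply
  "#{match[1]}.#{original_tag}"
end

record = { "kubernetes" => { "namespace_name" => "default" } }
puts namespaced_tag("kubernetes.var.log", record) # default.kubernetes.var.log
```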

Tag placeholder

The following placeholders are supported in new_tag (the rewritten tag):

  • ${tag}
  • __TAG__
  • ${tag_parts[n]}
  • __TAG_PARTS[n]__
  • ${hostname}
  • __HOSTNAME__

The ${tag_parts[n]} and __TAG_PARTS[n]__ placeholders refer to the n-th element (zero-based) of the tag split on "." (dot).
For example, with the tag td.apache.access, ${tag_parts[0]} is td and ${tag_parts[1]} is apache.

Note: Currently, range expressions such as ${tag_parts[0..2]} are not supported.
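
The indexing described above can be sketched in a few lines of Ruby. expand_tag_parts is a hypothetical helper written for illustration, not a function of the plugin. It handles only single indexes, matching the note about unsupported ranges.

```ruby
# Hypothetical helper: expands ${tag_parts[n]} and __TAG_PARTS[n]__ using the
# original tag split on ".".
def expand_tag_parts(template, tag)
  parts = tag.split(".")
  template.gsub(/\$\{tag_parts\[(\d+)\]\}|__TAG_PARTS\[(\d+)\]__/) do
    index = (Regexp.last_match(1) || Regexp.last_match(2)).to_i
    parts[index].to_s # an out-of-range index expands to an empty string here
  end
end

puts expand_tag_parts('${tag_parts[0]}', "td.apache.access")     # td
puts expand_tag_parts('__TAG_PARTS[1]__', "td.apache.access")    # apache
puts expand_tag_parts('rewritten.${tag_parts[1]}.${tag_parts[2]}', "app.game.pool.activity") # rewritten.game.pool
```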

Placeholder Options

  • remove_tag_prefix

This option removes the given prefix from the tag before it is substituted for ${tag} or __TAG__.

  • remove_tag_regexp

This option removes the part of the tag that matches the given regexp before it is substituted for ${tag} or __TAG__.

  • hostname_command

By default, the hostname command is executed to obtain the full hostname.
If needed, you can override this command with the hostname_command option.
For example, hostname_command hostname -s yields the short hostname.
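
The effect of the two tag-stripping options on ${tag} can be sketched as follows. tag_for_placeholder is a hypothetical helper. Stripping the dot that follows the prefix is an assumption, inferred from the fact that remove_tag_prefix apache turns apache.access into access.

```ruby
# Hypothetical helper: computes the value substituted for ${tag} / __TAG__.
def tag_for_placeholder(tag, remove_tag_prefix: nil, remove_tag_regexp: nil)
  if remove_tag_prefix
    # Assumption: the prefix and one following dot are removed from the start.
    tag = tag.sub(/\A#{Regexp.escape(remove_tag_prefix)}\.?/, "")
  end
  tag = tag.sub(remove_tag_regexp, "") if remove_tag_regexp
  tag
end

puts tag_for_placeholder("apache.access", remove_tag_prefix: "apache")          # access
puts tag_for_placeholder("apache.access", remove_tag_regexp: /^apache\./)       # access
puts tag_for_placeholder("input.nginx.access.log",
                         remove_tag_regexp: /^input\.(apache|nginx)\./)         # access.log
```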

Placeholder Usage

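These samples rewrite a tag using placeholders. The capitalized results shown in the comments (e.g. ExampleMail) assume that capitalize_regex_backreference yes is also set; with the default (no), the backreference text stays lowercase (examplemail).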

# It will get "rewritten.access.ExampleMail"
<match apache.access>
  @type rewrite_tag_filter
  remove_tag_prefix apache
  <rule>
    key     domain
    pattern ^(mail)\.(example)\.com$
    tag     rewritten.${tag}.$2$1
  </rule>
</match>

# It will get "rewritten.access.ExampleMail"
<match apache.access>
  @type rewrite_tag_filter
  remove_tag_regexp /^apache\./
  <rule>
    key     domain
    pattern ^(mail)\.(example)\.com$
    tag     rewritten.${tag}.$2$1
  </rule>
</match>

# It will get "http.access.log"
<match input.{apache,nginx}.access.log>
  @type rewrite_tag_filter
  remove_tag_regexp /^input\.(apache|nginx)\./
  <rule>
    key     domain
    pattern ^.+$
    tag     http.${tag}
  </rule>
</match>

# It will get "rewritten.ExampleMail.app30-124.foo.com" when hostname is "app30-124.foo.com"
<match apache.access>
  @type rewrite_tag_filter
  <rule>
    key     domain
    pattern ^(mail)\.(example)\.com$
    tag     rewritten.$2$1.${hostname}
  </rule>
</match>

# It will get "rewritten.ExampleMail.app30-124" when hostname is "app30-124.foo.com"
<match apache.access>
  @type rewrite_tag_filter
  hostname_command hostname -s
  <rule>
    key     domain
    pattern ^(mail)\.(example)\.com$
    tag     rewritten.$2$1.${hostname}
  </rule>
</match>

# It will get "rewritten.game.pool"
<match app.game.pool.activity>
  @type rewrite_tag_filter
  <rule>
    key     domain
    pattern ^.+$
    tag     rewritten.${tag_parts[1]}.${tag_parts[2]}
  </rule>
</match>

Altering Labels

In addition to changing tags, you can also change an event's route by setting the label for the re-emitted event.

For example, given this configuration:

<match apache.access>
  @type rewrite_tag_filter
  <rule>
    key     domain
    pattern ^www\.example\.com$
    tag     web.${tag}
  </rule>
  <rule>
    key     domain
    pattern ^(.*)\.example\.com$
    tag     other.$1
    label   other
  </rule>
</match>

A message {"domain": "www.example.com"} gets its tag changed to web.apache.access, while a message {"domain": "api.example.com"} gets its tag changed to other.api and is sent to the label other.
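
The routing outcome for those two messages can be traced with a short Ruby sketch. LabelRule and route are hypothetical names, and the sketch assumes that the first matching rule wins and that a rule without a label leaves the event's label unchanged (represented here as nil).

```ruby
# Hypothetical stand-in for a <rule> with an optional label.
LabelRule = Struct.new(:key, :pattern, :tag, :label)

# Returns the new tag and label for the first matching rule, or nil.
def route(rules, original_tag, record)
  rules.each do |rule|
    m = rule.pattern.match(record[rule.key].to_s)
    next unless m
    tag = rule.tag
              .gsub('${tag}', original_tag)
              .gsub(/\$(\d+)/) { m[Regexp.last_match(1).to_i].to_s }
    return { tag: tag, label: rule.label }
  end
  nil
end

rules = [
  LabelRule.new("domain", /^www\.example\.com$/,  'web.${tag}', nil),
  LabelRule.new("domain", /^(.*)\.example\.com$/, 'other.$1',   "other"),
]

p route(rules, "apache.access", { "domain" => "www.example.com" })
p route(rules, "apache.access", { "domain" => "api.example.com" })
```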

TODO

Pull requests are very welcome!!

Copyright

Copyright : Copyright (c) 2012- Kentaro Yoshida (@yoshi_ken)
License : Apache License, Version 2.0