This is a Fluentd plugin to parse URIs and query strings in log messages.
 Dependencies

Development

~> 13.0
~> 3.6

Runtime

>= 1.0, < 2
 Project Readme

fluent-plugin-uri-parser


Fluentd filter plugins that decompose URIs and query strings into structured fields, so you can search, group, and aggregate on them in your downstream stack.

"https://example.com:8080/search?q=fluentd&lang=ja"
                                ↓
{ scheme: "https", host: "example.com", port: 8080,
  path: "/search", query: "q=fluentd&lang=ja", fragment: nil }

Why?

URL strings sitting in a single log field are a black box — you can't filter by host, group by path, or count by query parameter without parsing them first. These filters turn a raw URL into a record your storage can index.

| Plugin | Turns this | Into this |
| --- | --- | --- |
| uri_parser | https://example.com:8080/p?q=1#x | scheme, host, port, path, query, fragment |
| query_string_parser | foo=bar&hoge=fuga | { "foo" => "bar", "hoge" => "fuga" } |

Requirements

| fluent-plugin-uri-parser | fluentd | ruby |
| --- | --- | --- |
| >= 0.4.0 | >= v1.0.0 | >= 3.2 |

Installation

gem install fluent-plugin-uri-parser

Or in your Gemfile:

gem "fluent-plugin-uri-parser"

uri_parser

Decomposes a URI field into its components.

Minimal example

<filter access.log>
  @type uri_parser
  key_name url
  out_key_scheme   scheme
  out_key_host     host
  out_key_port     port
  out_key_path     path
  out_key_query    query
  out_key_fragment fragment
</filter>

// input
{ "url": "https://example.com:8080/search?q=fluentd#top" }

// output
{
  "url":      "https://example.com:8080/search?q=fluentd#top",
  "scheme":   "https",
  "host":     "example.com",
  "port":     8080,
  "path":     "/search",
  "query":    "q=fluentd",
  "fragment": "top"
}

The port value uses Addressable::URI#inferred_port, so well-known schemes (http → 80, https → 443, ...) get a port even when the URL omits one.
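
The mapping from URI components to configured output keys can be sketched in plain Ruby. This is only an illustration, not the plugin's code: the `decompose` helper is hypothetical, and the standard library's `URI` stands in for Addressable (its `port` also falls back to the scheme's default port for http and https).

```ruby
require "uri"

# Hypothetical helper: decompose `url` and emit only the components that
# have an output key configured, mirroring the out_key_* options.
# Stdlib URI is used in place of Addressable (an assumption of this sketch).
def decompose(url, out_keys)
  uri = URI.parse(url)
  components = {
    scheme:   uri.scheme,
    host:     uri.host,
    port:     uri.port,      # default port of the scheme when the URL omits one
    path:     uri.path,
    query:    uri.query,
    fragment: uri.fragment,
  }
  # Keep only the components the caller asked for, under the requested names.
  out_keys.each_with_object({}) do |(component, key), fields|
    fields[key] = components.fetch(component)
  end
end

p decompose(
  "https://example.com:8080/search?q=fluentd#top",
  { scheme: "scheme", host: "host", port: "port", path: "path" }
)
p decompose("http://example.com/", { port: "port" })
```

The second call has no explicit port in the URL, so the emitted value is the default for the http scheme.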

Group output under a single key — hash_value_field

<filter access.log>
  @type uri_parser
  key_name url
  hash_value_field parsed
  out_key_host host
  out_key_path path
</filter>

// input
{ "url": "https://example.com/search" }

// output
{
  "url": "https://example.com/search",
  "parsed": { "host": "example.com", "path": "/search" }
}
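
As a rough sketch of what nesting means for the record, the following plain Ruby merges parsed fields either at the top level or under one key. The `merge_fields` helper is hypothetical and is not taken from the plugin's source.

```ruby
# Hypothetical helper: merge parsed fields into a record, nesting them
# under `hash_value_field` when one is given.
def merge_fields(record, fields, hash_value_field: nil)
  addition = hash_value_field ? { hash_value_field => fields } : fields
  record.merge(addition)
end

record = { "url" => "https://example.com/search" }
fields = { "host" => "example.com", "path" => "/search" }

p merge_fields(record, fields, hash_value_field: "parsed") # nested under "parsed"
p merge_fields(record, fields)                             # flat, top-level keys
```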

Namespace output keys — inject_key_prefix

<filter access.log>
  @type uri_parser
  key_name url
  inject_key_prefix url.
  out_key_host host
  out_key_path path
</filter>

// input
{ "url": "https://example.com/search" }

// output
{
  "url":      "https://example.com/search",
  "url.host": "example.com",
  "url.path": "/search"
}
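
The prefixing step amounts to a key rename. A minimal sketch, assuming the prefix is simply concatenated in front of each output key (the `prefix_keys` helper is hypothetical):

```ruby
# Hypothetical helper: prepend `prefix` to every extracted key.
def prefix_keys(fields, prefix)
  fields.transform_keys { |key| "#{prefix}#{key}" }
end

fields = { "host" => "example.com", "path" => "/search" }
p prefix_keys(fields, "url.")
```

Because the prefix is used verbatim, any separator such as the trailing dot in `url.` has to be part of the configured value.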

Drop empty components — ignore_nil

When a component is missing (no query, no fragment, etc.) the default is to emit it as null. Set ignore_nil true to omit those keys entirely.

// input
{ "url": "https://example.com/path" }

// ignore_nil false  (default)
{ "scheme": "https", "host": "example.com", "port": 443,
  "path": "/path", "query": null, "fragment": null }

// ignore_nil true
{ "scheme": "https", "host": "example.com", "port": 443,
  "path": "/path" }

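The effect of ignore_nil can be approximated with `Hash#compact`. This is a sketch under assumptions: the `uri_fields` helper is hypothetical and uses the standard library's `URI` rather than the plugin's own parsing.

```ruby
require "uri"

# Hypothetical helper: return all six components, optionally dropping
# the ones that are nil (no query, no fragment, ...).
def uri_fields(url, ignore_nil: false)
  uri = URI.parse(url)
  fields = {
    "scheme"   => uri.scheme,
    "host"     => uri.host,
    "port"     => uri.port,
    "path"     => uri.path,
    "query"    => uri.query,
    "fragment" => uri.fragment,
  }
  ignore_nil ? fields.compact : fields
end

p uri_fields("https://example.com/path")                   # query and fragment are nil
p uri_fields("https://example.com/path", ignore_nil: true) # nil-valued keys omitted
```
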
query_string_parser

Decomposes a query string field into individual parameters.

Pairs with an empty key (e.g. the leading & in &foo=1) are silently dropped — they're noise from user-supplied URLs and never represent a real parameter.

Minimal example

<filter access.log>
  @type query_string_parser
  key_name query
</filter>

// input
{ "query": "foo=bar&hoge=fuga" }

// output
{ "query": "foo=bar&hoge=fuga", "foo": "bar", "hoge": "fuga" }
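
For illustration, the default behavior (one scalar per parameter, empty keys dropped) could look like the following in plain Ruby. The `parse_query` helper is hypothetical; in particular, emitting an empty string for a key without a value is an assumption of this sketch, not documented plugin behavior.

```ruby
require "uri"

# Hypothetical helper: split a query string into a flat Hash.
def parse_query(query)
  query.split("&").each_with_object({}) do |pair, params|
    key, value = pair.split("=", 2)
    # Drop pairs with an empty key, e.g. the leading "&" in "&foo=1".
    next if key.nil? || key.empty?
    # A repeated key overwrites the earlier value, so the last one wins.
    params[URI.decode_www_form_component(key)] =
      URI.decode_www_form_component(value.to_s)
  end
end

p parse_query("foo=bar&hoge=fuga")
p parse_query("&foo=1")
```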

Group output under a single key — hash_value_field

<filter access.log>
  @type query_string_parser
  key_name query
  hash_value_field params
</filter>

// input
{ "query": "foo=bar&hoge=fuga" }

// output
{
  "query":  "foo=bar&hoge=fuga",
  "params": { "foo": "bar", "hoge": "fuga" }
}

Handle repeated parameters

A request like ?tag=ruby&tag=fluentd has two tag values. By default only the last value is kept, as a scalar. You have two ways to keep both:

Option A: always arrays — multi_value_params true

Every parameter becomes an array, even when it appeared once.

// input
{ "query": "tag=ruby&tag=fluentd&lang=ja" }

// output
{ "tag": ["ruby", "fluentd"], "lang": ["ja"] }

Option B: array only for listed names — multi_value_param_names

You know tag may repeat but lang won't. Keep lang as a scalar and only wrap tag in an array.

<filter access.log>
  @type query_string_parser
  key_name query
  multi_value_param_names tag
</filter>

// input
{ "query": "tag=ruby&tag=fluentd&lang=ja" }

// output
{ "tag": ["ruby", "fluentd"], "lang": "ja" }

When both are set, multi_value_params true wins.
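
The three modes and their precedence can be summarized in a short Ruby sketch. The `collect_params` helper and its keyword arguments are hypothetical names that mirror the options; the plugin's actual implementation may differ.

```ruby
require "uri"

# Hypothetical helper: collect query parameters, choosing per key whether
# to emit an array of all values or only the last value as a scalar.
def collect_params(query, multi_value_params: false, multi_value_param_names: [])
  grouped = Hash.new { |hash, key| hash[key] = [] }
  URI.decode_www_form(query).each do |key, value|
    next if key.empty? # pairs with an empty key are dropped
    grouped[key] << value
  end

  grouped.each_with_object({}) do |(key, values), params|
    # multi_value_params takes precedence over the per-name list.
    as_array = multi_value_params || multi_value_param_names.include?(key)
    params[key] = as_array ? values : values.last
  end
end

query = "tag=ruby&tag=fluentd&lang=ja"
p collect_params(query)                                   # last value wins
p collect_params(query, multi_value_params: true)         # every value is an array
p collect_params(query, multi_value_param_names: ["tag"]) # only tag is an array
```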


Options

Shared between both filters unless noted.

| Option | Type | Default | What it does |
| --- | --- | --- | --- |
| key_name | string | (required) | Record key holding the URL or query string to parse. |
| hash_value_field | string | nil | If set, all extracted fields are nested under this key. |
| inject_key_prefix | string | nil | Prefix prepended to every extracted key. |
| ignore_key_not_exist | bool | false | When key_name is missing, drop the record instead of passing it through. |
| emit_invalid_record_to_error | bool | true | When key_name is missing, emit the record to Fluentd's error stream. |
| suppress_parse_error_log | bool | false | Silence the warning log when the value fails to parse. |
| ignore_nil | bool | false | (uri_parser only) Omit output keys whose parsed value is nil. |
| out_key_scheme / out_key_host / out_key_port / out_key_path / out_key_query / out_key_fragment | string | nil | (uri_parser only) Output key name for each URI component. Components without an out_key_* are not emitted. |
| multi_value_params | bool | false | (query_string_parser only) Emit every parameter as an array. |
| multi_value_param_names | array | nil | (query_string_parser only) Emit only the listed parameters as arrays. Ignored when multi_value_params is true. |

Development

bundle install
bundle exec rake test

To install this gem onto your local machine: bundle exec rake install. To release: bump the version in the gemspec, then bundle exec rake release (tags, pushes, and uploads to rubygems.org).

Contributing

Bug reports and pull requests are welcome at https://github.com/daichirata/fluent-plugin-uri-parser.

License

Apache-2.0