No commit activity in last 3 years
No release in over 3 years
Parse large JSON files as a stream and trigger events upon key matching.
 Project Readme

Ruby json_stream_trigger Gem

Instead of parsing a huge JSON file and loading it all into memory, this library streams the bytes through json-stream and creates only a small buffer for objects whose JSONPath matches a pattern you specify. When a matching object is complete, the specified block is called.

Install by adding gem "json_stream_trigger" to your Gemfile.

Example:

f = File.open('really_big_file.json')
stream = JsonStreamTrigger.new()

# Match each array item. If you want the whole array, use $.data
stream.on('$.data[*]') do |json_string|
  import JSON.parse(json_string, :quirks_mode => true)
end

# Will match for $.any.sub[*].item.meta
stream.on('$..meta') do |json_string|
  save_meta JSON.parse(json_string, :quirks_mode => true)
end

# read in 1 KB chunks
while chunk = f.read(1024)
  stream << chunk
end

The captured JSON string buffer is passed to the block. Note that Ruby's JSON library expects to be given JSON documents, not primitives, which is why :quirks_mode => true has been added.
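
To illustrate why primitives come up at all: a path such as $.docs[*].id captures a bare value like 1, not an object or array. The sketch below uses hypothetical captured buffers and assumes version 2.0 or later of the json gem, where JSON.parse accepts top-level scalars without extra options; older versions are the ones that need :quirks_mode => true.

```ruby
require 'json'

# Hypothetical buffers, shaped like what a block might receive.
# The first three are bare scalars rather than full JSON documents.
captured = ['1', '"Tyler"', '0.1', '{"id": 5}']

# Assumes json >= 2.0, which parses top-level scalars by default.
values = captured.map { |json_string| JSON.parse(json_string) }

values.each { |value| puts "#{value.inspect} (#{value.class})" }
```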

Path Details

JSONPaths are similar to XPath notation. $ is the root. Use $.*.version for a single-level wildcard key, or $.docs..name for a multi-level wildcard. More info on JSONPath

A few more examples:

{
  "meta": {"version": 0.1},
  "docs": [
    {"id": 1},
    {"id": 2},
    {"id": 3},
    {"id": 4},
    {
      "id": 5,
      "user": {
        "name": "Tyler"
      }
    }
  ]
}
on('$.docs[*].id') # triggers for id property of every item in docs array
on('$.docs') # returns full array of items
on('$.docs[*]') # triggers for each item in the array
on('$.docs[1].id') # returns value of ID 1
on('$.docs[*].*.name') # returns 'Tyler'
on('$..name') # matches any value whose key is 'name'
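
The wildcard rules above can be sketched as a small standalone matcher. This is only an illustration of the pattern semantics and is not how the gem works internally; SEGMENT, jsonpath_regexp and path_matches? are hypothetical names, and concrete paths are assumed to be written as $.key[index].key.

```ruby
# One concrete path segment: either ".key" or "[index]".
SEGMENT = '(?:\.[^.\[\]]+|\[\d+\])'

# Translate a JSONPath-style pattern into an anchored Regexp.
def jsonpath_regexp(pattern)
  source = Regexp.escape(pattern)
  source = source.gsub('\.\.')   { "#{SEGMENT}*\\." } # ".."  spans any number of levels
  source = source.gsub('\[\*\]') { '\[\d+\]' }        # "[*]" is any array index
  source = source.gsub('\.\*')   { '\.[^.\[\]]+' }    # ".*"  is any single key
  Regexp.new("\\A#{source}\\z")
end

# True when the concrete path is covered by the pattern.
def path_matches?(pattern, path)
  jsonpath_regexp(pattern).match?(path)
end

puts path_matches?('$.docs[*].id', '$.docs[0].id')
puts path_matches?('$.docs[*].*.name', '$.docs[4].user.name')
puts path_matches?('$..name', '$.docs[4].user.name')
puts path_matches?('$..name', '$.meta.version')
```

The block forms of gsub are used so that backslashes in the replacement text are inserted literally.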

Tests

rake test