rusty-dev/ijson-filter

ijson-filter

Iterative JSON filtering tool based on ijson.

Loading, filtering, and output are performed as a stream, without reading the whole file into memory, so large JSON files can be filtered.
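The streaming idea can be illustrated with a minimal sketch. This is not ijson-filter's actual implementation; `limit_stream` is a hypothetical helper that assumes a positive count keeps the first items and a negative count keeps the last ones, matching the examples in this README. It works on any iterator, so with ijson installed the items could come from something like `ijson.items(f, "table.rows.item")`.

```python
from collections import deque
from itertools import islice


def limit_stream(items, count):
    """Yield the first `count` items if count >= 0, else the last `-count` items.

    Hypothetical helper: only a bounded number of items is held at any time.
    """
    if count >= 0:
        # islice stops pulling from the source once `count` items were yielded.
        yield from islice(items, count)
    else:
        # A bounded deque drops the oldest item whenever a new one arrives,
        # so at most `-count` items are kept while the source is consumed.
        yield from deque(items, maxlen=-count)


# Stand-in for a large stream of row objects (a generator, not a list).
rows = ({"name": f"User{i}"} for i in range(1, 1001))
last_three = list(limit_stream(rows, -3))
```

Keeping the last items still requires reading the whole stream, but memory use depends on the limit rather than on the file size.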

Installation:

pip install ijson-filter

ijson uses the YAJL2 library for fast parsing, so installing it as well is highly recommended. ijson-filter still works without YAJL2, but significantly more slowly.

Usage:

Usage: ijson-filter [OPTIONS] [INPUT]

  Streaming JSON filter.

Options:
  -o, --output FILENAME          Output filename, defaults to STDOUT.
  -f, --filter JSON_PATH_FILTER  Filter a JSON path, format:
                                 "PREFIX_PATH[=INT|~REGEX]" Examples: get last
                                 50 elements of data.rows - "data.rows=-50",
                                 get only data.rows and data.description keys
                                 - "data~(rows|description)"
  -v, --verbose                  Verbose output.
  --help                         Show this message and exit.
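To make the "PREFIX_PATH[=INT|~REGEX]" format concrete, here is a minimal sketch of how such a filter string could be split into its parts. The names `FilterSpec` and `parse_filter` are hypothetical and not taken from ijson-filter's source, and the sketch assumes the prefix path itself contains neither `=` nor `~`.

```python
import re
from typing import NamedTuple, Optional, Pattern


class FilterSpec(NamedTuple):
    """Hypothetical parsed form of a PREFIX_PATH[=INT|~REGEX] filter."""
    prefix: str
    limit: Optional[int] = None
    pattern: Optional[Pattern[str]] = None


def parse_filter(spec: str) -> FilterSpec:
    # Split at the first '=' or '~'; everything after it is the argument.
    # Assumption: the prefix path contains neither of these characters.
    match = re.match(r"^([^=~]*)(?:([=~])(.*))?$", spec, re.DOTALL)
    prefix, op, arg = match.groups()
    if op == "=":
        return FilterSpec(prefix, limit=int(arg))
    if op == "~":
        return FilterSpec(prefix, pattern=re.compile(arg))
    return FilterSpec(prefix)


rows_filter = parse_filter("data.rows=-50")
keys_filter = parse_filter("data~(rows|description)")
```

Here `rows_filter.limit` is `-50` and `keys_filter.pattern` is the compiled regular expression `(rows|description)`.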

Examples:

data.json:

{
  "name": "Primary data set #1",
  "table": {
    "description": "Users",
    "rows": [
      { "name": "User1", ... },
      ...
    ]
  }
}
  • Limit the rows field of data.json to its last 50 items and output to STDOUT:
$ ijson-filter -f "table.rows=-50" data.json
  • Remove fields of the table object whose names contain a digit (using a regular expression) from data.json and output to filtered.json:
$ ijson-filter -f 'table~[^\d]+' data.json -o filtered.json
  • Filter output from Unix commands and pipe it to other commands (limit the array to its first 3 items):
$ echo '[1,2,3,4,5]' | ijson-filter -f 3 | python -m json.tool
[
    1,
    2,
    3
]
  • Multiple filters can be applied at once by passing the --filter option multiple times:
$ ijson-filter -f 'table~rows' -f 'table.rows=5' data.json > filtered.json
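As an in-memory illustration of what chaining the two filters above is expected to produce, the sketch below applies a key filter and then a limit to a small document shaped like data.json. It is not how ijson-filter works internally (the tool streams its input), and it assumes that a key is kept only when the regular expression matches the whole key name, which is consistent with the examples above but not confirmed by the tool's documentation. `keep_matching_keys` is a hypothetical helper.

```python
import re


def keep_matching_keys(obj, pattern):
    """Return a copy of `obj` with only the keys fully matched by `pattern`.

    Assumption: whole-key matching (re.fullmatch); the real tool may differ.
    """
    regex = re.compile(pattern)
    return {key: value for key, value in obj.items() if regex.fullmatch(key)}


doc = {
    "name": "Primary data set #1",
    "table": {
        "description": "Users",
        "rows": [{"name": f"User{i}"} for i in range(1, 11)],
    },
}

# In the spirit of: -f 'table~rows' -f 'table.rows=5'
doc["table"] = keep_matching_keys(doc["table"], "rows")  # drops "description"
doc["table"]["rows"] = doc["table"]["rows"][:5]          # keeps the first 5 rows
```

After both steps, `doc["table"]` contains only the `rows` key with the entries User1 through User5, and the top-level `name` field is untouched.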