MAXakaWIZARD/JsonCollectionParser

JsonCollectionParser

Event-based parser for large JSON collections. It handles one item at a time, so it consumes only a small amount of memory. Built on top of JSON Streaming Parser.

This package complies with the PSR-4 autoloading standard and the PSR-12 coding style, and it supports parsing of PSR-7 message interfaces. If you notice compliance oversights, please send a patch via pull request.

Installation

You will need Composer to install the package:

composer require maxakawizard/json-collection-parser:~1.0

Input data format

Data must be in one of the following formats:

Array of objects (valid JSON):

[
    {
        "id": 78,
        "title": "Title",
        "dealType": "sale",
        "propertyType": "townhouse",
        "properties": {
            "bedroomsCount": 6,
            "parking": "yes"
        },
        "photos": [
            "1.jpg",
            "2.jpg"
        ],
        "agents": [
            {
                "name": "Joe",
                "email": "joe@realestate.email"
            },
            {
                "name": "Sally",
                "email": "sally@realestate.email"
            }
        ]
    },
    {
        "id": 729,
        "dealType": "rent_long",
        "propertyType": "villa"
    },
    {
        "id": 5165,
        "dealType": "rent_short",
        "propertyType": "villa"
    }
]

Sequence of object literals:

{
    "id": 78,
    "dealType": "sale",
    "propertyType": "townhouse"
}
{
    "id": 729,
    "dealType": "rent_long",
    "propertyType": "villa"
}
{
    "id": 5165,
    "dealType": "rent_short",
    "propertyType": "villa"
}

Sequence of object and array literals:

[[{
    "id": 78,
    "dealType": "sale",
    "propertyType": "townhouse"
}]]
{
    "id": 729,
    "dealType": "rent_long",
    "propertyType": "villa"
}
[{
    "id": 5165,
    "dealType": "rent_short",
    "propertyType": "villa"
}]

Sequence of object and array literals (some of the objects in subarrays, comma-separated):

[
{
    "id": 78,
    "dealType": "sale",
    "propertyType": "townhouse"
},
{
    "id": 729,
    "dealType": "rent_long",
    "propertyType": "villa"
}
]
{
    "id": 5165,
    "dealType": "rent_short",
    "propertyType": "villa"
}
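
To produce the second format (a sequence of object literals) from PHP, it is enough to encode each item separately and write them one after another. A minimal sketch; `writeObjectSequence` is a hypothetical helper, not part of this package:

```php
<?php

/**
 * Hypothetical helper (not part of this package): writes each item as its
 * own JSON object literal, one per line, without wrapping them in an array.
 * Items should be associative arrays or objects so they encode as {...}.
 */
function writeObjectSequence(string $path, iterable $items): int
{
    $handle = fopen($path, 'w');
    if ($handle === false) {
        throw new RuntimeException("Cannot open {$path} for writing");
    }

    $written = 0;
    foreach ($items as $item) {
        // JSON_THROW_ON_ERROR (PHP 7.3+) surfaces unencodable values early.
        fwrite($handle, json_encode($item, JSON_THROW_ON_ERROR) . "\n");
        $written++;
    }
    fclose($handle);

    return $written;
}
```

Because `$items` is an `iterable`, a generator can be passed in, so the writer does not need the whole collection in memory either.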

Usage

Function as callback:

function processItem(array $item)
{
    is_array($item); //true
    print_r($item);
}

$parser = new \JsonCollectionParser\Parser();
$parser->parse('/path/to/file.json', 'processItem');

Closure as callback:

$items = [];

$parser = new \JsonCollectionParser\Parser();
$parser->parse('/path/to/file.json', function (array $item) use (&$items) {
    $items[] = $item;
});

Static method as callback:

class ItemProcessor {
    public static function process(array $item)
    {
        is_array($item); //true
        print_r($item);
    }
}

$parser = new \JsonCollectionParser\Parser();
$parser->parse('/path/to/file.json', ['ItemProcessor', 'process']);

Instance method as callback:

class ItemProcessor {
    public function process(array $item)
    {
        is_array($item); //true
        print_r($item);
    }
}

$parser = new \JsonCollectionParser\Parser();
$processor = new \ItemProcessor();
$parser->parse('/path/to/file.json', [$processor, 'process']);
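
An instance method is handy when the callback needs to keep state between items. The sketch below is one way to do that; `ListingSummary` and `summarizeFile` are hypothetical names, the item keys come from the sample data above, and it assumes the package is loaded through Composer's autoloader:

```php
<?php

/** Hypothetical accumulator: counts items per "dealType" and sums bedrooms. */
class ListingSummary
{
    /** @var array<string, int> */
    public $countsByDealType = [];

    /** @var int */
    public $bedrooms = 0;

    public function add(array $item): void
    {
        // Not every item carries every key, so fall back to defaults.
        $dealType = $item['dealType'] ?? 'unknown';
        $this->countsByDealType[$dealType] = ($this->countsByDealType[$dealType] ?? 0) + 1;
        $this->bedrooms += $item['properties']['bedroomsCount'] ?? 0;
    }
}

/** Hypothetical wrapper: feeds every item of the file into one summary. */
function summarizeFile(string $path): ListingSummary
{
    $summary = new ListingSummary();

    $parser = new \JsonCollectionParser\Parser();
    $parser->parse($path, [$summary, 'add']);

    return $summary;
}
```

Only the running totals stay in memory, so this keeps the low memory footprint that collecting every item into an array would give up.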

Receive items as objects:

function processItem(\stdClass $item)
{
    is_array($item); //false
    is_object($item); //true
    print_r($item);
}

$parser = new \JsonCollectionParser\Parser();
$parser->parseAsObjects('/path/to/file.json', 'processItem');

Receive chunks of items as arrays:

function processChunk(array $chunk)
{
    is_array($chunk);    //true
    count($chunk) === 5; //true

    foreach ($chunk as $item) {
        is_array($item);  //true
        is_object($item); //false
        print_r($item);
    }
}

$parser = new \JsonCollectionParser\Parser();
$parser->chunk('/path/to/file.json', 'processChunk', 5);

Receive chunks of items as objects:

function processChunk(array $chunk)
{
    is_array($chunk);    //true
    count($chunk) === 5; //true

    foreach ($chunk as $item) {
        is_array($item);  //false
        is_object($item); //true
        print_r($item);
    }
}

$parser = new \JsonCollectionParser\Parser();
$parser->chunkAsObjects('/path/to/file.json', 'processChunk', 5);

Pass stream as parser input:

$stream = fopen('/path/to/file.json', 'r');

$parser = new \JsonCollectionParser\Parser();
$parser->parseAsObjects($stream, 'processItem');

Pass PSR-7 MessageInterface as parser input:

use Psr\Http\Message\MessageInterface;

/** @var MessageInterface $resource */
$resource = $httpClient->get('https://httpbin.org/get');

$parser = new \JsonCollectionParser\Parser();
$parser->parseAsObjects($resource, 'processItem');

Pass PSR-7 StreamInterface as parser input:

use Psr\Http\Message\MessageInterface;

/** @var MessageInterface $resource */
$resource = $httpClient->get('https://httpbin.org/get');

$parser = new \JsonCollectionParser\Parser();
$parser->parseAsObjects($resource->getBody(), 'processItem');

Supported formats

  • .json - raw JSON
  • .gz - GZIP-compressed JSON (requires the zlib PHP extension)
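
For trying out the compressed input, a small fixture can be written with PHP's zlib functions. This is a sketch: `writeGzipJson` and `countGzipItems` are hypothetical helpers, and it assumes the parser selects GZIP handling from the .gz file extension, as the list above suggests.

```php
<?php

/** Hypothetical helper: stores $items as a GZIP-compressed JSON array. */
function writeGzipJson(string $path, array $items): void
{
    $gz = gzopen($path, 'wb9'); // requires the zlib extension
    if ($gz === false) {
        throw new RuntimeException("Cannot open {$path} for writing");
    }
    // array_values() guarantees a JSON array even if the keys are not 0..n.
    gzwrite($gz, json_encode(array_values($items), JSON_THROW_ON_ERROR));
    gzclose($gz);
}

/** Hypothetical helper: counts the items in a .gz collection. */
function countGzipItems(string $path): int
{
    $count = 0;

    $parser = new \JsonCollectionParser\Parser();
    $parser->parse($path, function (array $item) use (&$count) {
        $count++;
    });

    return $count;
}
```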

Supported sources

  • file
  • string
  • stream / resource
  • HTTP message interface PSR-7
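
When the JSON is already in memory, one option that relies only on the stream support listed above is to wrap it in a temporary stream. A sketch; `jsonStringToStream` and `parseJsonString` are hypothetical helpers, not part of this package:

```php
<?php

/**
 * Hypothetical helper: copies a JSON string into a rewound php://temp stream.
 *
 * @return resource
 */
function jsonStringToStream(string $json)
{
    // php://temp keeps small payloads in memory and spills to disk past 2 MB.
    $stream = fopen('php://temp', 'r+');
    fwrite($stream, $json);
    rewind($stream);

    return $stream;
}

/** Hypothetical wrapper: parses an in-memory JSON string item by item. */
function parseJsonString(string $json, callable $callback): void
{
    $stream = jsonStringToStream($json);
    try {
        $parser = new \JsonCollectionParser\Parser();
        $parser->parse($stream, $callback);
    } finally {
        // Whether the parser closes the stream itself is not assumed here.
        if (is_resource($stream)) {
            fclose($stream);
        }
    }
}
```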

Running tests

composer test

License

This library is released under the MIT license.