JSON output support with --json/-J option #160

bonkf · 2019-11-22T14:40:48Z

This PR revamps #141 and fixes all the complaints. It adds support for outputting pretty-printed JSON with variable indentation (--json=n), outputting compact non-pretty-printed JSON (--json=0) and outputting the --vcount and --outage options' results in JSON format.

Unfortunately choosing a custom indentation or disabling pretty-printing with -J 4 or -J 0 does not work, because the parser can't tell the difference between an option's optarg and a non-option argument (e.g. an IP address).

Seems like the parser accepts -J4 or -J0.

coveralls · 2019-11-22T14:43:12Z

Coverage increased (+0.6%) to 80.131% when pulling 3923fa5 on Reperator:json-output into 12961d5 on schweikert:develop.

schweikert · 2019-12-06T21:52:02Z

Thanks for the patch! I will need some time to review it / integrate it, but I just wanted to already say that it's much appreciated :-)

schweikert · 2019-12-31T07:28:50Z

Considering that the JSON output is going to be an interface and we won‘t be able to change it later, we should make sure that the schema is how we want to keep it long term. Seeing the current output:

fping --json -n -A -c3 -p100 dns.google.com                                  
{
  "hosts": {
    "dns.google.com (2001:4860:4860::8844)": {
      "xmt": 3,
      "rcv": 3,
      "loss_percentage": 0,
      "min": 3.32,
      "avg": 3.32,
      "max": 3.33
    }
  }
}

I see a couple of issues:

The schema changes depending on the options that you use. If for example -C3 is used instead of -c3, then the values are the rtt measurements as array instead of the keys „xmt“, etc. We probably should use a consistent schema independently of what options are used, and just fill what is relevant.
Using -n and -A combined forces the user to parse the output still (to get the name and address). They probably should be keys instead.
„loss_percentage“ is quite long compared to the other keys. I also wonder whether this is needed at all (it is reduntant)
Maybe the key "hosts" could be renamed "summary" so that in the future also other output could be intergrated (the output that comes without -q, which --json currently implies)

How the output could look like:

{
  "summary": {
    "dns.google.com“: {
      "addr": "2001:4860:4860::8844",
      "xmt": 3,
      "rcv": 3,
      "min": 3.32,
      "avg": 3.32,
      "max": 3.33,
      "rtt": [ 3.32, 3.32, 3.33 ]
    }
  }
}

Let me know what you think. I can also work on making these changes.

bonkf · 2020-01-01T15:16:13Z

The schema changes depending on the options that you use. If for example -C3 is used instead of -c3, then the values are the rtt measurements as array instead of the keys „xmt“, etc. We probably should use a consistent schema independently of what options are used, and just fill what is relevant.

Good point. I think it best if we were to always include every possible field and null out everything that wouldn't be included in non-JSON output, e.g. something like

{
  "summary": {
    "dns.google.com": {
      "addr": "2001:4860:4860::8844",
      "host": "dns.google.com",
      "xmt": 3,
      "rcv": 3,
      "outage": null,
      "min": 3.32,
      "avg": 3.32,
      "max": 3.33,
      "rtt": null
    }
  }
}

or

{
  "summary": {
    "2001:4860:4860::8844": {
      "addr": "2001:4860:4860::8844",
      "host": null,
      "xmt": null,
      "rcv": null,
      "outage": null,
      "min": null,
      "avg": null,
      "max": null,
      "rtt": [
        3.32,
        3.32,
        3.33
      ]
    }
  }
}

OTOH this is kind of wasteful since we can easily do the summary calculations even if we print all rtts anyway. Opinions?

Using -n and -A combined forces the user to parse the output still (to get the name and address). They probably should be keys instead.

I'll add an addr key and a host key that include address and hostname respectively. I'll only include the hostname if the user provided it or if we've done rDNS resolution due to -n.

„loss_percentage“ is quite long compared to the other keys. I also wonder whether this is needed at all (it is reduntant)

Technically most keys are redundant as long as we include the rtts. I'll throw out return_percentage and loss_percentage and keep the others.

Maybe the key "hosts" could be renamed "summary" so that in the future also other output could be intergrated (the output that comes without -q, which --json currently implies)

Will do.

Let me know what you think. I can also work on making these changes.

No worries, I'll take care of it.

Edit: Do you want me to squash my commits?

Edit 2: Another thing I noticed is the issue of duplicate keys. Suppose a user runs fping -A -n -J 127.0.0.1 localhost or even simpler fping -A -n -J localhost localhost. Both summary entries would have the key localhost. JSON allows this, but I can imagine many parsers failing on this, especially if the JSON is supposed to be parsed as a programming language structure such as a hash table or an object. The problem is that identical keys may only become apparent after we have already done rDNS resolution. For now I'll just output duplicate keys; if this becomes a problem we may have to fail if we were about to output duplicate keys, or assign duplicate keys identifiers such as localhost (1), or instead of summary being an object we could make it an array. This would alleviate these problems but necessitates iterating over the array to find a specific entry (and we would still have entries that have identical addresses and hostnames).

jasoncdavis · 2020-01-03T15:37:15Z

Reperator, thanks for picking this back up! I'm encouraged to see others found value in the original suggestion. I've been using fping 4.2 with my original patch suggestions, but I look forward to your fix-ups and enhancements!

mmihalev · 2020-08-11T14:46:48Z

Any news about this? It will be really awesome to have json output!

bkuker · 2020-08-11T14:55:50Z

My two cents is that the schema needs work. The point about the schema changing is good, but in my opinion the "null" suggestion is not quite right. Json consuming code that has to distinguish between a field being null and a field not being present is almost always going to have bugs.

" We probably should use a consistent schema independently of what options are used, and just fill what is relevant." is a good approach, and allows for additions to the schema in the future while preserving backwards compatibility with any JSON consumer that is not going to be confused by new keys (Another good practice IMHO)

bonkf · 2020-08-12T06:33:58Z

I'm sorry, I completely forgot about this. I'll start working on it today; I hope I can get it finished soon-ish.

Json consuming code that has to distinguish between a field being null and a field not being present is almost always going to have bugs.

Agreed, I'll keep the schema consistent across options.

mmihalev · 2020-08-12T06:53:00Z

What about using numeric keys instead of IP address as a key? This should eliminate the issue with repeated IPs?

bonkf · 2020-08-12T07:11:49Z

In that case we can just use JSON arrays. My main gripe with that is that the API for JSON objects is much nicer than the API for JSON arrays in many languages (hashtables vs. lists).

mmihalev · 2020-08-12T07:28:12Z

In that case we can just use JSON arrays. My main gripe with that is that the API for JSON objects is much nicer than the API for JSON arrays in many languages (hashtables vs. lists).

I think this should not be an issue. Every JSON consumer can work with arrays with ease. And all the API's that I know about are using json arrays in their json responses.
P.S.: Maybe RTT's responses should also be array? If number of pings (-c) is greater than 1 we can include all the RTTs inside an array?

bkuker · 2020-08-12T11:35:05Z

Not a big deal, I am already running fping and parsing the string results into json and passing them further along to other software... This would apply to yaml or xml too, I do the conversion on more of a line by line basis and emit a stream of json documents.
In my case those documents have three different schemas, "progress" generated once per ping per IP, "result" generate once per IP, and "verbose result" which extends the result schema and has the verbose results, big surprise.

If this were of interest, https://en.wikipedia.org/wiki/JSON_streaming has some suggestions on outputting multiple json objects over a stream.

I can share an example as a different straw man to beat on.

//Ignore insonsistent numbers, I edited by hand for readability.
//Progress Results
{"ip": "8.8.8.8", "seq": 0, "size": 96, "rtt": 20.47, "rttAvg": 20.47, "loss": 0}
{"ip": "8.8.4.4", "seq": 0, "size": 96, "rtt": 23.73, "rttAvg": 23.73, "loss": 0}
{"ip": "8.8.8.8", "seq": 1, "size": 96, "rtt": 21.27, "rttAvg": 21.00, "loss": 0}
{"ip": "8.8.4.4", "seq": 1, "size": 96, "rtt": 24,00, "rttAvg": 23.53, "loss": 0}
{"ip": "8.8.4.4", "seq": 2, "size": 96, "rtt": 21.27, "rttAvg": 22.47, "loss": 0}
{"ip": "8.8.8.8", "seq": 2, "size": 96, "rtt": 22.73, "rttAvg": 21.27, "loss": 0}
{"ip": "8.8.8.8", "seq": 3, "size": 96, "rtt": 22.50, "rttAvg": 21.47, "loss": 0}
//lost #3 for 8.8.4.4
{"ip": "8.8.8.8", "seq": 4, "size": 96, "rtt": 22.73, "rttAvg": 22.00, "loss": 0}
{"ip": "8.8.4.4", "seq": 4, "size": 96, "rtt": 25.53, "rttAvg": 24.27, "loss": 0.2}

//Result (or verbose result. not both)
{"ip": "8.8.8.8", "sent": 5, "received": 5, "loss": 0, "rtt": {"min": 21.29, "avg": 22.29, "max": 23.89}}

//VerboseResult (note null in rtts array for lost packet)
{"ip": "8.8.4.4", "sent": 5, "received": 4, "loss": 0.20, "rtt": {"min": 21.45, "avg": 24.34, "max": 26.36}, "jitter": {"min": 0.266, "avg": 2.95, "max": 5.91}, "rtts": [23.43,null,21.17,26.76,25.26]}
{"ip": "8.8.8.8", "sent": 5, "received": 5, "loss": 0, "rtt": {"min": 20.62, "avg": 22.16, "max": 22.53}, "jitter": {"min": 0.1875, "avg": 0.949, "max": 1.719}, "rtts": [20.86,21.18,22.95,22.56,22.49]}

mmihalev · 2020-08-12T12:43:31Z

Not a big deal, I am already running fping and parsing the string results into json and passing them further along to other software... This would apply to yaml or xml too, I do the conversion on more of a line by line basis and emit a stream of json documents.
In my case those documents have three different schemas, "progress" generated once per ping per IP, "result" generate once per IP, and "verbose result" which extends the result schema and has the verbose results, big surprise.

If this were of interest, https://en.wikipedia.org/wiki/JSON_streaming has some suggestions on outputting multiple json objects over a stream.

I can share an example as a different straw man to beat on.
//Ignore insonsistent numbers, I edited by hand for readability.
//Progress Results
{"ip": "8.8.8.8", "seq": 0, "size": 96, "rtt": 20.47, "rttAvg": 20.47, "loss": 0}
{"ip": "8.8.4.4", "seq": 0, "size": 96, "rtt": 23.73, "rttAvg": 23.73, "loss": 0}
{"ip": "8.8.8.8", "seq": 1, "size": 96, "rtt": 21.27, "rttAvg": 21.00, "loss": 0}
{"ip": "8.8.4.4", "seq": 1, "size": 96, "rtt": 24,00, "rttAvg": 23.53, "loss": 0}
{"ip": "8.8.4.4", "seq": 2, "size": 96, "rtt": 21.27, "rttAvg": 22.47, "loss": 0}
{"ip": "8.8.8.8", "seq": 2, "size": 96, "rtt": 22.73, "rttAvg": 21.27, "loss": 0}
{"ip": "8.8.8.8", "seq": 3, "size": 96, "rtt": 22.50, "rttAvg": 21.47, "loss": 0}
//lost #3 for 8.8.4.4
{"ip": "8.8.8.8", "seq": 4, "size": 96, "rtt": 22.73, "rttAvg": 22.00, "loss": 0}
{"ip": "8.8.4.4", "seq": 4, "size": 96, "rtt": 25.53, "rttAvg": 24.27, "loss": 0.2}

//Result (or verbose result. not both)
{"ip": "8.8.8.8", "sent": 5, "received": 5, "loss": 0, "rtt": {"min": 21.29, "avg": 22.29, "max": 23.89}}

//VerboseResult (note null in rtts array for lost packet)
{"ip": "8.8.4.4", "sent": 5, "received": 4, "loss": 0.20, "rtt": {"min": 21.45, "avg": 24.34, "max": 26.36}, "jitter": {"min": 0.266, "avg": 2.95, "max": 5.91}, "rtts": [23.43,null,21.17,26.76,25.26]}
{"ip": "8.8.8.8", "sent": 5, "received": 5, "loss": 0, "rtt": {"min": 20.62, "avg": 22.16, "max": 22.53}, "jitter": {"min": 0.1875, "avg": 0.949, "max": 1.719}, "rtts": [20.86,21.18,22.95,22.56,22.49]}

I think showing the "progress" per IP per ping is a must! Either by adding this into the final response or outputting it while running. Right now the proposed implementation will imply the -q flag which is going to hide per-target/per-ping results

bkuker · 2020-08-12T13:02:40Z

IMHO though I could be over-complicating, flags would allow you to choose between multiple json document output like I suggested, or a single json output with all the results, possibly including progress.
This could be as simple as either one json object per line, or starting with a [ adding a , after each json object per line, and ending with ]:

[
{"ip": "8.8.8.8", "seq": 0, "size": 96, "rtt": 20.47, "rttAvg": 20.47, "loss": 0}
,
{"ip": "8.8.4.4", "seq": 0, "size": 96, "rtt": 23.73, "rttAvg": 23.73, "loss": 0}
,
{"ip": "8.8.8.8", "seq": 1, "size": 96, "rtt": 21.27, "rttAvg": 21.00, "loss": 0}
,
{"ip": "8.8.4.4", "seq": 1, "size": 96, "rtt": 24,00, "rttAvg": 23.53, "loss": 0}
,
..... output omitted
]

But it could be more complex:

{
  progress: [ ...array of progress... ],
  results: { ...hash of results... },
  options: { .. I love when programs output the options they were run with in structured output like this... }
}

fping --jsonStream | someCommandThatProcessesOneLineAtATimeWhereEachLineIsAValidJsonObject

fping --jsonArray > myValidJsonFile.json

bonkf · 2020-08-24T13:07:59Z

I'm a bit busy at work at the moment so it might take some time until I can get back to this PR, sorry. :(

dimir · 2020-09-25T11:15:18Z

Thank you, @Reperator, this is really useful feature for us, talking from Zabbix perspective! I'd agree with @bkuker that it makes more sense to not list tags that have no values.

OK here are my 2 cents. I have grouped the output in targets, within those there is data for each packet and summary. Also, I have added some ideas about error handling:

{
    "error": "in case of error",
    "targets": {
        "dns.google.com": {
            "addr": "8.8.8.8",
            "packets": [
                {
                    "bytes": 64,
                    "rtt": 3.29,
                    "avg": 3.35,
                    "loss": 11,
                    "source": "4.4.4.4"
                },
                {
                    "error": "timed out"
                }
            ],
            "summary": {
                "min": 21.29,
                "avg": 22.29,
                "max": 23.89
            }
        },
        "7.7.7.7": {
            "addr": "7.7.7.7",
            "packets": [
                {
                    "error": "timed out"
                },
                {
                    "error": "timed out"
                }
            ],
            "summary": {}
        }
    }
}

stefan-ahrefs · 2022-03-04T09:20:56Z

anyone still working on this or it was forgotten?

bonkf · 2022-03-04T10:26:45Z

I'm not working on it anymore, I left my old company a year ago and don't have the time to look into this again.

Fabian Bonk added 2 commits November 22, 2019 15:21

add JSON output support

b871e4d

add JSON output test

3923fa5

schweikert mentioned this pull request Jul 12, 2020

Add JSON output with new --json/-J option #141

Closed

schweikert force-pushed the develop branch from 84cfc08 to 6600b04 Compare July 26, 2020 06:02

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

JSON output support with --json/-J option #160

JSON output support with --json/-J option #160

bonkf commented Nov 22, 2019 •

edited

coveralls commented Nov 22, 2019

schweikert commented Dec 6, 2019

schweikert commented Dec 31, 2019

bonkf commented Jan 1, 2020 •

edited

jasoncdavis commented Jan 3, 2020

mmihalev commented Aug 11, 2020

bkuker commented Aug 11, 2020

bonkf commented Aug 12, 2020

mmihalev commented Aug 12, 2020

bonkf commented Aug 12, 2020

mmihalev commented Aug 12, 2020

bkuker commented Aug 12, 2020 •

edited

mmihalev commented Aug 12, 2020 •

edited

bkuker commented Aug 12, 2020

bonkf commented Aug 24, 2020

dimir commented Sep 25, 2020 •

edited

stefan-ahrefs commented Mar 4, 2022

bonkf commented Mar 4, 2022

JSON output support with --json/-J option #160

Are you sure you want to change the base?

JSON output support with --json/-J option #160

Conversation

bonkf commented Nov 22, 2019 • edited

coveralls commented Nov 22, 2019

schweikert commented Dec 6, 2019

schweikert commented Dec 31, 2019

bonkf commented Jan 1, 2020 • edited

jasoncdavis commented Jan 3, 2020

mmihalev commented Aug 11, 2020

bkuker commented Aug 11, 2020

bonkf commented Aug 12, 2020

mmihalev commented Aug 12, 2020

bonkf commented Aug 12, 2020

mmihalev commented Aug 12, 2020

bkuker commented Aug 12, 2020 • edited

mmihalev commented Aug 12, 2020 • edited

bkuker commented Aug 12, 2020

bonkf commented Aug 24, 2020

dimir commented Sep 25, 2020 • edited

stefan-ahrefs commented Mar 4, 2022

bonkf commented Mar 4, 2022

bonkf commented Nov 22, 2019 •

edited

bonkf commented Jan 1, 2020 •

edited

bkuker commented Aug 12, 2020 •

edited

mmihalev commented Aug 12, 2020 •

edited

dimir commented Sep 25, 2020 •

edited