
JSON is too deeply nested (SystemStackError) #27

Open
majormoses opened this issue Feb 7, 2017 · 27 comments

@majormoses

I am having an issue that I can only reproduce on my Jenkins instance, so I assume it's something related to the system, the Ruby version, and the installed gems. Here is the relevant output from my Jenkins job, with some extra information (Ruby version, gem list) added for debugging purposes.

https://gist.github.com/majormoses/721d610a1a11c0ffde9e1a1aa594cba1

Here is the simple JSON that was used for the test:
https://gist.github.com/majormoses/e3245bf85ec2b1d1248ea2159f6b11f0

Any useful insight as to where to look would be greatly appreciated.

@shortdudey123
Contributor

This might be a bug on the Oj side. Two similar issues have been reported over the years:
ohler55/oj#50
ohler55/oj#69

@majormoses
Author

@shortdudey123 see ohler55/oj#333 (comment). Where would you like to start?

@ohler55

ohler55 commented Feb 10, 2017

The referenced Oj issues have nothing to do with this. They are old and not even for the same code that the SAJ parser uses. Completely a red herring.

Can someone provide me with the handler used for the SAJ parser and the JSON being parsed? Is it the same short one in the gist? https://gist.github.com/majormoses/e3245bf85ec2b1d1248ea2159f6b11f0

@ohler55

ohler55 commented Feb 10, 2017

Looks like you were faster on the comment. :-)

@ohler55

ohler55 commented Feb 10, 2017

Here is a simple script I tried. It ran with no problems locally. Is it representative of the use in this issue?

require 'stringio'
require 'oj'
require 'jsonlint'

json = %|{
  "id": "babrams",
  "ssh_keys": [
    "ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAACAQDMCKMvMJ2yb13pHLsIPqi2xHOBKFZKa8+FM1FUqNbIxKeq3LLw+4WiXLK60DxxS6hmOnXD+FcWNaykkGLyGQeYHxsHynsXo9BPaG/ewaAp5SDU/zAIAaex15s/zvo5l+5Pq9OwXYtFRmfezk3ImCx7SZ8sMmHiFHYD8d38XBlGX53kLSFm5HLEopEvSCRTUyTj+tPIspgYR6IvCTdXnamO9FT8Rkeqw+mqjX9sVTaLuuqwQZlRFRMslrrJbSfv+7XvyKsjOsmAlkEYRlpHbUCxUh2Hc5q2Wfm+acOHPkkUPX8kLeT2vW+Bd/9LlPi9BN0dbmazGPbf5kv02MRNQNeUrdRfdzRIOG4tUEv154msF7QdEuy9W4pv9p0z2rNOqOQEw9HPhMiAkftIVGnvvGRj9+jIARIVzV5gAfVm2DQbPJClr0tGNCfzHmndt6FddawubXFPvFNrKgdC38Ts0Jzl1F3aWGHT8UyURDbezrTGpxg+Cqq4YUXIZfrrqB8nzF8qK3eMW2Tcxdy2m+fFBzQeHlozBSP55dcdjekdQrcVcwkYux4jecJ9BU++DjWtMtY93LgVL5BnHixS4ybo7loCndYkpsI6ZZm9oLVxHsjeoaM9D9iYoN28LIlALBm/dnfCh92G/H40v/X25DMIvRqcfnE31gsOCJ85A29twSC+Cw== babrams@system76.servalws",
    "ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAACAQCy3cbPekJYHAIa8J1fOr2iIqpx/7pl4giJYAG7HCfsunRRUq3dY1KhVw1BlmMGIDzNwcuNVIfBS5HS/wREqbHMXxbwAjWrMwUofWd09CTuKJZiyTLUC5pSQWKDXZrefH/Fwd7s+YKk1s78b49zkyDcHSnKxjN+5veinzeU+vaUF9duFAJ9OsL7kTDEzOUU0zJdSdUV0hH1lnljnvk8kXHLFl9sKS3iM2LRqW4B6wOc2RbXUnx+jwNaBsq1zd73F2q3Ta7GXdtW/q4oDYl3s72oW4ySL6TZfpLCiv/7txHicZiY1eqc591CON0k/Rh7eR7XsphwkUstoUPQcBuLqQPA529zBigD7A8PBmeHISxL2qirWjR2+PrEGn1b0yu8IHHz9ZgliX83Q4WpjXvJ3REj2jfM8hiFRV3lA/ovjQrmLLV8WUAZ8updcLE5mbhZzIsC4U/HKIJS02zoggHGHZauClwwcdBtIJnJqtP803yKNPO2sDudTpvEi8GZ8n6jSXo/N8nBVId2LZa5YY/g/v5kH0akn+/E3jXhw4CICNW8yICpeJO8dGYMOp3Bs9/cRK8QYomXqgpoFlvkgzT2h4Ie6lyRgNv5QnUyAnW43O5FdBnPk/XZ3LA462VU3uOfr0AQtEJzPccpFC6OCFYWdGwZQA/r1EZQES0yRfJLpx+uZQ== babrams@babrams-Serval-WS"
  ],
  "htpasswd": "$1$i2xUX9a4$6LwYbCk4K6JErTDdaiZy50",
  "groups": [
    "devops"
  ],
  "shell": "\/bin\/bash",
  "comment": "Ben Abrams"
}|

linter = JsonLint::Linter.new
linter.check_stream(StringIO.new(json))

puts "errors: #{linter.errors}"

@dougbarth
Owner

@majormoses Sorry to hear you're running into issues.

I tried reproducing this issue under OS X 10.12 with the same Ruby version, bundler version, and set of gems you're using in your Gemfile, but the command completes successfully.

I also tried reproducing in a trusty64 virtual machine, but that also completed successfully.

It seems like some detail is missing here. I put a script in a gist that might help collect some forensics. Mind running that on your Jenkins instance and sending the output over?

https://gist.github.com/dougbarth/84e6ccc3825d92aaad61fc2ab4e7fd59

Reproducing this in a clean environment (preferably in a VM) might be possible with guidance from that output.

@ohler55

ohler55 commented Feb 20, 2017

What is the status on this? Anyone?

@majormoses
Author

Waiting for further troubleshooting advice...

@majormoses
Author

@dougbarth ping: any thoughts?

@majormoses
Author

@dougbarth @ohler55 I tried messing with this again on our Jenkins instance, and after playing around I found something curious: it is only reproducible when run through make (though sadly I cannot replicate that locally).
Here is a gist that might provide some more insight: https://gist.github.com/majormoses/a83a68426af45931adc0f5e0d466c305

@majormoses
Author

So I think I found it! It actually was the stack size: after bumping it, the lint worked. I updated the comments on the gist to reflect the process. My guess is that the reason I did not see this locally is that I have a newer version of make that is likely more efficient?

@ohler55

ohler55 commented Apr 14, 2017

Must be a very small stack size. The JSON is not very deep at all. Maybe the Jenkins VM or machine is small.

@majormoses
Author

It was 8192 and I bumped it to 16384.

@majormoses
Author

I still think there is a bug, but it's likely within that version of make itself.

@ohler55

ohler55 commented Apr 14, 2017

Is that in bytes? If so, that is more like an embedded processor setting.

@majormoses
Author

Yes, that's in bytes.

@ohler55

ohler55 commented Apr 14, 2017

Wow, that is extreme. Glad you found the issue.

@majormoses
Author

Ya, I don't like this "fix" at all; very rarely should an application need that level of recursion. But hopefully we will be able to get onto a newer version of Ubuntu soon, and then we can see whether this is replicable with a newer version of make. Do you think we should open new issues to do some profiling and see if we can turn anything up?

@ohler55

ohler55 commented Apr 14, 2017

With the JSON you provided the stack depth is not deep at all, but on a 64-bit machine 8192 bytes does not give you very many pointers and variables. A single stack allocation of 4096 bytes for a buffer uses up half the stack, and 4096 bytes is just one page. Usually the stack size is measured in MB.

@majormoses
Author

@ohler55 sorry, I misspoke; that is in KB.

@majormoses
Author

$ ulimit -a | grep stack
stack size              (kbytes, -s) 8192
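For completeness, the limit the Ruby process itself inherits can also be checked in-process. This is just a sketch; Process.getrlimit reports values in bytes, so the 8192 KB above should show up as 8388608.

# Print the stack limits as seen by the running Ruby process (in bytes).
soft, hard = Process.getrlimit(:STACK)
hard_desc = hard == Process::RLIM_INFINITY ? 'unlimited' : "#{hard} bytes"
puts "stack soft limit: #{soft} bytes, hard limit: #{hard_desc}"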

@dougbarth
Owner

Glad to hear you found a lead on the issue. I tried recreating the problem myself in a Docker container: https://gist.github.com/dougbarth/160de6ac13120103bfb1bd505901f6e1

Note: I'm not using the exact same Ruby version, but it is Ruby 2.2 and uses the same set of gems.

At dramatically smaller stack sizes than you're using (failures start around 38KB), the program eventually fails with a SystemStackError, but I can't get it to fail at the call to Oj.saj_parse.

8192 KB seems to be the default stack size limit, so I'm not sure why you're running into this issue at that size.

It seems like this issue must be specific to something on your Jenkins server.
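If anyone wants to approximate that experiment without the Docker container, one option is to lower the soft stack limit and re-exec the lint run under it. This is only a sketch: it assumes the gem's jsonlint executable, uses a hypothetical user.json, and assumes the lowered limit is honored across exec (setrlimit takes sizes in bytes).

# Shrink the soft stack limit, then re-run the linter under it.
_, hard = Process.getrlimit(:STACK)
Process.setrlimit(:STACK, 64 * 1024, hard)  # 64 KB soft limit, far below the usual 8 MB
exec('bundle', 'exec', 'jsonlint', 'user.json')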

@majormoses
Author

majormoses commented Apr 14, 2017

Ya, it feels very wrong to me to increase it (at least in this case), but I cannot find anything on the Jenkins node pointing to anything enlightening. I will try to spend some time next week, if I can spare it, digging deeper.

@dougbarth could you verify what version of make you tested with? So far the only thing that makes even a shred of sense to me is that there is a bug in the version of make on our Jenkins node that affects this in a very odd way.

@dougbarth
Owner

Looks to be the same version as your Jenkins server.

root@506320a3eb4e:/jsonlint# make --version
GNU Make 3.81
Copyright (C) 2006  Free Software Foundation, Inc.
This is free software; see the source for copying conditions.
There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A
PARTICULAR PURPOSE.

This program built for x86_64-pc-linux-gnu

@majormoses
Author

Hmm, that kills my theory...

@majormoses
Author

So I think I figured out what was going on, and I don't think it was caused directly by this gem. It has to do with how our Makefile was structured. I was able to reproduce it on Travis as well, so it was no longer just a wonky Jenkins in question. Basically, every time you call a target from another target it spawns another make process. It looked something like this:

rubocop:
	# do my rubocop things
foodcritic:
	# do my foodcritic things
chefjsonlint:
	# do my json linting of chef objects
ci: rubocop foodcritic chefjsonlint

My guess is that by the time it gets through spawning all of those make processes, the stack space left is already too close to normal/sane limits before the lint even runs. When I have some more time later I will try to profile it to better understand how much is being used by what.
