Large files break serial-json output #790

Open
kwalcock opened this issue Mar 11, 2023 · 0 comments

Comments

@kwalcock
Member

Large input files seem to fail in SerialJsonOutput.scala at

MentionsOps(mentions).json(pretty = true)

in which the entire output is converted into a single string. That string may be over 2 GB in length. The exceptions thrown complain about negative numbers, which are probably overflowing integers.

If that step succeeds, the failure occurs in the subsequent f.writeString call, which cannot encode the large string in order to write it to the file. The error there is "java.lang.OutOfMemoryError: Requested array size exceeds VM limit".

There may be a way to write the formatted output directly to a file, piece by piece, without building the intermediate string. That should fix the problem. The input file is only 560 KB, and there are some as large as 3.5 MB that need to be processed.
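
A minimal sketch of the piecewise idea, using only the Java standard library. It assumes the output can be treated as a top-level JSON array whose elements are rendered one at a time; the real mention JSON may be nested differently. The names `StreamingJsonSketch`, `writeJsonArray`, `writeJsonArrayToFile`, and `renderOne` are hypothetical and are not part of the existing code.

```scala
import java.io.Writer
import java.nio.charset.StandardCharsets
import java.nio.file.{Files, Path}

// Hypothetical helper, not part of the project.
object StreamingJsonSketch {

  // Writes a JSON array element by element, so that only one rendered
  // element is held in memory at a time instead of the whole document.
  // renderOne is assumed to return valid JSON text for a single element.
  def writeJsonArray[A](items: Iterator[A], out: Writer)(renderOne: A => String): Unit = {
    out.write("[")
    var first = true
    for (item <- items) {
      if (!first) out.write(",")
      out.write("\n  ")
      out.write(renderOne(item))
      first = false
    }
    // An empty iterator yields "[]"; otherwise the bracket closes on its own line.
    out.write(if (first) "]" else "\n]")
  }

  // Streams the array through a buffered UTF-8 writer straight to the file.
  def writeJsonArrayToFile[A](items: Iterator[A], path: Path)(renderOne: A => String): Unit = {
    val writer = Files.newBufferedWriter(path, StandardCharsets.UTF_8)
    try writeJsonArray(items, writer)(renderOne)
    finally writer.close()
  }
}
```

With this shape, the largest string ever allocated is a single rendered element, so the size limit on one string would apply per element rather than to the whole output. Whether the existing serializer can render mentions independently like this is an open question.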
