uwsgi kills connection when the geojson is huge #224

Open
PabloCastellano opened this issue May 14, 2015 · 6 comments

Comments

@PabloCastellano (Member)

I'm trying to import all the Guifi.net nodes (about 28k) into nodeshot.

Once the synchronizer has finished, none of the nodes are shown on the map.
Investigating a bit, I discovered that uwsgi sees that the request for the geojson is taking too long and kills it after 20 seconds.

You can see it failing here:
https://46.101.190.150/api/v1/layers/guifinet-world/nodes.geojson

Any possible workarounds?

  • Increase the timeout (not really a solution)
  • Don't generate the geojson on demand; serve a cached version instead (see the sketch after this list)
  • Split the layer into smaller ones (in this case links between different layers may not show up)
  • Limit the number of results: URL?limit=1000
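
A rough sketch of the caching approach, assuming a Django per-view cache; the view and URL names below are illustrative, not nodeshot's actual code:

    # urls.py -- sketch only; NodeGeoJSONList stands in for whatever view
    # currently builds the layer geojson on demand (the slow part).
    from django.conf.urls import url
    from django.views.decorators.cache import cache_page

    from .views import NodeGeoJSONList  # hypothetical view name

    urlpatterns = [
        url(
            r'^api/v1/layers/(?P<slug>[-\w]+)/nodes\.geojson$',
            cache_page(60 * 15)(NodeGeoJSONList.as_view()),  # cache for 15 minutes
        ),
    ]

The first request after the cache expires would still be slow, so the cache would probably need to be warmed (or invalidated and refilled) by the synchronizer itself.
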
@PabloCastellano (Member, Author)

FWIW, here are some tests I did on a Digital Ocean instance (1 GB RAM, 1 CPU) by running:

$ time wget --no-check-certificate https://46.101.190.150/api/v1/layers/guifinet-world/nodes.geojson?limit=X

Results per number of nodes requested (the limit=X value):

  • 1000: 0m2.272s
  • 5000: 0m9.881s
  • 8000: 0m14.627s
  • 10000: 0m16.530s
  • 11500: 0m19.790s
  • 12500: 0m21.444s (ERROR 502: Bad Gateway.)
  • 15000: 0m21.821s (ERROR 502: Bad Gateway.)

So with these specs and the default timeout, the limit is about 11,500 nodes per layer.

@nemesifier (Member)

great work @PabloCastellano!

@G10h4ck (Member) commented May 15, 2015

Caching seems like a good idea, but I believe it is just part of the solution ;)

@nemesifier (Member)

gonna try this directly on the server and see what happens:
https://github.com/gizmag/drf-ujson-renderer
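
Assuming the package exposes the renderer as drf_ujson.renderers.UJSONRenderer (as its README suggests; not verified against nodeshot's settings), enabling it should just be a matter of pointing Django REST Framework at it:

    # settings.py -- sketch only
    REST_FRAMEWORK = {
        'DEFAULT_RENDERER_CLASSES': (
            'drf_ujson.renderers.UJSONRenderer',  # ujson-based JSON rendering
            'rest_framework.renderers.BrowsableAPIRenderer',
        ),
    }

This only speeds up serializing the response body, so it may help within the current timeout, but it doesn't change how much data is generated per request.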

@nemesifier (Member)

some other possible performance improvements were outlined here:
http://wiki.ninux.org/GSoCIdeas2015#Nodeshot:_performance_improvements

but I think the best thing would be to work on #116 and implement loading data incrementally through the websocket protocol; in theory it should be very fast!
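
As a rough illustration of the incremental-loading idea (this is not the #116 implementation, and the names below are made up), the server could split the FeatureCollection into small chunks and push them one websocket message at a time, so the map can start rendering before the whole layer has been transferred:

    import json

    def iter_geojson_chunks(features, chunk_size=500):
        """Yield JSON-encoded FeatureCollections of at most chunk_size features."""
        for start in range(0, len(features), chunk_size):
            yield json.dumps({
                'type': 'FeatureCollection',
                'features': features[start:start + chunk_size],
            })

    # Dummy point features for demonstration; in practice each chunk would be
    # sent as a separate websocket message and merged into the map layer
    # client-side as it arrives.
    features = [
        {'type': 'Feature',
         'geometry': {'type': 'Point', 'coordinates': [2.17, 41.38]},
         'properties': {'name': 'node-%d' % i}}
        for i in range(1200)
    ]
    for message in iter_geojson_chunks(features):
        pass  # e.g. websocket.send(message)
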

nemesifier added this to the Beta milestone May 19, 2015
@nemesifier (Member)

It seems that this is a common problem.

I've found an RFC that deals with exactly this problem:
https://tools.ietf.org/html/rfc7464

   [...] when serializing a large sequence of
   values as an array, or a possibly indeterminate-length or never-
   ending sequence of values, JSON becomes difficult to work with.

   Consider a sequence of one million values, each possibly one kilobyte
   when encoded -- roughly one gigabyte.  It is often desirable to
   process such a dataset in an incremental manner without having to
   first read all of it before beginning to produce results.
   Traditionally, the way to do this with JSON is to use a "streaming"
   parser, but these are not widely available, widely used, or easy to
   use.
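
For reference, the framing proposed by RFC 7464 is very simple: each JSON text is preceded by an ASCII record separator (0x1E) and followed by a line feed, so a consumer can parse one record at a time without buffering the whole payload. A minimal sketch (not something nodeshot does today):

    import json

    RS = '\x1e'  # RFC 7464 record separator
    LF = '\n'

    def encode_json_seq(records):
        """Yield one RFC 7464-framed JSON text per record."""
        for record in records:
            yield RS + json.dumps(record) + LF

    def decode_json_seq(stream):
        """Parse a JSON text sequence (a plain string here, for brevity)."""
        for chunk in stream.split(RS):
            chunk = chunk.strip()
            if chunk:
                yield json.loads(chunk)

    nodes = [{'id': i, 'lat': 41.0 + i, 'lng': 2.0 + i} for i in range(3)]
    stream = ''.join(encode_json_seq(nodes))
    assert list(decode_json_seq(stream)) == nodes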
