msgpack - ValueError: 2369781118 exceeds max_bin_len(2147483647) #262

Open
RichardScottOZ opened this issue Jan 22, 2024 · 1 comment

RichardScottOZ commented Jan 22, 2024

I attempted a join of around 250K polygons to 900K points on a LocalCluster. The other way around did work a couple of days ago. I haven't tried a cold machine start or a cluster restart yet. I saw a similar error reported for dask in 2021 with a large graph, which someone said an update fixed.

[Realised I was only using a smaller one; I need to do 3M.] Presumably I can get this to work for now by chopping off the extra 10% or so needed to stay under that long int boundary.
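
As a rough check of the "extra 10%" estimate, the size in the error message can be compared with the limit. This is only arithmetic on the two numbers from the ValueError in the traceback below; the variable names are illustrative, and it assumes the payload shrinks roughly in proportion to the amount of data, which is not verified here.

```python
# Numbers taken from the ValueError in the traceback below.
reported_len = 2369781118   # size of the binary payload msgpack refused
max_bin_len = 2147483647    # limit quoted in the error message

# The limit is the largest signed 32-bit integer.
assert max_bin_len == 2**31 - 1

excess_bytes = reported_len - max_bin_len
# Fraction by which the payload exceeds the limit.
over_limit = excess_bytes / max_bin_len
# Fraction of the payload that would have to be removed to fit under the limit.
to_trim = excess_bytes / reported_len

print(f"excess: {excess_bytes} bytes")
print(f"over the limit by {over_limit:.1%}")
print(f"needs to shrink by {to_trim:.1%}")
```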

C:\Users\rscott\AppData\Local\miniconda3\envs\daskgeopandas\lib\site-packages\distributed\client.py:3162: UserWarning: Sending large graph of size 2.21 GiB.
This may cause some slowdown.
Consider scattering data ahead of time and using futures.
  warnings.warn(
---------------------------------------------------------------------------
CancelledError                            Traceback (most recent call last)
File <timed exec>:1

File ~\AppData\Local\miniconda3\envs\daskgeopandas\lib\site-packages\dask\base.py:379, in DaskMethodsMixin.compute(self, **kwargs)
    355 def compute(self, **kwargs):
    356     """Compute this dask collection
    357 
    358     This turns a lazy Dask collection into its in-memory equivalent.
   (...)
    377     dask.compute
    378     """
--> 379     (result,) = compute(self, traverse=False, **kwargs)
    380     return result

File ~\AppData\Local\miniconda3\envs\daskgeopandas\lib\site-packages\dask\base.py:665, in compute(traverse, optimize_graph, scheduler, get, *args, **kwargs)
    662     postcomputes.append(x.__dask_postcompute__())
    664 with shorten_traceback():
--> 665     results = schedule(dsk, keys, **kwargs)
    667 return repack([f(r, *a) for r, (f, a) in zip(results, postcomputes)])

File ~\AppData\Local\miniconda3\envs\daskgeopandas\lib\site-packages\distributed\client.py:2244, in Client._gather(self, futures, errors, direct, local_worker)
   2242     else:
   2243         raise exception.with_traceback(traceback)
-> 2244     raise exc
   2245 if errors == "skip":
   2246     bad_keys.add(key)

CancelledError: ('sjoin-38ea83416d6236ee710acd39e46f004c', 7)
2024-01-22 13:07:30,743 - distributed.core - ERROR - Exception while handling op register-client
Traceback (most recent call last):
  File "C:\Users\rscott\AppData\Local\miniconda3\envs\daskgeopandas\lib\site-packages\distributed\core.py", line 969, in _handle_comm
    result = await result
  File "C:\Users\rscott\AppData\Local\miniconda3\envs\daskgeopandas\lib\site-packages\distributed\scheduler.py", line 5602, in add_client
    await self.handle_stream(comm=comm, extra={"client": client})
  File "C:\Users\rscott\AppData\Local\miniconda3\envs\daskgeopandas\lib\site-packages\distributed\core.py", line 1024, in handle_stream
    msgs = await comm.read()
  File "C:\Users\rscott\AppData\Local\miniconda3\envs\daskgeopandas\lib\site-packages\distributed\comm\tcp.py", line 248, in read
    msg = await from_frames(
  File "C:\Users\rscott\AppData\Local\miniconda3\envs\daskgeopandas\lib\site-packages\distributed\comm\utils.py", line 78, in from_frames
    res = _from_frames()
  File "C:\Users\rscott\AppData\Local\miniconda3\envs\daskgeopandas\lib\site-packages\distributed\comm\utils.py", line 61, in _from_frames
    return protocol.loads(
  File "C:\Users\rscott\AppData\Local\miniconda3\envs\daskgeopandas\lib\site-packages\distributed\protocol\core.py", line 175, in loads
    return msgpack.loads(
  File "msgpack/_unpacker.pyx", line 194, in msgpack._cmsgpack.unpackb
ValueError: 2369813480 exceeds max_bin_len(2147483647)
Task exception was never retrieved
future: <Task finished name='Task-1911149' coro=<Server._handle_comm() done, defined at C:\Users\rscott\AppData\Local\miniconda3\envs\daskgeopandas\lib\site-packages\distributed\core.py:875> exception=ValueError('2369813480 exceeds max_bin_len(2147483647)')>
Traceback (most recent call last):
  File "C:\Users\rscott\AppData\Local\miniconda3\envs\daskgeopandas\lib\site-packages\distributed\core.py", line 969, in _handle_comm
    result = await result
  File "C:\Users\rscott\AppData\Local\miniconda3\envs\daskgeopandas\lib\site-packages\distributed\scheduler.py", line 5602, in add_client
    await self.handle_stream(comm=comm, extra={"client": client})
  File "C:\Users\rscott\AppData\Local\miniconda3\envs\daskgeopandas\lib\site-packages\distributed\core.py", line 1024, in handle_stream
    msgs = await comm.read()
  File "C:\Users\rscott\AppData\Local\miniconda3\envs\daskgeopandas\lib\site-packages\distributed\comm\tcp.py", line 248, in read
    msg = await from_frames(
  File "C:\Users\rscott\AppData\Local\miniconda3\envs\daskgeopandas\lib\site-packages\distributed\comm\utils.py", line 78, in from_frames
    res = _from_frames()
  File "C:\Users\rscott\AppData\Local\miniconda3\envs\daskgeopandas\lib\site-packages\distributed\comm\utils.py", line 61, in _from_frames
    return protocol.loads(
  File "C:\Users\rscott\AppData\Local\miniconda3\envs\daskgeopandas\lib\site-packages\distributed\protocol\core.py", line 175, in loads
    return msgpack.loads(
  File "msgpack/_unpacker.pyx", line 194, in msgpack._cmsgpack.unpackb
ValueError: 2369813480 exceeds max_bin_len(2147483647)
2024-01-22 13:13:07,709 - distributed.protocol.core - CRITICAL - Failed to deserialize
Traceback (most recent call last):
  File "C:\Users\rscott\AppData\Local\miniconda3\envs\daskgeopandas\lib\site-packages\distributed\protocol\core.py", line 175, in loads
    return msgpack.loads(
  File "msgpack/_unpacker.pyx", line 194, in msgpack._cmsgpack.unpackb
ValueError: 2369781116 exceeds max_bin_len(2147483647)
2024-01-22 13:13:07,755 - distributed.core - ERROR - Exception while handling op register-client
Traceback (most recent call last):
  File "C:\Users\rscott\AppData\Local\miniconda3\envs\daskgeopandas\lib\site-packages\distributed\core.py", line 969, in _handle_comm
    result = await result
  File "C:\Users\rscott\AppData\Local\miniconda3\envs\daskgeopandas\lib\site-packages\distributed\scheduler.py", line 5602, in add_client
    await self.handle_stream(comm=comm, extra={"client": client})
  File "C:\Users\rscott\AppData\Local\miniconda3\envs\daskgeopandas\lib\site-packages\distributed\core.py", line 1024, in handle_stream
    msgs = await comm.read()
  File "C:\Users\rscott\AppData\Local\miniconda3\envs\daskgeopandas\lib\site-packages\distributed\comm\tcp.py", line 248, in read
    msg = await from_frames(
  File "C:\Users\rscott\AppData\Local\miniconda3\envs\daskgeopandas\lib\site-packages\distributed\comm\utils.py", line 78, in from_frames
    res = _from_frames()
  File "C:\Users\rscott\AppData\Local\miniconda3\envs\daskgeopandas\lib\site-packages\distributed\comm\utils.py", line 61, in _from_frames
    return protocol.loads(
  File "C:\Users\rscott\AppData\Local\miniconda3\envs\daskgeopandas\lib\site-packages\distributed\protocol\core.py", line 175, in loads
    return msgpack.loads(
  File "msgpack/_unpacker.pyx", line 194, in msgpack._cmsgpack.unpackb
ValueError: 2369781116 exceeds max_bin_len(2147483647)
Task exception was never retrieved
future: <Task finished name='Task-2018400' coro=<Server._handle_comm() done, defined at C:\Users\rscott\AppData\Local\miniconda3\envs\daskgeopandas\lib\site-packages\distributed\core.py:875> exception=ValueError('2369781116 exceeds max_bin_len(2147483647)')>
Traceback (most recent call last):
  File "C:\Users\rscott\AppData\Local\miniconda3\envs\daskgeopandas\lib\site-packages\distributed\core.py", line 969, in _handle_comm
    result = await result
  File "C:\Users\rscott\AppData\Local\miniconda3\envs\daskgeopandas\lib\site-packages\distributed\scheduler.py", line 5602, in add_client
    await self.handle_stream(comm=comm, extra={"client": client})
  File "C:\Users\rscott\AppData\Local\miniconda3\envs\daskgeopandas\lib\site-packages\distributed\core.py", line 1024, in handle_stream
    msgs = await comm.read()
  File "C:\Users\rscott\AppData\Local\miniconda3\envs\daskgeopandas\lib\site-packages\distributed\comm\tcp.py", line 248, in read
    msg = await from_frames(
  File "C:\Users\rscott\AppData\Local\miniconda3\envs\daskgeopandas\lib\site-packages\distributed\comm\utils.py", line 78, in from_frames
    res = _from_frames()
  File "C:\Users\rscott\AppData\Local\miniconda3\envs\daskgeopandas\lib\site-packages\distributed\comm\utils.py", line 61, in _from_frames
    return protocol.loads(
  File "C:\Users\rscott\AppData\Local\miniconda3\envs\daskgeopandas\lib\site-packages\distributed\protocol\core.py", line 175, in loads
    return msgpack.loads(
  File "msgpack/_unpacker.pyx", line 194, in msgpack._cmsgpack.unpackb
ValueError: 2369781116 exceeds max_bin_len(2147483647)
2024-01-22 13:17:51,917 - distributed.protocol.core - CRITICAL - Failed to deserialize
Traceback (most recent call last):
  File "C:\Users\rscott\AppData\Local\miniconda3\envs\daskgeopandas\lib\site-packages\distributed\protocol\core.py", line 175, in loads
    return msgpack.loads(
  File "msgpack/_unpacker.pyx", line 194, in msgpack._cmsgpack.unpackb
ValueError: 2369781118 exceeds max_bin_len(2147483647)
2024-01-22 13:17:51,965 - distributed.core - ERROR - Exception while handling op register-client
Traceback (most recent call last):
  File "C:\Users\rscott\AppData\Local\miniconda3\envs\daskgeopandas\lib\site-packages\distributed\core.py", line 969, in _handle_comm
    result = await result
  File "C:\Users\rscott\AppData\Local\miniconda3\envs\daskgeopandas\lib\site-packages\distributed\scheduler.py", line 5602, in add_client
    await self.handle_stream(comm=comm, extra={"client": client})
  File "C:\Users\rscott\AppData\Local\miniconda3\envs\daskgeopandas\lib\site-packages\distributed\core.py", line 1024, in handle_stream
    msgs = await comm.read()
  File "C:\Users\rscott\AppData\Local\miniconda3\envs\daskgeopandas\lib\site-packages\distributed\comm\tcp.py", line 248, in read
    msg = await from_frames(
  File "C:\Users\rscott\AppData\Local\miniconda3\envs\daskgeopandas\lib\site-packages\distributed\comm\utils.py", line 78, in from_frames
    res = _from_frames()
  File "C:\Users\rscott\AppData\Local\miniconda3\envs\daskgeopandas\lib\site-packages\distributed\comm\utils.py", line 61, in _from_frames
    return protocol.loads(
  File "C:\Users\rscott\AppData\Local\miniconda3\envs\daskgeopandas\lib\site-packages\distributed\protocol\core.py", line 175, in loads
    return msgpack.loads(
  File "msgpack/_unpacker.pyx", line 194, in msgpack._cmsgpack.unpackb
ValueError: 2369781118 exceeds max_bin_len(2147483647)
Task exception was never retrieved
future: <Task finished name='Task-2222100' coro=<Server._handle_comm() done, defined at C:\Users\rscott\AppData\Local\miniconda3\envs\daskgeopandas\lib\site-packages\distributed\core.py:875> exception=ValueError('2369781118 exceeds max_bin_len(2147483647)')>
Traceback (most recent call last):
  File "C:\Users\rscott\AppData\Local\miniconda3\envs\daskgeopandas\lib\site-packages\distributed\core.py", line 969, in _handle_comm
    result = await result
  File "C:\Users\rscott\AppData\Local\miniconda3\envs\daskgeopandas\lib\site-packages\distributed\scheduler.py", line 5602, in add_client
    await self.handle_stream(comm=comm, extra={"client": client})
  File "C:\Users\rscott\AppData\Local\miniconda3\envs\daskgeopandas\lib\site-packages\distributed\core.py", line 1024, in handle_stream
    msgs = await comm.read()
  File "C:\Users\rscott\AppData\Local\miniconda3\envs\daskgeopandas\lib\site-packages\distributed\comm\tcp.py", line 248, in read
    msg = await from_frames(
  File "C:\Users\rscott\AppData\Local\miniconda3\envs\daskgeopandas\lib\site-packages\distributed\comm\utils.py", line 78, in from_frames
    res = _from_frames()
  File "C:\Users\rscott\AppData\Local\miniconda3\envs\daskgeopandas\lib\site-packages\distributed\comm\utils.py", line 61, in _from_frames
    return protocol.loads(
  File "C:\Users\rscott\AppData\Local\miniconda3\envs\daskgeopandas\lib\site-packages\distributed\protocol\core.py", line 175, in loads
    return msgpack.loads(
  File "msgpack/_unpacker.pyx", line 194, in msgpack._cmsgpack.unpackb
ValueError: 2369781118 exceeds max_bin_len(2147483647)
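
One way to think about staying under the limit is to split the work so that no single serialized payload crosses it. The sketch below is a generic illustration using only the standard library: `split_by_pickled_size` is a hypothetical helper, not a dask, distributed, or msgpack API, and the summed per-item pickle size is only an approximation of what would actually be sent. Whether batching the join like this avoids the error on a real cluster is untested here.

```python
import pickle

# Limit quoted in the ValueError above (largest signed 32-bit integer).
MAX_BIN_LEN = 2**31 - 1


def split_by_pickled_size(items, max_bytes=MAX_BIN_LEN):
    """Greedily group items so each group's summed pickled size stays <= max_bytes.

    Hypothetical helper for illustration; not part of dask or msgpack.
    The sum of per-item sizes only approximates the size of a pickled batch.
    """
    batches, current, current_size = [], [], 0
    for item in items:
        size = len(pickle.dumps(item))
        if size > max_bytes:
            raise ValueError(f"single item of {size} bytes exceeds {max_bytes}")
        # Start a new batch when adding this item would cross the budget.
        if current and current_size + size > max_bytes:
            batches.append(current)
            current, current_size = [], 0
        current.append(item)
        current_size += size
    if current:
        batches.append(current)
    return batches


# Toy demonstration with a tiny budget instead of the real limit.
rows = [b"x" * 100 for _ in range(10)]
batches = split_by_pickled_size(rows, max_bytes=350)
print([len(b) for b in batches])
```

Each batch could then be joined separately and the results concatenated, at the cost of more round trips.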

mrocklin commented Jan 22, 2024 via email
