[BUG] Ray DAG mode access Mars WEB Dashboard error #3171

Open
chaokunyang opened this issue Jun 27, 2022 · 0 comments
Describe the bug
When accessing the task status for Ray DAG mode in the Mars dashboard, the status shown is incorrect. The following task has finished, so the graph should be green instead of blank:
image
image

To Reproduce
To help us reproduce this bug, please provide the information below:

  1. Your Python version: 3.8
  2. The version of Mars you use: [ray] Support scheduling ray tasks in Ray oscar deploy backend #3165
  3. Versions of crucial packages, such as numpy, scipy and pandas: pandas 1.4.2, numpy 1.19.5, scipy 1.8.1
  4. Full stack of the error.
(RayMainPool pid=54669) 2022-06-27 16:06:15,283 ERROR web.py:2239 -- 500 GET /api/session/SFJWHJesbcMFjANMcqTskZ6R/task/iQvlLWn2zC9z63HSWWbw5maQ/tileable_detail (127.0.0.1) 5.22ms
(RayMainPool pid=54669) 2022-06-27 16:06:16,283 ERROR core.py:82 -- TypeError when handling request with TaskWebAPIHandler.get_tileable_details
(RayMainPool pid=54669) Traceback (most recent call last):
(RayMainPool pid=54669)   File "/Users/chaokunyang/Desktop/chaokun/python/mars/mars/services/web/core.py", line 70, in wrapped
(RayMainPool pid=54669)     res = await self._create_or_get_url_future(
(RayMainPool pid=54669)   File "/Users/chaokunyang/Desktop/chaokun/python/mars/mars/services/task/api/web.py", line 132, in get_tileable_details
(RayMainPool pid=54669)     res = await oscar_api.get_tileable_details(task_id)
(RayMainPool pid=54669)   File "/Users/chaokunyang/Desktop/chaokun/python/mars/mars/services/task/api/oscar.py", line 77, in get_tileable_details
(RayMainPool pid=54669)     return await self._task_manager_ref.get_tileable_details(task_id)
(RayMainPool pid=54669)   File "mars/oscar/core.pyx", line 263, in __pyx_actor_method_wrapper
(RayMainPool pid=54669)     async with lock:
(RayMainPool pid=54669)   File "mars/oscar/core.pyx", line 266, in mars.oscar.core.__pyx_actor_method_wrapper
(RayMainPool pid=54669)     result = await result
(RayMainPool pid=54669)   File "/Users/chaokunyang/Desktop/chaokun/python/mars/mars/services/task/supervisor/manager.py", line 206, in get_tileable_details
(RayMainPool pid=54669)     return await processor_ref.get_tileable_details()
(RayMainPool pid=54669)   File "/Users/chaokunyang/Desktop/chaokun/python/mars/mars/oscar/backends/context.py", line 196, in send
(RayMainPool pid=54669)     return self._process_result_message(result)
(RayMainPool pid=54669)   File "/Users/chaokunyang/Desktop/chaokun/python/mars/mars/oscar/backends/context.py", line 76, in _process_result_message
(RayMainPool pid=54669)     raise message.as_instanceof_cause()
(RayMainPool pid=54669)   File "/Users/chaokunyang/Desktop/chaokun/python/mars/mars/oscar/backends/pool.py", line 586, in send
(RayMainPool pid=54669)     result = await self._run_coro(message.message_id, coro)
(RayMainPool pid=54669)   File "/Users/chaokunyang/Desktop/chaokun/python/mars/mars/oscar/backends/pool.py", line 343, in _run_coro
(RayMainPool pid=54669)     return await coro
(RayMainPool pid=54669)   File "/Users/chaokunyang/Desktop/chaokun/python/mars/mars/oscar/api.py", line 120, in __on_receive__
(RayMainPool pid=54669)     return await super().__on_receive__(message)
(RayMainPool pid=54669)   File "mars/oscar/core.pyx", line 523, in __on_receive__
(RayMainPool pid=54669)     raise ex
(RayMainPool pid=54669)   File "mars/oscar/core.pyx", line 516, in mars.oscar.core._BaseActor.__on_receive__
(RayMainPool pid=54669)     return await self._handle_actor_result(result)
(RayMainPool pid=54669)   File "mars/oscar/core.pyx", line 401, in _handle_actor_result
(RayMainPool pid=54669)     task_result = await coros[0]
(RayMainPool pid=54669)   File "mars/oscar/core.pyx", line 444, in mars.oscar.core._BaseActor._run_actor_async_generator
(RayMainPool pid=54669)     async with self._lock:
(RayMainPool pid=54669)   File "mars/oscar/core.pyx", line 445, in mars.oscar.core._BaseActor._run_actor_async_generator
(RayMainPool pid=54669)     with debug_async_timeout('actor_lock_timeout',
(RayMainPool pid=54669)   File "mars/oscar/core.pyx", line 450, in mars.oscar.core._BaseActor._run_actor_async_generator
(RayMainPool pid=54669)     res = await gen.athrow(*res)
(RayMainPool pid=54669)   File "/Users/chaokunyang/Desktop/chaokun/python/mars/mars/services/task/supervisor/task.py", line 159, in get_tileable_details
(RayMainPool pid=54669)     tileable_to_details = yield asyncio.to_thread(self._get_tileable_infos)
(RayMainPool pid=54669)   File "mars/oscar/core.pyx", line 455, in mars.oscar.core._BaseActor._run_actor_async_generator
(RayMainPool pid=54669)     res = await self._handle_actor_result(res)
(RayMainPool pid=54669)   File "mars/oscar/core.pyx", line 375, in _handle_actor_result
(RayMainPool pid=54669)     result = await result
(RayMainPool pid=54669)   File "/Users/chaokunyang/Desktop/chaokun/python/mars/mars/lib/aio/_threads.py", line 36, in to_thread
(RayMainPool pid=54669)     return await loop.run_in_executor(None, func_call)
(RayMainPool pid=54669)   File "/Users/chaokunyang/anaconda3/envs/mars3.8/lib/python3.8/concurrent/futures/thread.py", line 57, in run
(RayMainPool pid=54669)     result = self.fn(*self.args, **self.kwargs)
(RayMainPool pid=54669)   File "/Users/chaokunyang/Desktop/chaokun/python/mars/mars/services/task/supervisor/task.py", line 100, in _get_tileable_infos
(RayMainPool pid=54669)     subtask_id_to_results = self._get_all_subtask_results()
(RayMainPool pid=54669)   File "/Users/chaokunyang/Desktop/chaokun/python/mars/mars/services/task/supervisor/task.py", line 64, in _get_all_subtask_results
(RayMainPool pid=54669)     for stage in processor.stage_processors:
(RayMainPool pid=54669) TypeError: [address=ray://ray-cluster-1656317089/0/0, pid=54669] 'NoneType' object is not iterable
(RayMainPool pid=54669) 2022-06-27 16:06:16,289 ERROR web.py:2239 -- 500 GET /api/session/SFJWHJesbcMFjANMcqTskZ6R/task/iQvlLWn2zC9z63HSWWbw5maQ/tileable_detail (127.0.0.1) 8.59ms
  5. Minimized code to reproduce the error.
    pytest -v -s mars/deploy/oscar/tests/test_ray_dag_oscar.py::test_iterative_tiling:
@require_ray
@pytest.mark.asyncio
async def test_iterative_tiling(ray_start_regular_shared2, create_cluster):
    await test_local.test_iterative_tiling(create_cluster)
    # Keep the cluster alive so the dashboard can be opened in a browser.
    time.sleep(100000)
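
One reading of the traceback above is that, in Ray DAG mode, a task processor's `stage_processors` is `None` when `_get_all_subtask_results` iterates over it. The sketch below only illustrates that failure mode and a defensive guard. `FakeStage`, `FakeProcessor`, the `subtask_results` attribute and the standalone `get_all_subtask_results` function are hypothetical stand-ins, not Mars classes or APIs, and the cause itself is an assumption that has not been checked against the Mars source.

```python
from typing import Dict, List, Optional


class FakeStage:
    """Hypothetical stand-in for a stage processor holding subtask results."""

    def __init__(self, subtask_results: Dict[str, str]):
        self.subtask_results = subtask_results


class FakeProcessor:
    """Hypothetical stand-in for a task processor.

    stage_processors defaults to None to mimic what the traceback suggests
    happens in Ray DAG mode (an assumption, not confirmed from Mars source).
    """

    def __init__(self, stage_processors: Optional[List[FakeStage]] = None):
        self.stage_processors = stage_processors


def get_all_subtask_results(processors: List[FakeProcessor]) -> Dict[str, str]:
    """Collect subtask results, skipping processors that have no stages."""
    results: Dict[str, str] = {}
    for processor in processors:
        # `or []` avoids "'NoneType' object is not iterable" when
        # stage_processors is None.
        for stage in processor.stage_processors or []:
            results.update(stage.subtask_results)
    return results
```

A guard like this would only stop the 500 response. The dashboard would still have no subtask status to draw unless the Ray DAG execution path reports results in some other way.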
