radical-analytics-inspect warnings and NoneType error #117

Open
mturilli opened this issue Feb 25, 2020 · 7 comments

@mturilli
Contributor

$ radical-stack 

  python               : 3.6.9
  pythonpath           : 
  virtualenv           : /home/mturilli/Virtualenvs/rp-paper-frontera

  radical.analytics    : 0.90.7-v0.72.0-38-g14b9581@devel
  radical.pilot        : 1.1.1-v1.1.1-9-g353c5876e@devel
  radical.saga         : 1.1.0-v1.1-10-g4cfdc77f@devel
  radical.utils        : 1.1.1-v1.1.1-14-g77ca0db@devel

Warnings and error:

$ ~/github/radical.analytics/bin/radical-analytics-inspect `pwd`/rp.session.login2.frontera.tacc.utexas.edu.mturilli.018316.0002
rp.session.login2.frontera.tacc.utexas.edu.mturilli.018316.0002 cache read failed: Ran out of input
WARNING: profile "/home/mturilli/github/experiments/rp.paper/rawdata/spatial_heterogeneity/rp.session.login2.frontera.tacc.utexas.edu.mturilli.018316.0002/umgr_unschedule_pubsub.prof" not correctly closed.
WARNING: profile "/home/mturilli/github/experiments/rp.paper/rawdata/spatial_heterogeneity/rp.session.login2.frontera.tacc.utexas.edu.mturilli.018316.0002/umgr_scheduling_queue.prof" not correctly closed.
WARNING: profile "/home/mturilli/github/experiments/rp.paper/rawdata/spatial_heterogeneity/rp.session.login2.frontera.tacc.utexas.edu.mturilli.018316.0002/cmgr.0000.hb.prof" not correctly closed.
WARNING: profile "/home/mturilli/github/experiments/rp.paper/rawdata/spatial_heterogeneity/rp.session.login2.frontera.tacc.utexas.edu.mturilli.018316.0002/log_pubsub.prof" not correctly closed.
session loaded
Traceback (most recent call last):
  File "/home/mturilli/github/radical.analytics/bin/rp_inspect/plot_state.py", line 100, in <module>
    key=lambda v: v[1][index])]
TypeError: '<' not supported between instances of 'NoneType' and 'float'
...Traceback (most recent call last):
  File "/home/mturilli/github/radical.analytics/bin/rp_inspect/plot_util.py", line 116, in <module>
    prov, cons, stats_abs, stats_rel, info = session.utilization(metrics)
  File "/home/mturilli/Virtualenvs/rp-paper-frontera/lib/python3.6/site-packages/radical/analytics/session.py", line 975, in utilization
    provided  = rp.utils.get_provided_resources(self)
  File "/home/mturilli/Virtualenvs/rp-paper-frontera/lib/python3.6/site-packages/radical/pilot/utils/prof_utils.py", line 479, in get_provided_resources
    data = _get_pilot_provision(session, p)
  File "/home/mturilli/Virtualenvs/rp-paper-frontera/lib/python3.6/site-packages/radical/pilot/utils/prof_utils.py", line 437, in _get_pilot_provision
    cpn   = pilot.cfg['resource_details']['rm_info']['cores_per_node']
TypeError: 'NoneType' object is not subscriptable
 done
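
For reference, the first traceback (`'<' not supported between instances of 'NoneType' and 'float'`) is what Python 3 raises when a sort key returns `None` for some entries and a float for others. The following is a minimal sketch of a None-tolerant sort key, not the actual code in `plot_state.py`; the `(uid, values)` data layout and the helper names `none_last_key` and `sort_by_index` are assumptions made for illustration only.

```python
def none_last_key(value):
    """Build a sort key that never compares None with a number."""
    # Tuples compare element by element: False sorts before True, so
    # real values come first. The 0.0 placeholder stands in for None,
    # which means None itself is never passed to '<'.
    return (value is None, 0.0 if value is None else value)


def sort_by_index(items, index):
    """Sort (uid, values) pairs by values[index], with None entries last."""
    return sorted(items, key=lambda v: none_last_key(v[1][index]))


# Hypothetical data: a None timestamp marks an event that was never recorded.
things = [
    ("unit.0", [3.2, None]),
    ("unit.1", [1.5, 4.0]),
    ("unit.2", [None, 2.0]),
]

ordered = sort_by_index(things, 0)
print([uid for uid, _ in ordered])
```

Whether entries with missing timestamps should be sorted last, dropped, or reported depends on what the plot is meant to show; the sketch only demonstrates how to avoid the `TypeError`.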
@mturilli
Contributor Author

@andre-merzky ping

@lee212
Contributor

lee212 commented Nov 30, 2020

Same experience with recent session data:

$ bin/rp_inspect/plot_util.py re.session.login2.iyakushin.018593.0000
Traceback (most recent call last):
  File "bin/rp_inspect/plot_util.py", line 118, in <module>
    prov, cons, stats_abs, stats_rel, info = session.utilization(metrics)
  File "/ccs/home/hrlee/.conda/envs/ipynb/lib/python3.7/site-packages/radical/analytics/session.py", line 990, in utilization
    provided  = rp.utils.get_provided_resources(self)
  File "/ccs/home/hrlee/.conda/envs/ipynb/lib/python3.7/site-packages/radical/pilot/utils/prof_utils.py", line 856, in get_provided_resources
    data = _get_pilot_provision(p)
  File "/ccs/home/hrlee/.conda/envs/ipynb/lib/python3.7/site-packages/radical/pilot/utils/prof_utils.py", line 814, in _get_pilot_provision
    cpn   = pilot.cfg['resource_details']['rm_info']['cores_per_node']
TypeError: 'NoneType' object is not subscriptable

The session is here: https://github.com/radical-experiments/deepdriveMD/tree/master/data/async
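
The `'NoneType' object is not subscriptable` error in this traceback means that one level of the chained lookup `pilot.cfg['resource_details']['rm_info']['cores_per_node']` evaluated to `None`. A minimal sketch of a defensive lookup that reports the missing data explicitly is shown below. It assumes only the nested key layout visible in the traceback; `get_nested` and `cores_per_node` are hypothetical helpers, not part of radical.pilot or radical.analytics.

```python
def get_nested(cfg, *keys):
    """Walk nested dicts; return None if any level is missing or None."""
    node = cfg
    for key in keys:
        if not isinstance(node, dict):
            # A parent level was None (or not a dict): stop instead of
            # raising "'NoneType' object is not subscriptable".
            return None
        node = node.get(key)
    return node


def cores_per_node(cfg):
    """Return cores_per_node or fail with a message naming what is missing."""
    cpn = get_nested(cfg, 'resource_details', 'rm_info', 'cores_per_node')
    if cpn is None:
        raise ValueError(
            "pilot cfg has no resource_details/rm_info/cores_per_node")
    return cpn


# Hypothetical pilot configurations, one complete and one with the
# resource details missing as in the sessions reported here.
good_cfg = {'resource_details': {'rm_info': {'cores_per_node': 56}}}
bad_cfg = {'resource_details': None}

print(cores_per_node(good_cfg))
print(get_nested(bad_cfg, 'resource_details', 'rm_info', 'cores_per_node'))
```

A guard like this does not explain why the resource details are missing from the session; it only turns the failure into a clearer error message.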

@mturilli mturilli self-assigned this Dec 2, 2020
@lee212
Contributor

lee212 commented Dec 6, 2020

I added another session data set, which produces the same error message: re.session.login2.iyakushin.018598.0002.tar.gz

@andre-merzky
Member

Which radical stack are you using now?

@andre-merzky
Member

andre-merzky commented Dec 6, 2020

Never mind, found that:

 radical.pilot version: 1.5.4
 radical.saga  version: 1.5.6
 radical.utils version: 1.5.4

I am quite surprised that this is still an issue: the stack is up to date, and I can't really see how the resource details go missing (the log shows they are written to the DB all right).

I prepared an RP branch, fix/issue_ra_117, to dig into this a bit deeper. That branch removes the client-side setting for resource_details completely, so we should be able to tell whether the client side or the pilot side is at fault. Can you please give this a try and see how that goes? Thanks!

@lee212
Contributor

lee212 commented Dec 9, 2020

@andre-merzky, I tried the branch and I still see the same error. Does this mean that the fault is on the pilot side?

@andre-merzky
Member

Let me try to reproduce this, please. Can you provide a small example script, ideally with a fake workload (as presumably the workload should not matter) which I can run? Do you see the same problem on other resources too (with that script)?
