sapphire RuntimeError causes fatal crash of updatehistograms job #191

Open
tomkooij opened this issue Nov 4, 2019 · 3 comments

Comments

@tomkooij
Member

tomkooij commented Nov 4, 2019

On Oct 30, the updatehistograms run over the Oct 29 data hit an error in sapphire while reconstructing events. The cause was an invalid config uploaded by s9; the problem resolved itself once s9 uploaded a new config.

However, this error stops the entire updatehistograms job. Should we wrap these calls in try: except: to prevent this, or solve it somewhere else?
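A minimal sketch of what such a wrapper could look like, assuming the job iterates over summaries and calls one task function per summary. The names `run_tasks_tolerantly` and `fake_reconstruct` are hypothetical and are not part of publicdb or sapphire; the real job structure in `jobs.py` may differ.

```python
import logging

logger = logging.getLogger(__name__)


def run_tasks_tolerantly(summaries, task):
    """Run task(summary) for each summary; log failures instead of aborting."""
    results, failed = [], []
    for summary in summaries:
        try:
            results.append(task(summary))
        except Exception:
            # logger.exception logs at ERROR level and includes the traceback.
            logger.exception("Task failed for %r; skipping", summary)
            failed.append(summary)
    return results, failed


def fake_reconstruct(summary):
    # Hypothetical stand-in for the reconstruction step; fails like s9 did.
    if summary == "s9":
        raise IndexError("list index out of range")
    return "esd-" + summary


results, failed = run_tasks_tolerantly(["s8", "s9", "s10"], fake_reconstruct)
```

With this shape, a bad config at one station is logged and skipped while the remaining summaries are still processed. The failed summaries are returned so they can be retried on a later run.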

The IndexError (list index out of range) was raised at: https://github.com/HiSPARC/sapphire/blob/master/sapphire/analysis/core_reconstruction.py#L69
It occurred because s9 uploaded a config without slave data while the station has 4 detectors.

Log:


DEBUG:publicdb.histograms.jobs:Determining detector timing offsets for Summary: 9 - 29 Oct 2019
DEBUG:publicdb.histograms.jobs:Saving detector timing offsets for Summary: 9 - 29 Oct 2019
DEBUG:publicdb.histograms.jobs:Saved succesfully
ERROR:sentry.errors.serializer:the file object is closed
Traceback (most recent call last):
  File "/srv/publicdb/publicdb_venv/lib/python2.7/site-packages/raven/utils/serializer/manager.py", line 76, in transform
    return repr(value)
  File "tables/tableextension.pyx", line 1634, in tables.tableextension.Row.__repr__
  File "tables/tableextension.pyx", line 1626, in tables.tableextension.Row.__str__
  File "tables/tableextension.pyx", line 746, in tables.tableextension.Row.table.__get__
  File "/srv/publicdb/publicdb_venv/lib/python2.7/site-packages/tables/file.py", line 2159, in _check_open
    raise ClosedFileError("the file object is closed")
ClosedFileError: the file object is closed
ERROR:sentry.errors.serializer:the file object is closed
Traceback (most recent call last):
  File "/srv/publicdb/publicdb_venv/lib/python2.7/site-packages/raven/utils/serializer/manager.py", line 76, in transform
    return repr(value)
  File "tables/tableextension.pyx", line 1634, in tables.tableextension.Row.__repr__
  File "tables/tableextension.pyx", line 1626, in tables.tableextension.Row.__str__
  File "tables/tableextension.pyx", line 746, in tables.tableextension.Row.table.__get__
  File "/srv/publicdb/publicdb_venv/lib/python2.7/site-packages/tables/file.py", line 2159, in _check_open
    raise ClosedFileError("the file object is closed")
ClosedFileError: the file object is closed
Traceback (most recent call last):
  File "/srv/publicdb/www/manage.py", line 10, in <module>
    execute_from_command_line(sys.argv)
  File "/srv/publicdb/publicdb_venv/lib/python2.7/site-packages/django/core/management/__init__.py", line 364, in execute_from_command_line
    utility.execute()
  File "/srv/publicdb/publicdb_venv/lib/python2.7/site-packages/django/core/management/__init__.py", line 356, in execute
    self.fetch_command(subcommand).run_from_argv(self.argv)
  File "/srv/publicdb/publicdb_venv/lib/python2.7/site-packages/django/core/management/base.py", line 283, in run_from_argv
    self.execute(*args, **cmd_options)
  File "/srv/publicdb/publicdb_venv/lib/python2.7/site-packages/django/core/management/base.py", line 330, in execute
    output = self.handle(*args, **options)
  File "/srv/publicdb/www/publicdb/histograms/management/commands/updatehistograms.py", line 23, in handle
    completed = update_all_histograms()
  File "/srv/publicdb/www/publicdb/histograms/jobs.py", line 61, in update_all_histograms
    perform_update_tasks()
  File "/srv/publicdb/www/publicdb/histograms/jobs.py", line 84, in perform_update_tasks
    update_histograms()
  File "/srv/publicdb/www/publicdb/histograms/jobs.py", line 201, in update_histograms
    perform_tasks_manager(Summary, "needs_update_events", perform_events_tasks)
  File "/srv/publicdb/www/publicdb/histograms/jobs.py", line 247, in perform_tasks_manager
    summary, tmp_locations = perform_certain_tasks(summary)
  File "/srv/publicdb/www/publicdb/histograms/jobs.py", line 265, in perform_events_tasks
    tmp_locations.append(esd.reconstruct_events_and_store_temporary_esd(summary))
  File "/srv/publicdb/www/publicdb/histograms/esd.py", line 174, in reconstruct_events_and_store_temporary_esd
    reconstruct.reconstruct_and_store()
  File "/srv/publicdb/publicdb_venv/lib/python2.7/site-packages/sapphire/analysis/reconstructions.py", line 116, in reconstruct_and_store
    self.reconstruct_cores(detector_ids=detector_ids)
  File "/srv/publicdb/publicdb_venv/lib/python2.7/site-packages/sapphire/analysis/reconstructions.py", line 147, in reconstruct_cores
    self.progress, initials)
  File "/srv/publicdb/publicdb_venv/lib/python2.7/site-packages/sapphire/analysis/core_reconstruction.py", line 100, in reconstruct_events
    for event, initial in events_init]
  File "/srv/publicdb/publicdb_venv/lib/python2.7/site-packages/sapphire/analysis/core_reconstruction.py", line 69, in reconstruct_event
    dx, dy, dz = self.station.detectors[id].get_coordinates()
IndexError: list index out of range
Sentry is attempting to send 1 pending error messages
Waiting up to 10 seconds
Press Ctrl-C to quit

(Ignore the Sentry errors; it's the IndexError that caused the crash.) The error is also logged at sentry.io.

@davidfokkema
Member

It's always better to catch errors and log them, but since the Raspberry Pi at s9 will be replaced with a W10 PC it is very unlikely that we will encounter this problem again. Also, this problem hasn't occurred during the past few years, I think?

@tomkooij
Member Author

tomkooij commented Nov 4, 2019

@davidfokkema: This problem (at other stations) has occurred 4 times in the past few weeks according to sentry.io.

However, we have encountered it quite frequently when adding new stations.

But I'll just fix the sapphire side and leave the publicdb jobs alone as long as these errors do not occur frequently. (If it ain't broken don't fix it)
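One possible shape for a sapphire-side guard, sketched under assumptions: check each detector id against the number of detectors in the station config before indexing, so that a config with fewer detectors than the event data does not raise an IndexError. The `Detector` class and `coordinates_for` helper below are hypothetical stand-ins written for illustration; they are not sapphire's actual classes or its actual fix.

```python
class Detector(object):
    """Hypothetical stand-in for a detector with fixed coordinates."""

    def __init__(self, x, y, z):
        self._xyz = (x, y, z)

    def get_coordinates(self):
        return self._xyz


def coordinates_for(detectors, detector_ids):
    """Return coordinates for ids present in `detectors`, plus the missing ids."""
    known, missing = [], []
    for id_ in detector_ids:
        if 0 <= id_ < len(detectors):
            known.append(detectors[id_].get_coordinates())
        else:
            # The config lists fewer detectors than the event data expects.
            missing.append(id_)
    return known, missing


# A config with only two detectors, while the event data refers to four.
two_detectors = [Detector(0, 0, 0), Detector(10, 0, 0)]
known, missing = coordinates_for(two_detectors, range(4))
```

Whether the caller should skip the event, reconstruct with the known detectors only, or report the station config as invalid is a design decision this sketch leaves open.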

@tomkooij
Member Author

tomkooij commented Nov 4, 2019

I thought I opened this over at hisparc/publicdb ... oops. This is a publicdb.histograms.jobs issue.

I'll create a new issue #192 to describe the problem in sapphire.
