
checkin from rogue runners #27

Open
jb098 opened this issue Jul 26, 2019 · 2 comments

Comments

@jb098
Collaborator

jb098 commented Jul 26, 2019

In mite/scenario.py, ScenarioManager.checkin_data works perfectly under normal circumstances.

If a runner is hanging around waiting to check back in from a previous test, this causes an exception.

It's not a problem that it causes an exception; the problem is that it's not clear what has caused it. We should try to exit gracefully and log what the issue might be to give the user a hint.

@jb098
Collaborator Author

jb098 commented Nov 13, 2019

@aecay is this still a problem? I haven't put any fixes in for it, but I know we've changed things a little bit. I tried to fix it once, but my solution hamstrung the performance and I couldn't work out why.

@aecay
Collaborator

aecay commented Nov 13, 2019

It still is. My sketch of a fix is this:

  • have each controller process give itself an ID (CID, controller ID)
  • put the CID on all the messages from controller -> runner
  • make the runners include the CID whenever they respond to a message
  • make the controller ignore messages with an unknown CID (currently it sometimes throws in this scenario, because it mixes invalid data from an old message into its valid data)
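
For illustration, the four steps above could be sketched roughly as follows. This is not mite's actual code: the `Controller` and `Runner` classes, the dict-shaped messages, and the `cid`/`payload` keys are all hypothetical stand-ins, and the real transport and message format are left out entirely.

```python
import logging
import uuid
from dataclasses import dataclass, field
from typing import Optional

logger = logging.getLogger("cid_sketch")


@dataclass
class Controller:
    # Hypothetical stand-in for a controller process, not mite's real class.
    # Step 1: each controller gives itself a random ID on creation.
    cid: str = field(default_factory=lambda: uuid.uuid4().hex)
    checked_in: list = field(default_factory=list)

    def make_message(self, payload):
        # Step 2: stamp every controller -> runner message with the CID.
        return {"cid": self.cid, "payload": payload}

    def handle_checkin(self, message):
        # Step 4: ignore replies whose CID is missing or belongs to
        # another controller, and log a hint instead of raising.
        if message.get("cid") != self.cid:
            logger.warning(
                "Ignoring check-in with unknown controller ID %r; "
                "a runner from a previous test may still be running.",
                message.get("cid"),
            )
            return False
        self.checked_in.append(message["payload"])
        return True


@dataclass
class Runner:
    # Hypothetical stand-in for a runner process.
    last_cid: Optional[str] = None

    def receive(self, message):
        # Remember which controller the work came from.
        self.last_cid = message["cid"]

    def checkin(self, payload):
        # Step 3: echo the CID back on every reply.
        return {"cid": self.last_cid, "payload": payload}
```

In this sketch a runner left over from an earlier test still carries the old controller's CID, so a freshly started controller drops its check-in and logs a warning rather than mixing the stale data into its own state. The extra traffic is one short string per message, though the real cost would need measuring.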

This will have some performance overhead, since it slightly swells the controller <-> runner traffic, but I hope it will not be too bad. I think it's a simple change, but I haven't gotten around to implementing it because I've been busy, and it will require careful testing for correctness and (especially) performance impacts.

Do you remember what your earlier attempt at a fix was? If you had a problem here, that could be an indication that it's not as simple as I'm imagining it to be... 😟
