Priam should order stop and start #753

Open
hashbrowncipher opened this issue Oct 31, 2018 · 2 comments

Comments

@hashbrowncipher
Contributor

Describe the bug
It is possible for a sequence of start and stop API calls to leave Cassandra in a state not matching the last call made.

To Reproduce
  1. Call /cassadmin/stop (don't wait for it to return)
  2. Call /cassadmin/start

Observed behavior
Cassandra is left down.

Expected behavior
Either of:
a) Cassandra stops but immediately starts again
b) The pending stop is cancelled.

Version: Priam 3.1.63

@arunagrawal84
Contributor

There are two ways we could solve this problem:

  1. Put a queue in front of the Cassandra start/stop operations and dequeue them in order.
  2. Use a lock that throws an error on simultaneous operations.

We have taken the second approach for the case where an operator issues multiple cluster management tasks (flush, compaction, snapshot, etc.). The expectation is that the operator or script waits for one operation to finish before executing another operation of the same type. E.g., the operator cannot run two compactions at once, but one flush and one compaction are OK.

In this case, stop and start are technically not "similar operations" but "dependent operations". I would prefer to throw an exception rather than enqueue operations, because with a queue the operator would need to know which operation came "first". I like the simplicity of the second request just failing with a message like: something is running, try later!
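
To make the second option concrete, here is a minimal sketch of a fail-fast guard. It is not Priam's actual code: `CassandraOperationGuard` and `runExclusively` are hypothetical names, and an `AtomicBoolean` stands in for whatever lock the existing flush/compaction handling uses.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical guard (not an existing Priam class): at most one start/stop
// runs at a time, and a concurrent request fails fast instead of queueing.
class CassandraOperationGuard {
    private final AtomicBoolean busy = new AtomicBoolean(false);

    /** Runs the operation, or throws if another start/stop is in flight. */
    void runExclusively(String name, Runnable operation) {
        // compareAndSet succeeds for exactly one caller at a time.
        if (!busy.compareAndSet(false, true)) {
            throw new IllegalStateException(
                "Cannot " + name + ": another start/stop is running. Try later.");
        }
        try {
            operation.run();
        } finally {
            busy.set(false); // release even if the operation throws
        }
    }
}
```

With this in place, a /cassadmin/start that arrives while /cassadmin/stop is still running would be rejected, so the caller knows it has to retry.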

Thoughts?

@hashbrowncipher
Contributor Author

I think that there should be a flag for the desired state (which already exists, iirc), and a thread to bring the state of the world into harmony with the desired state. Whenever the thread finishes an operation (be it stop or start), it should check "has the flag changed?" and if so, begin the loop again.
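
A rough sketch of that loop, to show the shape of the idea rather than a patch. `DesiredStateReconciler` and `ProcessControl` are made-up names, not Priam classes; in practice `reconcile()` would be called from a single worker thread each time the flag is set.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical sketch: a desired-state flag plus a loop that keeps acting
// until the flag stops changing underneath it.
class DesiredStateReconciler {
    // Assumed abstraction over the real start/stop machinery.
    interface ProcessControl {
        boolean isRunning();
        void start();
        void stop();
    }

    private final ProcessControl control;
    private final AtomicBoolean desiredRunning;

    DesiredStateReconciler(ProcessControl control, boolean initiallyDesired) {
        this.control = control;
        this.desiredRunning = new AtomicBoolean(initiallyDesired);
    }

    /** Called by the start/stop API handlers; only records the intent. */
    void setDesiredRunning(boolean desired) {
        desiredRunning.set(desired);
    }

    /** Brings the process in line with the flag; one caller at a time. */
    synchronized void reconcile() {
        while (true) {
            boolean desired = desiredRunning.get();
            if (control.isRunning() != desired) {
                if (desired) {
                    control.start();
                } else {
                    control.stop();
                }
            }
            // "Has the flag changed?" If not, we are done; otherwise loop.
            if (desiredRunning.get() == desired) {
                return;
            }
        }
    }
}
```

In the reproduction above, the start request would flip the flag while the stop is still in progress, and the loop would start Cassandra again as soon as the stop finishes (expected behavior a).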
