Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Progress now account for more things #228

Merged
merged 3 commits into from
Jan 8, 2021

Conversation

shuhaowu
Copy link
Contributor

@shuhaowu shuhaowu commented Jan 6, 2021

Related: #226

Added the current number of active data iterators as well as the number of rows copied per table into the progress struct. This will allow us to track the performance of a Ghostferry move with a bit more precision.

The code is a bit crappy right now as we get this functionality working and measuring in real moves. The reason it's crappy is because the StateTracker now has two roles: tracking the state which is vital for interrupt and resume, as well as measuring some optional performance metrics. These two objectives ideally would be refactored so they're not as closely intertwined.

@shuhaowu shuhaowu marked this pull request as ready for review January 7, 2021 12:55
@@ -32,6 +33,9 @@ type Progress struct {
BinlogStreamerLag float64 // seconds
Throttled bool

// The number of data iterators currently active.
ActiveDataIterators int
Copy link
Contributor

@tiwilliam tiwilliam Jan 7, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would be nice to show especially this one in the UI.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I think a ControlServer refactor is coming after #227 cc @Manan007224

Copy link
Contributor

@fjordan fjordan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall looks good, but I think we should just remove all the bytes written work until we implement it.

row_batch.go Outdated Show resolved Hide resolved
@@ -20,6 +20,8 @@ def test_progress_callback
assert_equal 1111, progress.last["Tables"]["gftest.test_table_1"]["TargetPaginationKey"]
assert_equal "completed", progress.last["Tables"]["gftest.test_table_1"]["CurrentAction"]

assert_equal 0, progress.last["ActiveDataIterators"]

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it possible that we receive more than one progress reports during the ghostferry run? If so, can you add a check that the ActiveDataIterators is >0 in one of the non-last progresses?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We do sometimes, so it will introduce flakiness. I've manually verified these feature for now.

we really a better integration testing environment...

@xliang6
Copy link
Contributor

xliang6 commented Jan 8, 2021

I agree with Forrest that we can hold off the code for EstimatedBytes till we know and have time to implement it. This PR looks good otherwise.

@shuhaowu shuhaowu merged commit 006d45d into Shopify:master Jan 8, 2021
@shuhaowu shuhaowu deleted the more-progress-info branch January 8, 2021 16:52
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants