Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix how to calculate error_count in flags['update_logs'] for courses #5580

Open
gabina opened this issue Jan 11, 2024 · 4 comments
Open

Fix how to calculate error_count in flags['update_logs'] for courses #5580

gabina opened this issue Jan 11, 2024 · 4 comments

Comments

@gabina
Copy link
Member

gabina commented Jan 11, 2024

What is happening?

Some context

Courses have a flags field with some metadadata. In particular, the update_logs flag has information about course updates. The error_count flag inside the update_logs counts how many errors occurs during the update process. See UpdateLogger.update_course method to understand where that flag is updated.

The UpdateLogger.update_course method gets called during the UpdateCourseStats.initialize with the following hash as the new log parameter:

'start_time' => @start_time.to_datetime,
'end_time' => @end_time.to_datetime,
'sentry_tag_uuid' => sentry_tag_uuid,
'error_count' => error_count

That error_count value comes from UpdateServiceErrorHelper.error_count definition, which basically keeps a counter for the error count. That counter is incremented each time update_error_stats gets called, and currently that method is only invoked in the ApiErrorHandling.log_error method.

The problem

The LiftWingApi class implements the log_error_batch method to log errors in Sentry at the batch level. That means, when you call get_revision_data for a batch of revision ids, the errors for the individual calls are accumulated and, once we're done with the batch, we log all the errors (if any) in a single Sentry log. This is to prevent a systematic problem where every request is failing to saturate our Sentry event quota.

The problem here is that the error_count counter gets incremented at the batch level too. Notice that if there are 30 errors from a batch of 50 revisions, log_error_batch method will get invoked only once, meaning the same thing for ApiErrorHandling.log_error and thus for update_error_stats. Because of that, the error_counter will be incremented by one, when there were 30 new errors.

TLDR; one option could be to make UpdateServiceErrorHelper.update_error_stats method take in an optional argument new_errors_count to increment the @error_count by that number (and not always by 1).
Notice that the ApiErrorHandling.log_error already has the actual "new errors count" value in the sentry_extra parameter.

To Reproduce

The "tracks update errors properly in LiftWing" spec in spec/services/update_course_stats_spec.rb shows that the 'error_count' flag correspond to the number of batches instead of to the number of individual errors.

Expected behavior

flags['update_logs'][n]['error_count'] should be the real number of errors that occurred during the update process.

@gabina
Copy link
Member Author

gabina commented Jan 12, 2024

rails and newcomer friendly are good labels for this issue (I don't think I have access to add them)

@mit-27
Copy link

mit-27 commented Jan 22, 2024

I would like to work on this issue. @ragesoss

@Joice-crypto
Copy link
Contributor

Hello @gabina, I'm working on this and I have a question. Did you mean in the sentry_extra: { rev_ids:, project_code: wiki_key, project_model: model_key, error_count: @errors.count }) the error_count: @errors.count refers to the new errors count?

@gabina
Copy link
Member Author

gabina commented Feb 20, 2024

Yes, for LiftWingApi and ReferenceCounterApi classes the errors count is sent to the ApiErrorHandling.log_error through the sentry_extra[:error_count] parameter.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants