Router: Ensure FCM 502 Errors are being properly handled. #444

Open
data-sync-user opened this issue Sep 26, 2023 · 2 comments

@data-sync-user
Collaborator

data-sync-user commented Sep 26, 2023

FCM can return a 502 error, which we currently log as a Sentry error. This can cause a subsequent JSON decoding error if the 502 response body is not valid JSON (which appears to happen frequently).

https://mozilla.sentry.io/issues/4552231261/?environment=prod-gcp&environment=prod&query=is%3Aunresolved+rust.name%3Arustc&referrer=issue-stream&stream_index=2

We should isolate the BAD_GATEWAY response, skip decoding its payload, and report the error back to the endpoint as a 502 with RETRY.
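A minimal sketch of that handling, assuming a status code and raw body are available before any decoding happens. `RouterError`, `handle_fcm_response`, the `decode` callback, and the 60-second retry hint are all hypothetical names and values chosen for illustration; they are not taken from the autopush-rs code.

```rust
/// Hypothetical error type for illustration only.
#[derive(Debug, PartialEq)]
enum RouterError {
    /// Upstream (FCM) returned 502; the caller should report a 502 with a retry hint.
    UpstreamBadGateway { retry_after_secs: u64 },
    /// The body of a non-502 response could not be decoded.
    InvalidResponse(String),
}

/// Checks the status before touching the body, so a 502 with a
/// non-JSON payload never reaches the decoder.
fn handle_fcm_response<T>(
    status: u16,
    body: &str,
    decode: impl Fn(&str) -> Result<T, String>,
) -> Result<T, RouterError> {
    if status == 502 {
        // Assumed retry interval; the real value would be a policy decision.
        return Err(RouterError::UpstreamBadGateway { retry_after_secs: 60 });
    }
    // Only non-502 responses are decoded; failures here are a separate error.
    decode(body).map_err(RouterError::InvalidResponse)
}
```

With this ordering, a 502 is classified by status alone, and "Unable to deserialize FCM response" would only be reported when a non-502 body is genuinely malformed.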

Sentry Issue: AUTOPUSH-RS-3X

┆Issue is synchronized with this Jira Bug

@froodian

froodian commented Oct 9, 2023

Any info or update on this issue? We've seen an ongoing material rise in persistent (not resolving on retry) Web Push 502s to Android Firefox users, with response bodies like:

{"code":502,"errno":null,"error":"Bad Gateway","message":"Unable to deserialize FCM response","more_info":"http://autopush.readthedocs.io/en/latest/http.html#error-codes"}

@jrconlin
Member

Sorry for the late reply.

Starting Sept 22, we had an incident in which we could no longer send messages to Android users via the old Google Cloud Messaging (GCM) network. This would impact users who may have created very old endpoints using Firefox for Android (Fennec), which was discontinued several years ago.

On Oct. 05, we deployed a "canary" fix, which addressed the bulk of the issue. Because of the weekend and holiday, we held off on deploying the fix to the larger server population, but that version should be widely deployed now (Oct 10).

@data-sync-user data-sync-user changed the title Router: Unable to deserialize FCM response Router: Ensure FCM 502 Errors are being properly handled. Apr 22, 2024