Missing notifications and widget updates #107

Open
zomars opened this issue Feb 26, 2024 · 16 comments
Labels
bug (Something isn't working), in progress

Comments

zomars commented Feb 26, 2024

Please give a detailed description of the issue you’re experiencing or the feedback you’d like to provide.
Feel free to attach any relevant screenshots or logs, and please keep the app version and device info in the issue!

App Version: 3.2.2
Device: iPhone 12 mini
OS: iOS 17.2.1

To replicate:

  • Enable notifications
  • Disable from system settings
  • Re-enable them
  • Notice how notifications stop working even after resubscribing

mfts commented Mar 5, 2024

Same here.

I haven't been able to narrow it down yet. But I noticed that the status in Zeitgeist doesn't update from "Building" to "Deployed" even though the build has already concluded.

I think this state change is what normally triggers the notifications, so if the state never changes, no notification is triggered.
If you open the app, the build state is updated, but no notification is triggered because opening the app resets the notification listener.

Normally builds take around 3 minutes, but the app still says "Building" after 4+ minutes, and if I didn't open the app it would keep saying "Building".
[Image: IMG_6390]

  • App Version: 3.2.2
  • Device: iPhone15,3
  • OS: iOS 17.3.1

erkage commented Mar 20, 2024

Same here

nerdmax commented Mar 29, 2024

I'm having the same problem. Any update on this issue?

Author

zomars commented Mar 29, 2024

I don't think @daneden is active anymore. Last update was 4 months ago.

Owner

daneden commented Apr 3, 2024

Hey folks! I’m sorry for the delay in response and lack of activity on Zeitgeist lately. To be totally transparent and frank, I work a full-time job and have parenting responsibilities outside of that, and my side projects (including Zeitgeist) come after those priorities.

Vercel regularly updates their APIs which sometimes causes bugs or lapses in functionality in the app. When these changes are combined with Apple’s APNS (the notification delivery system, which has very stringent requirements and may throttle or disable notifications under different circumstances), it can make debugging and identifying root causes challenging.

I really appreciate the details people have already added to this issue and any additional information you can provide will make it much easier for me to debug and fix this issue quickly once I have some spare time to do it.

Thanks so much for your patience and understanding, and for being Zeitgeist users!

mfts commented Apr 5, 2024

No need to apologize @daneden. Fully aware this is just a hobby project and I appreciate your effort a lot.

Author

zomars commented Apr 5, 2024

We understand, and I agree with @mfts's sentiment about this being a hobby project. Still, in order to use the notification features you need to pay for them, which IMO puts some responsibility on maintaining the feature. I know this project may not provide as much as your full-time job, but I would love for the incentives to be aligned so it works for both you and your users/customers.

Owner

daneden commented Apr 6, 2024

Hey folks, I’m spending some time this weekend investigating this issue and wanted to provide some transparency (and solicit some collective brainpower!) into the debugging process. Let's start with how Zeitgeist’s notifications are designed to work.

flowchart LR
    A(Vercel) -->|Webhook| B(Zeitgeist Server)
    B -->|APNs Payload| C(APNS Server)
    C -->|APNs Background Push| D(Device)
    D -->|Background update| E(Widgets and UI)
    D --> F{Notifications enabled for project?}
    F -->|Yes| G(Send notification)
    F -->|No| H(Suppress notification)

As far as I can tell, everything from the webhook through to the APNs delivery is working as expected. I have error logging on Zeitgeist’s server, which acts as the transport layer between Vercel’s webhooks and APNs. These logs haven’t shown any issues with delivering notifications to APNs, so the next possible failure stage is within APNs itself.

[Image: Screenshot 2024-04-06 at 13 49 48]

As you can see, over the last week, APNs sent over 100k notifications. This number is much higher than the actual number of devices that can receive a notification, because APNs device IDs expire over time, or devices may become unused, or the application uninstalled. Here's a table breaking down the delivery status of notifications over the last 7 days:

| Status | 2024-03-29 | 2024-03-30 | 2024-03-31 | 2024-04-01 | 2024-04-02 | 2024-04-03 | 2024-04-04 |
|---|---|---|---|---|---|---|---|
| Received by APNs | 19679 | 8280 | 7616 | 14264 | 21938 | 23126 | 22404 |
| Delivered to Device | 2348 | 807 | 811 | 1845 | 2564 | 2268 | 2191 |
| Delivered to Device (From Storage) | 376 | 155 | 135 | 247 | 429 | 410 | 442 |
| Stored - Device Offline | 10782 | 4828 | 4007 | 7496 | 11940 | 12406 | 13004 |
| Stored - Power Considerations | 891 | 232 | 176 | 349 | 919 | 991 | 1146 |
| Discarded - Token Unregistered | 627 | 83 | 138 | 632 | 989 | 1657 | 1282 |
| Discarded - Token Unregistered (From Storage) | 7 | 1 | 4 | 4 | 5 | 5 | 10 |
| Discarded - Expired | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| Discarded - Disabled | 4616 | 2188 | 2336 | 3690 | 5044 | 5358 | 4312 |
| Discarded - Disabled (From Storage) | 17 | 7 | 4 | 8 | 9 | 13 | 11 |

Let's look at March 29th as an example:

sankey-beta

Received by APNs,Delivered to Device,2348
Received by APNs,Delivered to Device (From Storage),376
Received by APNs,"Stored - Device Offline",10782
Received by APNs,"Stored - Power Considerations",891
Received by APNs,"Discarded - Token Unregistered",627
Received by APNs,"Discarded - Disabled",4616
Received by APNs,Other Status,24

Apple has some helpful documentation on how to interpret these delivery statuses.

The main groupings to look at here are Delivered to Device and Discarded - Disabled: these are basically devices with notifications enabled and disabled respectively. (“Stored - Device Offline” obviously makes up a large chunk of these notifications, but these are typically “unexpired” device tokens i.e. devices which had their notification token reset after an OS update, reset, or other system change)

Now, I don’t log the number of people who turn on notifications for deployments, but this more or less aligns with my expectations (that only about ⅓ of people turn on notifications for any projects).
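
As a rough check of that "about ⅓" estimate, the March 29th figures from the table can be combined as in the sketch below. It assumes that "Delivered" and "Discarded - Disabled" pushes stand in for devices with notifications enabled and disabled, as described above, and it leaves out stored and unregistered pushes because their final state is not known. The counts are pushes rather than devices, so this is only an approximation.

```swift
import Foundation

// Delivery counts for 2024-03-29, copied from the table in this thread.
let delivered = 2348 + 376   // Delivered to Device + Delivered to Device (From Storage)
let disabled = 4616 + 17     // Discarded - Disabled + Discarded - Disabled (From Storage)

// Share of pushes with a known outcome that reached a device with notifications enabled.
let resolved = delivered + disabled
let enabledShare = Double(delivered) / Double(resolved)

print(String(format: "enabled: %.1f%%", enabledShare * 100))  // prints "enabled: 37.0%"
```

At about 37%, the result is in the same range as the one-third estimate.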

This brings me to Zeitgeist’s handling of received background notifications. The handler is defined here:

    @discardableResult
    func handleBackgroundNotification(_ userInfo: [AnyHashable: Any]) async -> RemoteNotificationResult {
        print("Received remote notification")

        // Stage 1: refresh widget timelines.
        #if canImport(WidgetKit)
        WidgetCenter.shared.reloadAllTimelines()
        #endif

        // Stage 2: post the payload to the app UI so open views can refresh.
        await DataTaskModifier.postNotification(userInfo)

        // Stage 3: decide whether to show a local notification.
        do {
            let title = userInfo["title"] as? String

            guard let body = userInfo["body"] as? String else {
                throw ZPSError.FieldCastingError(field: userInfo["body"])
            }

            guard let projectId = userInfo["projectId"] as? String else {
                throw ZPSError.FieldCastingError(field: userInfo["projectId"])
            }

            let deploymentId: String? = userInfo["deploymentId"] as? String
            let teamId: String? = userInfo["teamId"] as? String
            let userId: String? = userInfo["userId"] as? String

            // Ignore events for accounts that are not signed in on this device.
            guard let accountId = teamId ?? userId,
                  Preferences.accounts.contains(where: { $0.id == accountId }) else {
                return .noData
            }

            let target: String? = userInfo["target"] as? String

            guard let eventType: ZPSEventType = ZPSEventType(rawValue: userInfo["eventType"] as? String ?? "") else {
                throw ZPSError.EventTypeCastingError(eventType: userInfo["eventType"])
            }

            // Respect the per-project, per-event notification preferences.
            guard NotificationManager.userAllowedNotifications(
                for: eventType,
                with: projectId,
                target: VercelDeployment.Target(rawValue: target ?? "")
            ) else {
                print("Notification suppressed due to user preferences")
                return .newData
            }

            let content = UNMutableNotificationContent()

            // Without a title, the body is promoted to the title.
            if let title = title {
                content.title = title
                content.body = body
            } else {
                content.title = body
            }

            if notificationEmoji {
                content.title = "\(eventType.emojiPrefix)\(content.title)"
            }

            content.sound = .default

            // The thread identifier controls how notifications are grouped.
            switch notificationGrouping {
            case .account:
                content.threadIdentifier = teamId ?? userId ?? "accountForProject-\(projectId)"
            case .project:
                content.threadIdentifier = projectId
            case .deployment:
                content.threadIdentifier = deploymentId ?? projectId
            }

            content.categoryIdentifier = eventType.rawValue
            content.userInfo = [
                "DEPLOYMENT_ID": "\(deploymentId ?? "nil")",
                "TEAM_ID": "\(teamId ?? "-1")",
                "PROJECT_ID": "\(projectId)",
            ]

            let notificationID = "\(content.threadIdentifier)-\(eventType.rawValue)"
            // A nil trigger delivers the notification immediately.
            let request = UNNotificationRequest(identifier: notificationID, content: content, trigger: nil)
            print("Pushing notification with ID \(notificationID)")
            try await UNUserNotificationCenter.current().add(request)
            return .newData
        } catch {
            switch error {
            case let ZPSError.FieldCastingError(field):
                print(field.debugDescription)
            case let ZPSError.EventTypeCastingError(eventType):
                print(eventType.debugDescription)
            default:
                print("Unknown error occurred when handling background notification")
            }

            print(error.localizedDescription)
            return .failed
        }
    }

The handler has three main stages:

  1. Refresh widget timelines
  2. Post a notification to the app UI, refreshing views where applicable (this is what allows realtime updates in project and deployment views)
  3. Check if the notification should be shown based on user preferences, and if so, deliver a notification to the user

Now I’ll be honest, I’ve experienced this notification bug too, and the really frustrating thing is, every time I try to investigate, I can’t reproduce it, because debugging the app by running it locally on my device seems to “fix” the issue straight away, and reinstalling the App Store version puts it back into a broken state.

The only potential regression I can think of that may be causing this bug is switching from a synchronous to an asynchronous version of UIApplicationDelegate’s didReceiveRemoteNotification function. I’m going to test a version that uses the synchronous version and will report back!

Owner

daneden commented Apr 10, 2024

Hey folks, I just wanted to provide a quick update here!

Unfortunately my attempted fix doesn’t seem to have resolved the issue or changed the symptoms. I’m trying to debug the issue using Apple’s APNs console, but first need to be able to get notification IDs, which in turn requires some help from the maintainers of node-apn/node-apn#733.

Sending some notifications manually using the APNs console works perfectly, and as described above my notification-sending service doesn’t report any problems, so I’ll also work on adding some analytics to my TestFlight build to try and do some reporting on how the local notification-sending function is performing.

mfts commented Apr 11, 2024

I noticed that when using a widget the widget build status was not updated: still shows "building" instead of "deployed" (see my earlier message above)

So the notifications not firing in prod may be connected to how the build statuses are updated in the app.

Owner

daneden commented Apr 11, 2024

@mfts Yeah this is not surprising; if you see my diagram above you'll see that notifications and widgets are both updated by the same remote background notification. So there's either something wrong with the delivery of the background notification from APNs (which I'll need help from node-apn to debug), or a problem with how the background notification is processed on-device (more likely, and will need me to put some analytics on-device to debug)

mfts commented Apr 11, 2024

Happy to turn on telemetry on my end when you ship it

Owner

daneden commented Apr 12, 2024

Hey folks! I think I’ve managed to complete my investigation and identify the main cause of this issue: rate limiting.

iOS imposes an undisclosed rate limit on silent/background notifications. The guidance is vague, but suggests limiting silent notifications to just 2 to 3 per hour.
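
To make the constraint concrete, here is a minimal sketch of a per-device budget that caps background pushes inside a sliding one-hour window. It is an illustration only, not Zeitgeist's actual server code: the `BackgroundPushBudget` type, its method names and the default of 2 per hour are all assumptions, chosen to match the guidance quoted above.

```swift
import Foundation

/// Hypothetical per-device budget for background pushes (illustrative only).
struct BackgroundPushBudget {
    let maxPerWindow: Int
    let window: TimeInterval
    private var sent: [String: [Date]] = [:]

    init(maxPerWindow: Int = 2, window: TimeInterval = 3600) {
        self.maxPerWindow = maxPerWindow
        self.window = window
    }

    /// Returns true and records the send if the device is still under budget.
    mutating func shouldSend(to deviceToken: String, at now: Date = Date()) -> Bool {
        // Keep only the sends that still fall inside the sliding window.
        let recent = (sent[deviceToken] ?? []).filter { now.timeIntervalSince($0) < window }
        guard recent.count < maxPerWindow else {
            sent[deviceToken] = recent
            return false
        }
        sent[deviceToken] = recent + [now]
        return true
    }
}

var budget = BackgroundPushBudget()
let start = Date(timeIntervalSince1970: 0)
let first = budget.shouldSend(to: "device-a", at: start)
let second = budget.shouldSend(to: "device-a", at: start.addingTimeInterval(60))
let third = budget.shouldSend(to: "device-a", at: start.addingTimeInterval(120))
let later = budget.shouldSend(to: "device-a", at: start.addingTimeInterval(3700))
print(first, second, third, later)  // prints "true true false true"
```

A budget like this only keeps the sender within the suggested rate. Events that arrive while a device is over budget are dropped here, so anything that depends on them still goes stale.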

This poses some challenges for Zeitgeist for two reasons:

  1. Background notifications are what allow the widgets to remain fresh/up-to-date even when users haven’t enabled alerts from the app
  2. Having all events sent as background notifications allows users to only opt in to notifications for projects/events they want to receive. For example, I have 20+ projects, most with dependabot regularly pushing updates. Zeitgeist receives background updates for all these projects and deployment events, but I only want to receive alerts for 1 or 2 projects and can configure that per-project

As far as I can tell, the only way I can preserve (2) above is by storing user preferences for notification settings on a remote server and switching from remote background to remote alert notifications, which:

  1. Introduces a new network dependency, which I don't love
  2. Would cause notification settings to be synced across devices, which may not be desirable from a user perspective
  3. Doesn't solve the problem of keeping widgets up-to-date
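
The trade-off can be sketched as follows, assuming a hypothetical server-side preference set (`subscribedProjects`) and hypothetical helper names (`PushStyle`, `makePush`, `pushStyle(forProject:...)`). The `content-available`, `apns-push-type` and `apns-priority` fields are the standard APNs ones. Everything else is illustrative and not taken from Zeitgeist's server.

```swift
import Foundation

/// Hypothetical model of the two push styles discussed above.
enum PushStyle {
    case background
    case alert(title: String, body: String)
}

/// Builds the JSON payload and APNs request headers for a push style.
func makePush(_ style: PushStyle) -> (payload: [String: Any], headers: [String: String]) {
    switch style {
    case .background:
        // Silent push: no alert or sound; the app is woken to refresh its data.
        let aps: [String: Any] = ["content-available": 1]
        return (["aps": aps], ["apns-push-type": "background", "apns-priority": "5"])
    case let .alert(title, body):
        // Visible push: the system shows it without the app filtering it first.
        let aps: [String: Any] = ["alert": ["title": title, "body": body], "sound": "default"]
        return (["aps": aps], ["apns-push-type": "alert", "apns-priority": "10"])
    }
}

/// Hypothetical server-side check that replaces the on-device preference filter.
func pushStyle(forProject projectId: String, title: String, body: String,
               subscribedProjects: Set<String>) -> PushStyle? {
    subscribedProjects.contains(projectId) ? .alert(title: title, body: body) : nil
}

let subscribed: Set<String> = ["prj_important"]
let chosen = pushStyle(forProject: "prj_important", title: "Deployed",
                       body: "my-site is live", subscribedProjects: subscribed)
let skipped = pushStyle(forProject: "prj_dependabot", title: "Deployed",
                        body: "deps bumped", subscribedProjects: subscribed)

if let chosen = chosen {
    print(makePush(chosen).headers)
}
print(skipped == nil)
```

In this sketch, events for projects without a subscription produce no push at all, which is why the widget-freshness problem in point 3 remains.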

I’m going to post about this topic in the Apple Developer forums to try and seek the advice of any Apple engineers or seasoned iOS developers. I’ll post the link to the forum post here when I get around to it!

Owner

daneden commented Apr 22, 2024

For those of you with a relatively low number of projects/deployments (i.e. typically fewer than 1 deployment per hour), your notifications and widgets should be working again once you update to v3.2.3 from the App Store. I’ve adjusted the code both on the app and servers to honour Apple's guidelines until I have a solid plan for overcoming their opaque constraints. Thanks for your patience with this issue, everyone!

daneden added the "bug" and "in progress" labels on Apr 22, 2024
daneden changed the title from "Notifications stopped working" to "Missing notifications and widget updates" on Apr 22, 2024
@steveDvlpr
@daneden I only have a few projects hosted on Vercel with new builds happening very rarely. If you need someone to try new features or debug existing ones, please feel free to send me an invite for TestFlight. Thanks!

mfts commented Apr 25, 2024

I can confirm that the fix works for me. Every start/finish build notification arrives. Only tracking a single project

6 participants