resolve potential data race in replaceNotification on batched edit notifications? #5764
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
First time contributor checklist
Contributor checklist
... to the best of my knowledge
I have tested my contribution on these devices:I have not tested changes because the situation is a hypothetical and/or difficult to reproduce data race.
That said, whereas the whole effect of the change is re-ordering two occurrences of 2 sequential lines of code to an order more logical, I would expect this to:
UNLESS a more experienced iOS programmer knows a specific reason for the extant order to be preferred
(and if that is the case, I would suggest that a
// XXX:
comment or similar is in order to explain the less-intuitive order)Description
It seems presently possible that a notification can fail-to-cancel & end up left [near-]duplicated instead, as a result of precise unfortunate timing wherein:
transits pending → delivered
removeDeliveredNotifications(withIdentifiers:)
removePendingNotificationRequests(withIdentifiers:)
which appear presently in that order, in error?
I have not been able to come up with any1 reasoning for this order to be preferred.
Context suggests that previous touching authors (in de61926 and af6ff96 ) errantly believed2 order to be irrelevant (owing to claims of synchronous action that are not met by the underlying iOS API)
My suspicion there could be a good, unspecified reason for this order obscure to me is further weakened by noticing that the analogous calls to
removeAll{Delivered,Pending}Notifications
happen in [this PR's / the intuitively-safer] pending-then-delivered order.That has been the case for the
removeAll*
calls since introduction to present, to wit:as introduced in 1bfe691
Because of the many files involved, and github's collapsing thereof, it seemed worth detailing my blame-findings that
➕ these lines:
Signal-iOS/Signal/src/UserInterface/Notifications/UserNotificationsAdaptee.swift
Lines 174 to 178 in 1bfe691
were swift introduced in this file expanded greatly/replaced in this commit
➖ apparently ultimately replacing this Objective-C:
Signal-iOS/Signal/src/environment/NotificationsManager.m
Lines 498 to 503 in 3123842
from files removed
(& where 3123842 is
1bfe691~1
3)❓ that said I don't fully know what to make of the loop-y swift version:
Signal-iOS/Signal/src/UserInterface/Notifications/LegacyNotificationsAdaptee.swift
Lines 207 to 212 in 1bfe691
introduced at the same time
(^this author hasn't touched the
withIdentifiers:
lines where they live now—and I thought that represented a preference matching my intuition w/o contradiction—but *sigh* in fact, I've belatedly realized they did introduced mismatched delivered-then-pending usage as
cancelNotification
only 23 lines up same file in same commit 🤔left this (inconsistent) way during aforementioned refactor of af6ff96
In particular,
I believe this race may on occasion be actually occurring when e.g. multiple edits to the same message are being (belatedly) batch processed back-to-back
(after re-establishment of interrupted network connection;
or possibly mere sleep and resume of envelope processing per iOS Background App refresh)
(Except for) hoisting this out for emphasis —and also because apparently the callout parsing is suppressed inside a
<details>
even where other markdown is processed (and I don't think there is even effective html option to get it back?)Important
In particular, a friend with whom I frequently converse informed me that this claim [to wit "edits… don't generate push notifications ... at least"] was wrong that they were:
... This, with foreknowledge of further investigation suggests that this race (or similar, such as Apple simply not completing requested cancellation for other reasons) is occurring, not just hypothetical?
I went ahead and squashed this into a
<details>
once I realized how already also long my (originally a ) TL;DR had become.But it should answer any questions of why I am proposing this (outside of ~mismatch to intuition)
Full Motivation & relevant code path
Oops I was push-spamming actually…
I am an inveterate serial-editor. Both, as the feature is probably mainly intended, for fixing typos and such. AND as a matter of more dramatically rephrasing or expanding yet-unread messages before recipient's first read receipt fires
This last practice was one undertaken intentionally under presumption that (as informed by experience with e.g. Discord, Slack, Facebook-comments, reddit comments, etc) by making my expansions thusly I was avoiding spamming recipient with additional push notifications
(if they were not looking at Signal, but within earshot of device or had it in pocket, etc)
Earlier today, I learned this was (sort of) not true! 😱
<part that="is important-ed above">
In particular, a friend with whom I frequently converse informed me that this claim [to wit "edits… don't generate push notifications ... at least"] was wrong that they were:
... This, with foreknowledge of further investigation suggests that this race (or similar, such as Apple simply not completing requested cancellation for other reasons) is occurring, not just hypothetical?
</part>
Embarrassed at being (even) noisier than I thought I was, I decided to dive down this rabbit hole.
I discovered that indeed, the logic is explicitly to generate notifications only for edits of unread but previously delivered messages (in opposition of my intuition) here:
Signal-iOS/SignalServiceKit/src/Messages/MessageReceiver.swift
Lines 1638 to 1646 in e6ba17a
(assuming indeed that multiple dispatch makes this the path for the definition from L1591 always taken into
handleIncomingEnvelope
; as it stands I do not understand when that from L1535 would instead be run - but whereas it does not clearly generate notifications under any circumstance, it is not at issue?)…or was I?
(yes, but maybe not as bad as I briefly thought)
Now my embarrassment was somewhat relieved when I looked into the rest of the origin commit (66d7592) for that block / the present
main
pieces of that code pathand discovered:
1. at least I'm spamming quietly?
Stepping directly in from that call to
notifyUser
, then skipping over the first 117 lines that follow (including early descent into relevantnotifyUserInternal
), we come eventually on:Signal-iOS/SignalMessaging/Notifications/AppNotifications.swift
Lines 685 to 698 in e6ba17a
with line 686's suppression of noise for this notification at least... maybe4 comforting
and
2. the code clearly intends to avoid doing notification-spam
Picking up from the next line and continuing to end of function:
Signal-iOS/SignalMessaging/Notifications/AppNotifications.swift
Lines 699 to 714 in e6ba17a
its fairly clear that the intention is explicitly to replace the notification with one reflecting any edit (ironic in my particular case where only rarely has what prefix would fit in the push's preview actually changed) and not to generate an additional redundant notification
in particular things seem to prefer to drop (re-)queueing a revised notification entirely over creating duplication (under conditions where
replaceNotification
doesn't, to its knowledge, succeed - by for example failing to locate a previous version of the message(as might occur if e.g. user dismissed it but hasn't opened Signal? - in which case it is especially correct to not seemingly "grow back")
... I'm pretty sure, at least
redundancy "should" be impossible
Taking a gander at
replaceNotification
:Signal-iOS/SignalMessaging/Notifications/UserNotificationsPresenter.swift
Lines 388 to 396 in e6ba17a
we are given something of a false assurance by the name
cancelSync
, a function whichproceeds for a hefty-ish 66 lines,
Signal-iOS/SignalMessaging/Notifications/UserNotificationsPresenter.swift
Lines 454 to 520 in e6ba17a
... but as best as I've noticed is up until the last 2 lines, reordered in this PR, simply staking out early returns for various edge cases, which reifying an edit is not
meanwhile the simpler function,
cancelNotificationSync
is proceeded with a reassuring comment
// This method is thread-safe.
...except for the lies (asynchronous system API)
So, I think it's clear that these functions basically claim and their callers expect that by return from
cancelSync
the notification has been canceled.But these are synchronous only so far as Signal has control of the situation.
Take a look at the docs for the functions and you'll find:
removeDeliveredNotifications(withIdentifiers:)
https://developer.apple.com/documentation/usernotifications/unusernotificationcenter/1649500-removedeliverednotificationssays (emphasis added by me):
Parameters
identifiers
An array of
NSString
objects, each of which corresponds to a value in theidentifier
property of aUNNotificationRequest
object. This method ignores the identifiers of requests whose notifications are not currently displayed in Notification Center.Discussion
Use this method to selectively remove notifications that you no longer want displayed in Notification Center. The method executes asynchronously, returning immediately and removing the specified notifications on a background thread.
whilst
removePendingNotificationRequests(withIdentifiers:)
https://developer.apple.com/documentation/usernotifications/unusernotificationcenter/1649517-removependingnotificationrequestssays (again, emphasis added by me):<> says:
Parameters
identifiers
An array of
NSString
objects, each of which corresponds to a value in theidentifier
property of aUNNotificationRequest
object. If the identifier belongs to a non repeating request, and the trigger condition for that request has already been met, this method ignores the identifier.Discussion
The method executes asynchronously, returning immediately and removing the specified notifications on a background thread.
It's thus clear these are very much not synchronous removals
...and the blind spot (data race)
While it would not shock me if in some cases Apple simply decides that a requested removal is not actually very important (to the end at least of letting any error stand without retry, and definitely no (facilitation for) notification of the caller; but perhaps also like - battery save state, CPU load etc),
if that is not the case (and apple will be very diligent) this mismatch of
cancelSync(
named code and async-proceeding reification shouldn't actually ever result in redundant notifications like my friend reportedbut with the current order of calls, it does seem plausible that you could slip through the situation where:
removeDeliveredNotifications
, that there "The method executes asynchronously, returning immediately and removing the specified notifications on a background thread." still-so-far accurately describes the situation (cause in fact the notification is still pending, never delivered)removePendingNotificationRequests
, its now the case that "the trigger condition for that request has already been met, [so] this method ignores the identifier" (that you are now talking about a delivered notification, never canceled)That seems... like a tight timing constraint to come out that way, so I'm kinda still inclined to think maybe sometimes Apple de-prioritizes such a retraction to filth, but ... as mentioned top level, the batch-processing scenarios (e.g. upon network reconnect with multiple pending edits to same message) seem at least plausible
but maybe only if you busywait?
If in fact it's all about the data race, then this (grossly over-explained) swapping of lines ought to suffice to eliminate it. And I don't think it could hurt regardless (though again, a more experienced iOS programmer may disagree for reasons yet obscure)
But if the "lazy Apple" scenario is on the table, or if indeed you just really want to deliver on the
Sync
promise of the cancel function names (and the past tense of thedid
indidReplaceNotification
), then I think you need to actually check that the notification is gone nowas a busy-wait loop over (ideally 1 but possibly multiple) calls to
getDeliveredNotifications(completionHandler: )
and an O(N) traversal of the present set ; ditto maybegetPendingNotificationRequests(completionHandler:)
(or you can perhaps wait for it to be first delivered, since I don't think Signal is in the habit of scheduling in the later-than-ASAP future?)This feels ... wasteful, so I understand why you don't already do it (beyond possibly incorrectly assuming its unneeded)
but maybe there are other options?
falling out of the bottom of the
<details>
a better way?
by the time you've reached
cancelSync
, you already didgetNotificationsRequests
which did exactly one turn each of those O(N) loops overgetDeliveredNotifications(completionHandler: )
andgetPendingNotificationRequests(completionHandler:)
you are nominally holding a
UNNotificationRequest
, which might be for a genuinely pending notification, or may have been.request
ed off of a bonafideUNNotification
briefly held in passingeither way, you proceed to (peek at
.content.userInfo
to differentiate edge cases, but by the time we reach an attempted cancel), reduce down to theNSString
that is.identifier
targeted observation?
you held a reference to the actual notification (request) object.
Idk if in modern iOS this is a pass-by-value clone or literally a pass-by-reference handle/delegate on the notification as it exists presented in Notification Center
broadly my impression of Apple's approach in the past is that it might be the latter? (though conformance to `NSCopying may say no?)
So is there something in that that is observable? Does anything happen to it if you had a handle and the async piece of the
remove*(withIdentifiers: )
have completed? Perhaps (but perhaps not):.date
ofUNNotification
gets blanked or set to epoch?These sorts of things could at least avoid repeated O(N) fetch for a "make sure its actually gone"; also could potentially duck any busy waiting if there's something there (or indeed up on the
UNUserNotificationCenter
forNSKeyValueObserving
to target more actively?)differentiated handling?
.trigger
The wording is confusing, but it would appear possibly that the
.trigger
of the request coming into cancelSync actually should tell you whether you need to callremoveDeliveredNotifications
ORremovePendingNotificationRequests
about it, yes?per that page's
Idk if that means it actually changes or just you know ... is still there
But also regardless the flattening between the two happening in
getNotificationsRequests
doesn't have to happen per se.content
As re my specific concern about excess, and indeed potentially unnecessary re-delivery for edits, there is definitely a chance to look into the
body
of the notification to see whether the edit actually changed it? (presuming either that there is actually any truncation of what's given there, or a windowing heuristic is applicable?)actual modification?
not for nothing, and I'm sure there are relevant limitations and/or overhead but I do notice that
UNMutableNotificationContent
is a thingas well as this whole system(s): https://developer.apple.com/documentation/usernotifications/modifying-content-in-newly-delivered-notifications / https://developer.apple.com/documentation/usernotifications/unnotificationcontent/updating(from:)
so it seems possible that in place modification is not out of the question?
Footnotes
outside of code style preference for alphabetical order iff its equivalent ↩
They might also have known it to be irrelevant due to other features of previous versions of the code since removed. I didn't find those, but I didn't do an exhaustive search and commit messages perhaps hint so ↩
though e.g.
github.com/blob/${fullsha}~1/Signal/…
is a valid URL, it apparently fouls whatever regex or parser that grants inline-embed rendering to permalinks ( and as must be pretty strictly/[0-9a-f]{n,m}/
as it would not be tricked by/tolerate urlencodes [^
=%5e
or~
=%7e
] even with tricksy commensurate reduction of sha prefix-length). Granted idk that the renderer wouldn't correct back to showing3123842
even if it did work per these urls ↩whereas my friends are mostly phone-always-on-silent types, I hope, and have yet to comfort myself with tests or more code reading, that "silence" extends to vibration/haptics and e.g. popping up on an apple watch, vibrating wrist. ↩