Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

autocomplete: Sort user-mention autocomplete results #608

Open
wants to merge 4 commits into
base: main
Choose a base branch
from

Conversation

sm-sayedi
Copy link
Collaborator

@sm-sayedi sm-sayedi commented Apr 2, 2024

In @-mention autocomplete users with whom recent DMs are exchanged are given priority.

Other criteria will be added in the following PRs.

Fixes part of: #228

@sm-sayedi sm-sayedi force-pushed the issue-228-sort-user-mention-results branch from 20fefe8 to cefaf97 Compare April 2, 2024 01:46
@sm-sayedi
Copy link
Collaborator Author

sm-sayedi commented Apr 2, 2024

The code may still have some bugs, and it is unclean, but it would be great to know if I am on the right path for this problem!

And I will be more than happy to have feedback on my draft proposal.

@sm-sayedi sm-sayedi force-pushed the issue-228-sort-user-mention-results branch from cefaf97 to 7141342 Compare April 2, 2024 01:53
Copy link
Member

@gnprice gnprice left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure. Here's some high-level comments.

And I will be more than happy to have feedback on my draft proposal.

Ah indeed. 🙂 Replied there.

lib/model/store.dart Outdated Show resolved Hide resolved
lib/model/store.dart Outdated Show resolved Hide resolved
lib/model/store.dart Outdated Show resolved Hide resolved
lib/model/store.dart Outdated Show resolved Hide resolved
lib/model/autocomplete.dart Outdated Show resolved Hide resolved
lib/model/autocomplete.dart Outdated Show resolved Hide resolved
@sm-sayedi sm-sayedi force-pushed the issue-228-sort-user-mention-results branch from 7141342 to b419e14 Compare April 4, 2024 17:17
@sm-sayedi
Copy link
Collaborator Author

Thanks for the review! Revision pushed, but this time changed the whole implementation to almost match it with the Web app.

lib/model/autocomplete.dart Outdated Show resolved Hide resolved
Copy link
Member

@gnprice gnprice left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! Comments below.

lib/model/autocomplete.dart Show resolved Hide resolved
lib/model/autocomplete.dart Outdated Show resolved Hide resolved
lib/model/autocomplete.dart Outdated Show resolved Hide resolved
lib/model/autocomplete.dart Outdated Show resolved Hide resolved
lib/model/recent_dm_conversations.dart Outdated Show resolved Hide resolved
lib/model/recent_dm_conversations.dart Outdated Show resolved Hide resolved
lib/model/recent_dm_conversations.dart Outdated Show resolved Hide resolved
lib/model/store.dart Outdated Show resolved Hide resolved
@sm-sayedi
Copy link
Collaborator Author

Thanks for the review! Pushed the changes.

When you think it's ready, please let me know so I proceed with writing tests.

@sm-sayedi sm-sayedi force-pushed the issue-228-sort-user-mention-results branch from 981de92 to 22953d3 Compare April 5, 2024 07:48
lib/model/autocomplete.dart Outdated Show resolved Hide resolved
Copy link
Member

@gnprice gnprice left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

This strategy looks good; see comments below. I didn't cover all the points that I might comment on if doing a full review, but I tried to cover all the data structures and algorithms — I think all the data structures and algorithms here are basically what we want modulo these comments.

I'll be on vacation after today, so @chrisbobbe will review the next few revisions. There's a significant amount of complexity here, so he'll probably have plenty of comments to make about code style, dartdoc, tests, potentially also some of the data structures and algorithms, and likely also some correctness details I'd missed. He might merge the PR while I'm away, if he feels it's ready, and might flag some questions he'd like my input on. (Potentially both — we can always revise the code later.)

In any case I'll look forward to seeing when I'm back a version of this PR that's been polished up and is either merged, or nearly ready to merge 🙂

lib/model/autocomplete.dart Outdated Show resolved Hide resolved
lib/model/autocomplete.dart Outdated Show resolved Hide resolved
lib/model/autocomplete.dart Outdated Show resolved Hide resolved
lib/model/recent_dm_conversations.dart Outdated Show resolved Hide resolved
lib/model/recent_dm_conversations.dart Outdated Show resolved Hide resolved
lib/model/recent_senders.dart Outdated Show resolved Hide resolved
lib/model/autocomplete.dart Outdated Show resolved Hide resolved
lib/model/autocomplete.dart Outdated Show resolved Hide resolved
@chrisbobbe
Copy link
Collaborator

chrisbobbe commented Apr 9, 2024

I'll be on vacation after today, so @chrisbobbe will review the next few revisions.

Yep, happy to review! 🙂 I see this PR is currently marked as a draft, so I'll plan to review it once it's marked as ready for review and you've addressed Greg's feedback above.

@sm-sayedi sm-sayedi force-pushed the issue-228-sort-user-mention-results branch from 22953d3 to 804287b Compare April 16, 2024 19:21
@sm-sayedi sm-sayedi marked this pull request as ready for review April 16, 2024 19:30
@sm-sayedi
Copy link
Collaborator Author

Thanks @gnprice for the review!

Revision pushed with the feedback addressed @chrisbobbe! Ready for your review now!

@sm-sayedi sm-sayedi force-pushed the issue-228-sort-user-mention-results branch from 804287b to 5508672 Compare April 16, 2024 19:46
if (_ids.isEmpty) {
_ids.add(id);
} else {
int i = _ids.possibleIndexOf(id);
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of using List.indexOf or List.indexWhere which uses the linear search, we could take advantage of the fact this list is sorted and use binary search to find the index.

_ids.add(id);
} else {
int i = _ids.possibleIndexOf(id);
if (i >= 0) { // the [id] already exists, so do not add it.
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I preferred this way of checking for element's presence, instead of using List.contains which again uses the linear search.

low = mid + 1;
}
}
return -low - 1;
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using -low - 1 instead of just -low will differentiate between the case where the element is found at index 0 and the case where the element is not found, but should be placed at the very start of the list.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's explain this in a code comment. 🙂 It's exactly the kind of thing that any reader will be interested in, and most readers will just be looking at the code, without also reading through this PR thread.

edit: hmm, it's not clear in the GitHub UI I'm seeing right now, but this was meant as a reply to your comment #608 (comment)

chrisbobbe added a commit that referenced this pull request Apr 18, 2024
We're considering adding another `Map` field to
RecentDmConversationsView; see PR #608. If we do, this dartdoc will
want this added explicitness.
Copy link
Collaborator

@chrisbobbe chrisbobbe left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, @sm-sayedi! Comments below, and I see there's a test failing in CI. Does that failure reproduce when you run tools/check locally?

The new code will also need new tests.

required User userA,
required User userB,
required int streamId,
// TODO(#493): may turn this into a non-nullable string.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the connection with #493?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed it this time! At first I thought, if the compose box is populated with a topic narrow, we may show the user results based on that topic narrow, even if we're in the stream view. But now looking at the web, it's not the case.

Comment on lines 5 to 10
extension Max<T extends num> on Iterable<T> {
/// Finds the maximum number in an [Iterable].
///
/// Returns null if the [Iterable] is empty.
T? max() => isNotEmpty ? reduce(math.max) : null;
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't seem to be used. Let's simplify by removing it.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for spotting this. It was used in the previous implementation, and I forgot to remove it.

Comment on lines 45 to 46
// The ID of the latest messages exchanged with other users.
final Map<int, int> _dmRecencyData = {};
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: use dartdoc formatting, like the others (so /// instead of //)

Also, can we find a clearer name for this field? In a class named RecentDmConversationsView, the name _dmRecencyData seems redundant, and it also seems less specific than it should be. 🙂

I think we can make the dartdoc more specific as well. The type (Map<int, int>) doesn't tell us if the map is keyed by message IDs or user IDs, since the key and value types are both int. That would be good to make clearer so the reader doesn't have to search for clues elsewhere, like the places where the field gets used. There are also a few other relevant facts that a reader would naturally wonder about.

So for example:

  /// Map from user ID to the latest message ID in any conversation with the user.
  ///
  /// Both 1:1 and group DM conversations are considered.
  /// The self-user ID is excluded even if there is a self-DM conversation.
  ///
  /// (The identified message was not necessarily sent by the identified user;
  /// it might have been sent by anyone in its conversation.)
  final Map<int, int> _latestMessageByRecipient = {};

What do you think?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you! This is a well-descriptive dartdoc. Added it.

Comment on lines 48 to 51
/// The ID of the latest message exchanged with this user.
///
/// Returns -1 if there has been no DM message exchanged ever.
int getRecencyIndex(final int userId) => _dmRecencyData[userId] ?? -1;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about we pull the ?? -1 part out to callers? It seems helpful toward simplifying this function's job, which is already somewhat complex to describe accurately (see my previous comment on _dmRecencyData).

Then, I think it would be helpful to change the name to something more transparent. So for example:

  /// The latest message ID in any conversation with the given user, if any.
  ///
  /// Both 1:1 and group DM conversations are considered.
  /// Gives null for the self-user ID even if there is a self-DM conversation.
  ///
  /// (An identified message was not necessarily sent by the given user;
  /// it might have been sent by anyone in its conversation.)
  int? latestMessageWithRecipient(final int userId) => _latestMessageByRecipient[userId];

idTracker.add(messageId);
}

void processStreamMessage(Message message) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The name makes it seem like a StreamMessage is specifically expected, but the param's type is the more general type Message.

I first noticed this when look at this method's callers, which look odd because they're passing messages that are not known to be StreamMessages.

Could fix by renaming to processMessage, or by narrowing the param's type to StreamMessage instead of Message.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please also add a dartdoc for this method, to make it clear what callers are signing up for when they call it. Maybe we can also find a more specific verb than "process"; I see it calls some helpers addStreamMessage and addTopicMessage; perhaps this should be named processMessage?

Copy link
Collaborator Author

@sm-sayedi sm-sayedi Apr 19, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could fix by renaming to processMessage, or by narrowing the param's type to StreamMessage instead of Message.

Went with processAsStreamMessage. I thought when using processMessage, one would think that EACH message is processed; and when narrowing the param's type to StreamMessage, each caller should always make sure they're passing the correct message type by using a check, for example:

if (message is StreamMessage) {
  recentSenders.processStreamMessage(message);
}

I think it would be cleaner to add this check in the method itself and use a somewhat descriptive method name.
What do you think?

Maybe we can also find a more specific verb than "process".

I think the verb "process" fits good in here as it extracts some data from the message and keeps track of it in a data structure. But I would appreciate a better verb from your side!

Edited: Renamed it to handleMessage to match it with handleMessageEvent declared in model\recent_dm_conversations.dart

low = mid + 1;
}
}
return -low - 1;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's explain this in a code comment. 🙂 It's exactly the kind of thing that any reader will be interested in, and most readers will just be looking at the code, without also reading through this PR thread.

edit: hmm, it's not clear in the GitHub UI I'm seeing right now, but this was meant as a reply to your comment #608 (comment)

Comment on lines 51 to 52
/// A data structure to keep track of message ids.
class IdTracker {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think MessageIdTracker would be a more helpful name for this, since a lot of the nearby code is dealing with user IDs. 🙂

With that name, the dartdoc as written would stand out as redundant/useless, though, wouldn't it…but that could be dealt with by removing it. But if there's anything else that's interesting about this class (other than that it's a data structure that keeps track of message IDs), that could be good to put in the dartdoc.

Comment on lines 261 to 265
if (!userA.isBot && userB.isBot) {
return -1;
} else if (userA.isBot && !userB.isBot) {
return 1;
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This part doesn't seem like it belongs in a method named compareByDms, as Greg pointed out above: #608 (comment)

Comment on lines 436 to 427
final Map<String, String> _namesLowercased = {};

String nameLowercased(String name) {
return _namesLowercased[name] ??= name.toLowerCase();
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reason not to key this by user ID, like _nameWordsByUser is? (If doing that, we should make sure invalidateUser acts appropriately on the lowercased-name data.)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about _lowercasedNameByUser and lowercasedNameForUser, on the pattern of _nameWordsByUser and nameWordsForUser?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reason not to key this by user ID, like _nameWordsByUser is?

Two reasons: a) to not pass the whole User object while we can do the work with just one of its properties. b) there can be multiple users with the same name so we can save the space a little bit by having just one key-value pair for all those user names.
However, this will bring some complexity when adding the appropriate logic in invalidateUser. Let's assume we somehow pass fullName to 'invalidateUser' besides userId to remove the lowercase name. It will remove the user name for all the users that share the same name.
So I am not sure whether to key it by userId or fullName?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I think best to key it by user ID, the same way as _nameWordsByUser.

It's uncommon for users to have the same name, so there's minimal space savings to be had from that, which makes it not worth the complexity of worrying about interactions there.

Relatedly: in nameWordsForUser we're calling toLowerCase on user.fullName, just like this does. So let's have that step share this same cache.

@@ -381,6 +381,7 @@ class MessageListView with ChangeNotifier, _MessageSequence {
for (final message in result.messages) {
if (_messageVisible(message)) {
_addMessage(message);
store.recentSenders.processStreamMessage(message);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you talk a bit about the performance considerations of this?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not hundred percent sure if it's an expensive task or not. I think it does relatively simple task of adding IdTracker to a map two times.

@sm-sayedi sm-sayedi force-pushed the issue-228-sort-user-mention-results branch 2 times, most recently from 31deb52 to 6797b79 Compare April 19, 2024 20:44
@sm-sayedi
Copy link
Collaborator Author

sm-sayedi commented Apr 19, 2024

Thanks for the review! Revision pushed with some comments above. Please have a look.

The tests are failing locally for me too. I was waiting for the overall approach of this feature to be steady to start writing tests for it. I will do it after now, but at the same time, I would appreciate your feedback on the recent revision!

Copy link
Member

@gnprice gnprice left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, here's some more review comments before I move on to other things. Thanks for the revision!

A couple of high-level comments:

  • Let's break this up into a series of commits: adding the recent-senders data, adding the new data structure to recent-DMs, adding a helper like possibleIndexIn or MapNotifier (though those two in particular I think we'll just leave out), and so on.

    Then adding the changes to autocomplete behavior can come at the end. That can also be a series of commits, adding the several different new criteria it uses.

  • In fact let's also break it up into several PRs, because that will help us be able to merge parts of this sooner. Let's start with recent DMs, since that's the simpler of the two new data structures. So a PR that leaves out recent senders, and can leave out is-bot and name too, but has a series of commits that first add the new recent-DMs data (after doing any prep needed for that), then do any needed prep for the concept of _sortedUsers, then starts comparing by recent DMs.

    Then the other sort criteria can be in followup PRs.

check(mention(user, silent: true, users: store.users)).equals('@_**Full Name|123**');
check(mention(user, silent: true, users: store.users.value)).equals('@_**Full Name|123**');
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like a bummer to have to keep saying store.users.value, though, for everything that just wants some data from the map.

Comment on lines 177 to 179
/// This can be listened to for any modifications done to `Map<K, V>`.
class MapNotifier<K, V> extends ValueNotifier<Map<K, V>> {
MapNotifier(super._value);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, so: if you look at the implementation of sortByRelevance, there's a lot of other data it's looking at there which isn't covered by listening on the users.

For that matter, the User objects themselves can get mutated, and that'll change the sorting, but this notifier won't notice.

… Aha, and you already have logic to handle that mutation:

  void handleRealmUserRemoveEvent(RealmUserRemoveEvent event) {
    for (final view in _mentionAutocompleteViews) {
      view.refreshStaleUserResults();

// …

  void refreshStaleUserResults() {
    if (_query != null) {
      _sortedUsers = null;

So we just need to have that cause the loop to stop, too.

Adding and removing users also reaches that same "refresh" method.

So I think this extra notifier isn't giving us anything, and we can skip it.

Comment on lines 54 to 57
/// Getter for the latest messages by recipient.
///
/// This is intended for testing purposes only.
Map<int, int> get latestMessagesByRecipientForTesting => _latestMessagesByRecipient;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
/// Getter for the latest messages by recipient.
///
/// This is intended for testing purposes only.
Map<int, int> get latestMessagesByRecipientForTesting => _latestMessagesByRecipient;
/// Getter for the latest messages by recipient.
@visibleForTesting
Map<int, int> get latestMessagesByRecipientForTesting => _latestMessagesByRecipient;

😄

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh also, the convention within Flutter is to give these a name starting with "debug":

Suggested change
/// Getter for the latest messages by recipient.
///
/// This is intended for testing purposes only.
Map<int, int> get latestMessagesByRecipientForTesting => _latestMessagesByRecipient;
/// Getter for the latest messages by recipient.
@visibleForTesting
Map<int, int> get debugLatestMessagesByRecipient => _latestMessagesByRecipient;

(That's a bit broader than "for testing"; but the annotation covers that added specificity.)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

… Making all of that moot, though:

Given that we expose map and sorted publicly as a Map and a List, we should just expose this by-recipient map publicly as a Map in the same way. It's fundamentally very much the same kind of data structure.

Comment on lines 33 to 34
});
}) {
_computeLatestMessagesWithUsers();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Following up on #608 (comment) : this revision is a lot closer on this aspect, but let's take it all the way. That means taking the logic that's in _computeLatestMessagesWithUsers and putting it in the factory constructor, so that by the time we call the generative constructor we have all the data already built.

That then means that the invariants of how map relates to sorted and to _latestMessagesByRecipient will all hold from the very beginning, rather than having an asterisk where they don't yet hold when the object is first created.

Comment on lines 69 to 73
void _updateLatestMessagesWithUsers(List<int> userIds, int messageId) {
for (final recipient in userIds) {
final latestMessageId = _latestMessagesByRecipient[recipient] ?? -1;
if (messageId > latestMessageId) {
_latestMessagesByRecipient[recipient] = messageId;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similarly to putting the contents of _computeLatestMessagesWithUsers inside the factory constructor, let's inline this logic into handleMessageEvent.

In general, the idea is that _latestMessagesByRecipient is just another data structure that's part of this class, in the same way as map and sorted, and we compute it and keep it updated on equal terms to those two.

Comment on lines 162 to 164
if (aMessageId != bMessageId) {
return bMessageId - aMessageId;
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using subtraction for this, on ints, is buggy — it will get the wrong answer when the values are very far apart.

Try running this test case to see:

    test('tmp', () {
      const int a = 0x7fffffffffffffff;
      const int b = -a - 1;
      print("a: $a\nb: $b\na-b: ${a-b}");
      check(a.compareTo(b) < 0).equals(a < b); // pass
      check((a - b) < 0).equals(a < b);        // FAIL
    });

As the test illustrates, a correct way to do this is with [int.compareTo] (which implements [Comparable.compareTo]). You could also do it with any two of < and == and <= (or inverses like >= etc.), but compareTo is a bit cleaner.

So e.g.:

Suggested change
if (aMessageId != bMessageId) {
return bMessageId - aMessageId;
}
final result = aMessageId.compareTo(bMessageId);
if (result != 0) return result;

Comment on lines 82 to 84
setUp(() {
processMessages();
});
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because setUp runs implicitly, without any mention of it at the individual test case, it makes it harder to see just what the test is actually doing and relying on.

One example is that when one looks at this and then at processMessages just above, one might be concerned that recentSenders gets messages added to it again and again but never seems to get cleared.

What's actually going on is that there's another setUp, some way above in the file, which resets recentSenders to a fresh instance. But that's left entirely implicit here.

Instead, have each test case set up what it needs explicitly. This can be kept very concise by having it just make a one-line call to a shared helper — for examples, see the helpers called prepare in many of our test files, like test/model/unreads_test.dart.

Comment on lines 106 to 108
final priorityInStream = recentSenders.compareByRecency(
userA: userA,
userB: userC,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These lines are kind of confusing to read 🙃

Let's make the two arguments being compared be positional, rather than named. I think that's just as clear an interface or more so, and it'll avoid this sort of renaming and also be more terse.

Comment on lines 95 to 101
final priorityInTopic2 = recentSenders.compareByRecency(
userA: userA,
userB: userB,
streamId: stream.streamId,
topic: topic2,
);
check(priorityInTopic2).isGreaterThan(0); // [userB] is more recent in topic2.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Feeling the need for this kind of comment is a good sign that these test cases would benefit from being made more compact, and bringing the key data the test is acting on closer to the point where it's saying what answer is expected.

One good step would be to bring each case like this down to taking two or three lines (or can we even make it just one line?), instead of 7 lines. When each test case is more compact vertically, it gets easier to have more of them without things getting unreadable.

Another is that instead of stating a single shared set of data up at the top of the group, you could have each test provide its own data. This can be an argument to a prepare helper; again see examples in existing test files. The exact signature of the prepare helper is up to you, and can be adapted to help keep the tests compact; for example if things are always in the same stream, then the individual tests can skip mentioning that.

Comment on lines 38 to 40
..sorted.deepEquals([key([1, 2]), key([]), key([1])]);
..sorted.deepEquals([key([1, 2]), key([]), key([1])])
..latestMessagesByRecipientForTesting.deepEquals({1: 300, 2: 300});
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's good to update the existing test cases for this code to check the new data structure.

But are there any new test cases that become useful based on thinking through edge cases of the new data structure? I'd guess there would be. It'd then be good to add those too.

Copy link
Collaborator Author

@sm-sayedi sm-sayedi May 14, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought a lot about it, but frankly, I couldn't come up with other edge cases!🙂 I would appreciate any clue from your side!

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One useful strategy for this is to look through your changes; try various plausible ways the code could have been buggy; and make sure that for every one of those, there's a test that would fail.

Here's one: in handleMessageEvent, if the ifAbsent were forgotten:

         (latestMessageId) =>
             message.id > latestMessageId ? message.id : latestMessageId,
-        ifAbsent: () => message.id,
+        // ifAbsent: () => message.id,
       );

then in this revision all the tests still pass. So there should be a new test case that covers that, in the sense that it fails when you try that change.

@@ -381,6 +381,7 @@ class MessageListView with ChangeNotifier, _MessageSequence {
for (final message in result.messages) {
if (_messageVisible(message)) {
_addMessage(message);
store.recentSenders.handleMessage(message);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The ideal version of this would go on top of a central message store, #648, so that MessageListView wouldn't have to know individually about all the miscellaneous data structures that care about a summary of messages.

This PR doesn't need to block on that one; for now, it's fine to add these calls in a more ad-hoc way like this.

The current version of this PR doesn't call this RecentSenders.handleMessage method in enough cases, though. Compare which messages get passed to reconcileMessages in #648.

@sm-sayedi sm-sayedi force-pushed the issue-228-sort-user-mention-results branch 3 times, most recently from a5c1062 to bfb2d1d Compare May 14, 2024 09:56
@sm-sayedi
Copy link
Collaborator Author

Thanks @gnprice for the detailed review! Pushed the revision with some comments here and there. PTAL.

Note: I divided the giant commit into multiple commits instead of opening several PRs for related features to avoid becoming blocked by initial PRs as the latter would rely on them. Again, if you prefer the multiple PRs approach, please let me know, and I will do that.

@@ -145,6 +145,18 @@ class AutocompleteViewManager {
autocompleteDataCache.invalidateUser(event.userId);
}

void handleMessageEvent(MessageEvent event) {
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure if it is necessary to pass the event parameter as it is not used for now. I just followed the code for handleRealmUserAddEvent(RealmUserAddEvent event) method.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, following that pattern is good. This is also the pattern you'll find at the call site in PerAccountStore.handleEvent.

Comment on lines 149 to 151
for (final view in _mentionAutocompleteViews) {
view.refreshStaleUserResults();
}
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wouldn't it be good to extract this bit to a (private) method as it is used in several other neighboring methods?! A name like refreshAutocompleteResults may be a proper name for that method as then we can replace it with all the method calls that just have the body of this method in their bodies.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, a prep commit to factor this out would be good. A good name would be _refreshStaleUserResults, to keep it more specific. (Right now the only kind of autocomplete we have is to return users, but we'll also have autocomplete for streams and topics and other things, and those won't necessarily need to get invalidated when the list of users or their ranking changes.)

@@ -381,6 +381,7 @@ class MessageListView with ChangeNotifier, _MessageSequence {
for (final message in result.messages) {
if (_messageVisible(message)) {
_addMessage(message);
store.recentSenders.handleMessage(message);
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure whether to add it inside the if block or outside!!

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good question.

What does the dartdoc on that _messageVisible method say?

Based on that, what would be the consequence of having that condition control whether to make this call? And what would be the consequence of ignoring that condition when making this call?

@sm-sayedi sm-sayedi requested a review from gnprice May 14, 2024 10:30
Comment on lines 29 to 30
if (!msgsByUser.containsKey(userId)) {
msgsByUser[userId] = map[dmNarrow]!;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This can be simplified using Map.putIfAbsent.

Comment on lines 25 to 30
final msgsByUser = <int, int>{};
for (final dmNarrow in sorted) {
for (final userId in dmNarrow.otherRecipientIds) {
// only take the latest message of a user among all the conversations.
if (!msgsByUser.containsKey(userId)) {
msgsByUser[userId] = map[dmNarrow]!;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rather than starting with sorted and looking up in map, this can iterate through entries. That should be a bit more efficient (by a constant factor), since it just iterates through an array without looking up in a hash table.

Comment on lines 153 to 154
(latestMessageId) =>
message.id > latestMessageId ? message.id : latestMessageId,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: use max to tighten this code:

Suggested change
(latestMessageId) =>
message.id > latestMessageId ? message.id : latestMessageId,
(latestMessageId) => max(message.id, latestMessageId),

Comment on lines 52 to 58
/// Whether stream senders and topic senders are both empty.
@visibleForTesting
bool get debugIsEmpty => _streamSenders.isEmpty && _topicSenders.isEmpty;

/// Whether stream senders and topic senders are both not empty.
@visibleForTesting
bool get debugIsNotEmpty => _streamSenders.isNotEmpty && _topicSenders.isNotEmpty;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, this makes the "is not empty" method not equivalent to ! of the "is empty" method. That seems unexpected.

Let's keep just the "is empty" version, and callers can invert it themselves. I think the test will be fine without the added specificity of this debugIsNotEmpty — we need the data structure to have the right data in it, not just to be nonempty, and so we need test cases that go further into detail anyway.

Comment on lines 141 to 142
final aMessageId = _latestMessageIdOfSenderInTopic(
streamId: streamId, topic: topic, senderId: userA.userId);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: two-space indents

Comment on lines 9 to 11
Subject<Map<int, int>> get latestMessagesByRecipient => has(
(v) => v.latestMessagesByRecipient,
'latestMessagesByRecipientForTesting');
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

name needs updating

Comment on lines 78 to 80
/// The activity is first looked for in [topic] then in [stream].
///
/// Returns a negative number if [userA] has more recent activity,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: trailing whitespace:

Suggested change
/// The activity is first looked for in [topic] then in [stream].
///
/// Returns a negative number if [userA] has more recent activity,
/// The activity is first looked for in [topic] then in [stream].
///
/// Returns a negative number if [userA] has more recent activity,

This should stand out in red when you look at git diff, or git log -p:
image

Comment on lines 97 to 98
final priorityInTopic1 = priority(topic: topic1);
check(priorityInTopic1).isGreaterThan(0); // [userB] is more recent in topic1.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about:

Suggested change
final priorityInTopic1 = priority(topic: topic1);
check(priorityInTopic1).isGreaterThan(0); // [userB] is more recent in topic1.
check(compareAB(topic: topic1)).isGreaterThan(0);

Then I think this line is fairly self-contained in its meaning: user A compares after user B, in topic topic1.

Comment on lines 93 to 96
final userAMessage = eg.streamMessage(sender: userA, stream: stream, topic: topic1);
final userBMessage = eg.streamMessage(sender: userB, stream: stream, topic: topic1);
prepare();
fillWithMessages([userAMessage, userBMessage]);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about:

Suggested change
final userAMessage = eg.streamMessage(sender: userA, stream: stream, topic: topic1);
final userBMessage = eg.streamMessage(sender: userB, stream: stream, topic: topic1);
prepare();
fillWithMessages([userAMessage, userBMessage]);
prepare();
addMessage(userA, topic1);
addMessage(userB, topic1);

In particular that gets the lines down to a reasonable length. A crucial part of the test is the topics in those eg.streamMessage calls, so that information needs to be within 80 columns.

Comment on lines 38 to 40
..sorted.deepEquals([key([1, 2]), key([]), key([1])]);
..sorted.deepEquals([key([1, 2]), key([]), key([1])])
..latestMessagesByRecipientForTesting.deepEquals({1: 300, 2: 300});
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One useful strategy for this is to look through your changes; try various plausible ways the code could have been buggy; and make sure that for every one of those, there's a test that would fail.

Here's one: in handleMessageEvent, if the ifAbsent were forgotten:

         (latestMessageId) =>
             message.id > latestMessageId ? message.id : latestMessageId,
-        ifAbsent: () => message.id,
+        // ifAbsent: () => message.id,
       );

then in this revision all the tests still pass. So there should be a new test case that covers that, in the sense that it fails when you try that change.

Copy link
Member

@gnprice gnprice left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the revision! Some comments above. Here, replies to your inline questions.

@@ -145,6 +145,18 @@ class AutocompleteViewManager {
autocompleteDataCache.invalidateUser(event.userId);
}

void handleMessageEvent(MessageEvent event) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, following that pattern is good. This is also the pattern you'll find at the call site in PerAccountStore.handleEvent.

Comment on lines 149 to 151
for (final view in _mentionAutocompleteViews) {
view.refreshStaleUserResults();
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, a prep commit to factor this out would be good. A good name would be _refreshStaleUserResults, to keep it more specific. (Right now the only kind of autocomplete we have is to return users, but we'll also have autocomplete for streams and topics and other things, and those won't necessarily need to get invalidated when the list of users or their ranking changes.)

@@ -381,6 +381,7 @@ class MessageListView with ChangeNotifier, _MessageSequence {
for (final message in result.messages) {
if (_messageVisible(message)) {
_addMessage(message);
store.recentSenders.handleMessage(message);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good question.

What does the dartdoc on that _messageVisible method say?

Based on that, what would be the consequence of having that condition control whether to make this call? And what would be the consequence of ignoring that condition when making this call?

Copy link
Member

@gnprice gnprice left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, and here's the rest of a review. See also the couple of groups of comments just above.

Splitting the commit up into 10 smaller commits was very helpful, thanks!

Note: I divided the giant commit into multiple commits instead of opening several PRs for related features to avoid becoming blocked by initial PRs as the latter would rely on them. Again, if you prefer the multiple PRs approach, please let me know, and I will do that.

Yeah, splitting into several PRs is helpful for two reasons:

  • GitHub makes things increasingly unwieldy with even a moderately-long PR thread. This one is currently 126 comments long (according to https://github.com/zulip/zulip-flutter/pulls), so it's already past the point where when I open the PR page, the first thing I have to do in order to make it usable is Ctrl+F "hidden", click "Load more…", Ctrl+G, click again, Ctrl+G in a loop until there are no matches. The longer the thread gets, the more iterations of that loop are needed.

    The "hidden" thing is a bad and dysfunctional choice by GitHub that I can only explain in terms of messed-up internal incentives in their organization. But it's the way it works, so we have to adapt. (And even without that, a really long thread would still get unwieldy.)

  • I think we'll be ready to merge the "recent DMs" part before the "recent senders" part. So focusing one PR on the "recent DMs" part, and the other pieces needed for that (like _sortedUsers and the logic around it), makes it possible to get that in sooner.

The later PR will block on the earlier PR before it can be merged, but that's not really a change from what would happen with a single PR: the later changes would still have to wait for the earlier changes to be ready before they can be merged.

One workflow bit to make explicit: GitHub doesn't know anything about the relationship between PRs, but that's fine. Just stack the second branch atop the first one in Git, and send it as another PR; the first N commits in the second PR will be the commits in the first PR. Be sure to mention clearly in the second PR's description that it's based atop the first — that helps reviewers focus on the commits that are new in the second PR.

Comment on lines 177 to 179
/// This can be listened to for any modifications done to `Map<K, V>`.
class MapNotifier<K, V> extends ValueNotifier<Map<K, V>> {
MapNotifier(super._value);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, we'll want that test to pass. That's what this is about:

So we just need to have that cause the loop to stop, too.

To better demonstrate how to do that, I've just pushed a pair of commits to the end of your branch:
46f3723 Revert "store [nfc]: Change users from Map<int, User> to a new wrapper MapNotifier<int, User>"
88f4872 autocomplete: Break search loop when _sortedUsers invalidated

So one reverts the MapNotifier change (made much easier by your having broken it out into a separate commit!), and the other is a small change just within _computeResults. That test fails after the revert (as expected), but passes again after the other commit.

Please take a look at those changes, and then in your branch you can:

  • drop the reverted commit, and the revert; and
  • squash the "Break search loop" change into the commit that introduces _sortedUsers (after splitting that out from adding any particular comparison, as described in my other comment just below).

@@ -251,11 +266,95 @@ class MentionAutocompleteView extends ChangeNotifier {
notifyListeners();
}

List<User>? _sortedUsers;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Following up on #608 (review) :

a series of commits that […] do any needed prep for the concept of _sortedUsers, then starts comparing […]

A further improvement to the sequence of commits would be to pull out the changes that set up and maintain this _sortedUsers field, into an NFC commit where no actual sorting happens (i.e. where sortByRelevance doesn't do anything). Then the specific comparisons, even the first one, each come in separate later commits.

One reason that's helpful is that it makes it easier to reorder the commits that add comparisons, for example in order to merge some that are ready while we're still working on others.

Another is that the rest of these mechanics, outside the particular comparison itself, have some pretty subtle aspects. That makes it useful to have them in an isolated commit of their own, the better to read and understand them.

@gnprice
Copy link
Member

gnprice commented May 14, 2024

See also a reply on one more question: #608 (comment)

@gnprice gnprice added the maintainer review PR ready for review by Zulip maintainers label May 14, 2024
@sm-sayedi sm-sayedi force-pushed the issue-228-sort-user-mention-results branch from 88f4872 to 5d6dc03 Compare May 17, 2024 18:22
@sm-sayedi
Copy link
Collaborator Author

Thanks for the review! Pushed the recent DMs revision here although I wasn't sure to create a new PR for this too. Please have a look.

The other criteria will come in the following PRs.

@sm-sayedi sm-sayedi changed the title autocomplete: Sort user-mention autocomplete results autocomplete: In user-mention autocomplete results give priority to users in DM conversations May 18, 2024
@sm-sayedi sm-sayedi requested a review from gnprice May 18, 2024 04:02
sm-sayedi and others added 3 commits May 18, 2024 15:46
This data structure is used to keep track of the latest message of
each recipient in all DM conversations.
This field will be used to maintain a list of sorted users based on the
most relevant autocomplete criteria in the upcoming commits.

Co-authored-by: Greg Price <greg@zulip.com>
@sm-sayedi sm-sayedi force-pushed the issue-228-sort-user-mention-results branch 2 times, most recently from 419aee6 to 9b75e44 Compare May 19, 2024 08:08
In @-mention autocomplete, users are suggested based on:
  1. Recent DM conversations.

Fixes part of: zulip#228
@gnprice
Copy link
Member

gnprice commented May 20, 2024

Thanks for the review! Pushed the recent DMs revision here although I wasn't sure to create a new PR for this too.

Thanks!

Yeah, now that I think about it: please close this PR and open a new one with the same contents. In particular that will be helpful because the last round of review above covers changes that will be part of the several different PRs in the series; by letting this PR thread end here, that review will remain accessible when looking at each of those PRs. If we continued in this same thread as the PR for the first part of the changes, then that review would get pushed up into history and potentially into the "hidden" area.

@sm-sayedi sm-sayedi changed the title autocomplete: In user-mention autocomplete results give priority to users in DM conversations autocomplete: Sort user-mention autocomplete results May 21, 2024
@sm-sayedi
Copy link
Collaborator Author

I have opened #693 as the first part of this PR. Please have a look.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
maintainer review PR ready for review by Zulip maintainers
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants