Add subtract to IterableExtensions #306

ebraminio · 2023-08-15T16:34:15Z

Considering views this stackoverflow link has had, https://stackoverflow.com/q/3428536 (637k times atm) I think we can agree list subtract is a popular need for developers. It can have bad implementations like list(set(x) - set(y)) which results to removal of duplicated items or O(m * n) implementations like this https://stackoverflow.com/a/20259489 so as it's easy to write an incorrect implementation I believe we can implement a correct and definite implementation here. I went through mistakes of having incorrect implementations and ruining some app releases with the mistakes and now I like to share my implementation here so others don't go through issues I had.

natebosch · 2023-08-16T00:45:21Z

lib/src/list_extensions.dart

@@ -286,6 +287,29 @@ extension ListExtensions<E> on List<E> {
      yield slice(i, min(i + length, this.length));
    }
  }
+
+  /// Returns a new list with the elements of [other] are removed from `this`.


[nit] This returns an Iterable, not a list. Also the phrasing of "are removed" phrasing is awkward.

Maybe

/// Returns an iterable of the elements in this list that are not present in [other].

Applied. I wanted to avoid 'present' as that somehow gives me the feeling that an occurrence in [other] results in removal of all same items in the result which deliberately isn't the case here (as we are aware about count and filter thing the amount they present in [other]) but I don't know to express that.

Also is it better to rename [other] to [minuend]? I renamed the argument to [other] to make it more compatible with Set's difference? I even once renamed the function to difference but felt that's can be a mistake given difference is defined in the context of sets.

natebosch · 2023-08-16T00:46:55Z

lib/src/list_extensions.dart

+  ///
+  /// It's aware about occurrence count and removes things only the amount
+  /// they are present in [other].
+  Iterable<E> subtract(List<E> other) sync* {


I think we should document the assumptions we are making about E. I think mainly that it is is well behaved as a key in a HashMap (has a consistent hashCode and equals).

Done I think

As the current code convention of the library.

ebraminio · 2023-08-16T08:36:03Z

As this output an iterable and can work as an iterable extension, I move moved it to IterableExtension, or maybe I should revert that, or I will apply any other combination you would suggest.

As it solely acts on iterable and doesn't need lists.

lrhn · 2023-08-17T13:23:27Z

Considering views this stackoverflow link has had, https://stackoverflow.com/q/3428536 (637k times atm) I think we can agree list subtract is a popular need for developers.

I don't always agree that developers know what they need. What they want isn't always what they need.

This operation is inherently two operations:

Count occurrences in the second operand
filter-with-count on the first operand.

I'd prefer to provide both operations, separately

extension <E> on Iterable<E> {
  /// Counts the elements of this iterable, and updates a map with the result.
  ///
  /// Creates a new map, or starts with [accumulator] if provided.
  /// For each element of this iterable, the current value of the element in the map, with a default of zero if
  /// the element doesn't have an entyr in the map, is increased by one.
  /// Returns the updated map.
  ///
  /// If no map is provided, the default map is a default Dart hash-based map using [Object.==] for equality.
  /// That means that distinct-but-equal objects are only represented by one of their elements in the resulting
  /// map.
  /// Provide a custom map if a different equality is needed.
  Map<E, int> countOccurrences([Map<E, int>? accumulator]) {
     accumulator ??= {};
     for (var element in this) accumulator.update(e, _inc, ifAbsent: _one);
     return accumulator;
  }

  static int _inc(int v) => v + 1;
  static int _one() => 1;

  /// Elements of this iterable, skipping elements while they have positive [removeCount].
  ///
  /// Iterating the returned iterable will iterate this iterable and emit the same elements, except that 
  /// for elements that have a positive value in the [removeCount] map, the element is skipped and the
  /// value decremented in the map. Non positive values are removed from the map.
  ///
  /// After iteration, the updated map may still contain counts for elements where there weren't
  /// as many elements in this iterable as could have been skipped.
  ///
  /// The elements of this iterable are looked up in the provided map using normal map operations,
  /// and therefore uses the same equality as the map.
  Iterable<E> removeCounted(Map<E, int> removeCount) sync* {
    for (var element in this) {
      var count = removeCount[element];
      if (count != null) {
        count -= 1;
        if (count > 0) {
          removeCount[element] = count;
        } else {
          removeCount.remove(element);
        }
        if (count >= 0) continue;
      }
      yield element;
    }
  }
}

Then we can also add an

  Iterable<E> removeCountedOccurrences(Iterable<E> remove, [Map<E, int>? accumulator]) =>
     removeCounted(remove.countOccurrences(accumulator));

which does both things for you, but it's understandable as the combination of those two separate and indpendently useful operations, it doesn't hide that it creates a map and uses it.

It feels wrong to me to have an operation that treats its Iterable input as a multiset, and always converts it to one, and doesn't just ask for a multiset as argument. An iterable is-not a multiset, it has horrible performance if you treat it as one directly, which is also why the first step here is to convert it to one.

Being able to see the remove-count being decremented, and see how much was not used, might actually be useful in some cases. Or controlling the equality used may also be useful, which is only possible if you admit that there is a Map involved.

lrhn · 2023-08-16T16:49:34Z

lib/src/iterable_extensions.dart

+        elementsCount.remove(element);
+      } else {
+        elementsCount[element] = count - 1;
+      }


A switch with patterns can be shorter:

switch (elementsCount[element]) { case null: yield element; case > 1 && var count: elementsCount[element] = count - 1; case == 1: elementsCount.remove(element); }

ebraminio · 2023-08-17T20:44:39Z

This operation is inherently two operations:

Count occurrences in the second operand

filter-with-count on the first operand.

[..] it doesn't hide that it creates a map and uses it.

I agree with you, I just am wondering maybe count occurrence part can be shaped around ideas of python's Counter interface (a dedicated class instead of Map<T, int>) if that is not something we want here I think I can continue with your already provided code if you don't want to push it yourself.

ebraminio force-pushed the list-subtract branch from 374ef24 to 50aa602 Compare August 15, 2023 19:37

ebraminio changed the title ~~Add subtract to ListExtensions~~ Add difference to ListExtensions Aug 15, 2023

ebraminio force-pushed the list-subtract branch from 50aa602 to ce9ab4b Compare August 15, 2023 19:52

ebraminio changed the title ~~Add difference to ListExtensions~~ Add subtract to ListExtensions Aug 15, 2023

ebraminio force-pushed the list-subtract branch from ce9ab4b to 89b7f41 Compare August 15, 2023 19:57

Add subtract to ListExtensions

6ae827a

ebraminio force-pushed the list-subtract branch from 89b7f41 to 6ae827a Compare August 15, 2023 21:24

natebosch reviewed Aug 16, 2023

View reviewed changes

natebosch requested a review from lrhn August 16, 2023 00:47

ebraminio added 5 commits August 16, 2023 10:06

Address reviews, minor docs tweak

5d8cc76

Use identity instead of (e) => e

d23e5c5

Clarify groupFoldBy use

2d9f1a4

Use var instead of final

798f765

As the current code convention of the library.

Remove one layer of nested if condition

2ace49d

ebraminio changed the title ~~Add subtract to ListExtensions~~ Add subtract to IterableExtensions Aug 16, 2023

ebraminio force-pushed the list-subtract branch from 72431c0 to 8137589 Compare August 16, 2023 10:42

Move subtract to IterableExtension

5b0a81b

As it solely acts on iterable and doesn't need lists.

ebraminio force-pushed the list-subtract branch 2 times, most recently from 2813cf0 to fd958d5 Compare August 17, 2023 08:31

Simplify elements count in subtract

d861861

ebraminio force-pushed the list-subtract branch from fd958d5 to d861861 Compare August 17, 2023 11:54

lrhn reviewed Aug 17, 2023

View reviewed changes

ebraminio mentioned this pull request Aug 20, 2023

Add Iterable.countOccurrences #307

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add subtract to IterableExtensions #306

Add subtract to IterableExtensions #306

ebraminio commented Aug 15, 2023 •

edited

natebosch Aug 16, 2023

ebraminio Aug 16, 2023 •

edited

natebosch Aug 16, 2023

ebraminio Aug 16, 2023

ebraminio commented Aug 16, 2023 •

edited

lrhn commented Aug 17, 2023

lrhn Aug 16, 2023

ebraminio commented Aug 17, 2023 •

edited

Add subtract to IterableExtensions #306

Are you sure you want to change the base?

Add subtract to IterableExtensions #306

Conversation

ebraminio commented Aug 15, 2023 • edited

natebosch Aug 16, 2023

Choose a reason for hiding this comment

ebraminio Aug 16, 2023 • edited

Choose a reason for hiding this comment

natebosch Aug 16, 2023

Choose a reason for hiding this comment

ebraminio Aug 16, 2023

Choose a reason for hiding this comment

ebraminio commented Aug 16, 2023 • edited

lrhn commented Aug 17, 2023

lrhn Aug 16, 2023

Choose a reason for hiding this comment

ebraminio commented Aug 17, 2023 • edited

ebraminio commented Aug 15, 2023 •

edited

ebraminio Aug 16, 2023 •

edited

ebraminio commented Aug 16, 2023 •

edited

ebraminio commented Aug 17, 2023 •

edited