Added `map-like` and `filter-like` to sets. #1217

ess476 · 2017-10-04T15:57:19Z

This adds the features requested in #1179. It also fixes a spelling mistake in the in the sets testing code.

…library.

jpolitz · 2017-11-03T22:40:15Z

Thanks! These look like they'd work, but we have a few comments:

Even though @shriram suggested these names, I don't really get them. What are they -like? I understand the motivation that there's semantic confusion with the properties of map (e.g. length might change) but I'm not sure I understand this new name. I'd prefer map plus documentation to this name.
There should be a separate implementation for tree-sets and for list-sets. The current implementation is a bit of the worst of both worlds. It's quadratic on list-sets because of the re-insertion, where it could be linear just by mapping over the underlying elements and re-constructing a list-set directly. The tree-set implementation isn't asymptotically bad with this strategy, but if it were implemented with a fold over the elements (using, say fold-preorder), it would save many allocations and intermediate steps.
Ideally, these should be implemented as methods first on the individual variants, and then the functions should just call those methods. This avoids having functionality duplicated in those two places, and gives the correct dynamic dispatch to the variants' methods.

shriram · 2017-11-03T23:15:32Z

Mmph. Okay, fine. I'll yield on the -like parts.

ess476 · 2017-11-04T03:54:47Z

Thanks for the feedback!

I'll try to implement the changes you suggested.

A quick question, with regard to point 2, were you saying that for list-sets map or filter (or both) can be linear?

If you're referring to map, since sets need to ensure that duplicates are never added how could any duplicates be eliminated in linear time?

Thanks,

-Ethan

sorawee · 2017-11-04T04:20:21Z

Hmm. If we want to be efficient, for filter in list-set, I would write:

method filter(self, f) -> Set:
  list-set(self.to-list().filter(f))
end

.to-list() does not create a new list. It just returns an underlying list in the list-set back. .filter for list is more efficient than .fold in set since it's implement in JS. Then we use list-set which can be constructed in constant time. We can use list-set right away because of the precondition that .to-list() has unique elements, and .filter preserves that property.

For map in list-set, yes, the worst case is quadratic. Consider an input like this:

N = 100000
S = list-to-list-set(range(0, N))
fun f(x):
  if (x < (N / 2)) or (x == N - 1):
    0
  else:
    x
  end
end

T = S.map(f)

We would then have:

0, 1, 2, 3, ..., N/2 - 1, N/2, N/2 + 1, ..., N - 1, N
=maps=>
0, 0, 0, 0, ..., 0,         N/2, N/2 + 1, ..., N - 1, 0

There are N/2 of 0, and to remove duplicate 0, it needs to walk N/2 elements.

(Even though we use fold to flip the order of traversal or something like that, it is possible to construct a similar worst case).

… filter.

ess476 · 2017-11-04T05:15:47Z

That makes a lot of sense. I guess I just didn't think about the speed of a filter implemented by fold vs the JS version.

Thanks for the help!

ess476 · 2017-11-04T06:07:22Z

Also if the JS implementations provided a significant speed up, I assume the original implementation of list-set map:

method map-like(self, f) -> Set:
  list-to-list-set(self.to-list().map(f))
end

would be faster than:

method map-like(self, f) -> Set:
  self.fold(lam(acc, x): acc.add(f(x)) end, list-set(empty))
end

sorawee · 2017-11-04T06:52:28Z

The reason why my suggestion for filter is objectively OK is that both ways allocate new memory cells in the same amount, so we ought to choose the one that is more efficient.

The issue with

method map-like(self, f) -> Set:
  list-to-list-set(self.to-list().map(f))
end

is that it allocates n new memory cells from .map, and possibly another n new cells from list-to-list-set. While

method map-like(self, f) -> Set:
  self.fold(lam(acc, x): acc.add(f(x)) end, list-set(empty))
end

would allocate at most only n new memory cells from .add in .fold.

That said, I didn't mean to say which one is better, as I don't know if it's worth getting rid of intermediate structures or not, compared to using raw JS functions. Probably worth asking @jpolitz.

schanzer · 2020-08-12T13:08:48Z

@ess476 I'm assuming this is no longer on your radar?

ess476 added 2 commits October 3, 2017 03:15

Added map-like and filter-like methods and functions to the sets …

e30d612

…library.

Replaced tabs with 2 spaces.

777b77d

Updated map and filter to be more efficient.

cea9bfb

Converted list-set's filter to use list-set and list's version of…

5a12211

… filter.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Added `map-like` and `filter-like` to sets. #1217

Added `map-like` and `filter-like` to sets. #1217

ess476 commented Oct 4, 2017

jpolitz commented Nov 3, 2017

shriram commented Nov 3, 2017 via email

ess476 commented Nov 4, 2017

sorawee commented Nov 4, 2017 •

edited

ess476 commented Nov 4, 2017

ess476 commented Nov 4, 2017

sorawee commented Nov 4, 2017

schanzer commented Aug 12, 2020

Added map-like and filter-like to sets. #1217

Are you sure you want to change the base?

Added map-like and filter-like to sets. #1217

Conversation

ess476 commented Oct 4, 2017

jpolitz commented Nov 3, 2017

shriram commented Nov 3, 2017 via email

ess476 commented Nov 4, 2017

sorawee commented Nov 4, 2017 • edited

ess476 commented Nov 4, 2017

ess476 commented Nov 4, 2017

sorawee commented Nov 4, 2017

schanzer commented Aug 12, 2020

Added `map-like` and `filter-like` to sets. #1217

Added `map-like` and `filter-like` to sets. #1217

sorawee commented Nov 4, 2017 •

edited