Should map and starmap be renamed? #1887

Closed
budlight opened this issue Feb 23, 2014 · 3 comments

Comments

@budlight

I don't think these two primitives live up to what their names imply in the context of a "distributed task queue". I understand why someone might want the map and starmap functions, since they create a single task, but I don't see the advantage: it would be trivial for the user to write the function so that it accepts a list of inputs. To me the term "map" here implies something like a map-reduce algorithm, which this definitely is not. A chord is effectively a form of map-reduce, which makes the naming all the more confusing.

I think the name chord should stay, as it makes enough sense, but having map and starmap suggests that there is a map-reduce function, IMO.

http://docs.python.org/2/library/multiprocessing.html#using-a-pool-of-workers

@ask
Contributor

ask commented Nov 6, 2014

group is the distributed map function.

The term map was used long before Google wrote the MapReduce paper, and I don't think anyone is confused by the term when it is used in Haskell or Clojure. The functionality is perfectly expressed by the term 'map', and I don't think there are any natural alternatives.

The canvas primitives are also all nouns (signature, group, chord, chain), but map is used as a verb (task.map), not the thing tourists are sometimes seen with.

MapReduce frameworks also do not normally have disconnected map and reduce stages; instead you have a mapreduce operation that takes a Mapper and a Reducer, where the processed data is streamed into the reducer. In fact, simply having map() and reduce() is not considered to be sufficient for MapReduce.
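To illustrate the point about a single mapreduce operation, here is a minimal stdlib-only sketch. It is not Celery code and not any particular framework's API: `map_reduce`, `count_words`, and `total` are hypothetical names chosen for this example. What it tries to show is the group-by-key step between the two stages, which a bare map() followed by reduce() does not provide. For simplicity it buffers values in memory instead of streaming them into the reducer.

```python
from collections import defaultdict


def map_reduce(inputs, mapper, reducer):
    """Run mapper over inputs, group emitted values by key, then reduce.

    mapper(item) yields (key, value) pairs.
    reducer(key, values) returns the result for that key.
    """
    groups = defaultdict(list)
    for item in inputs:
        for key, value in mapper(item):
            # The "shuffle" step: collect all values emitted for the same key.
            groups[key].append(value)
    return {key: reducer(key, values) for key, values in groups.items()}


def count_words(line):
    # Hypothetical mapper: emit (word, 1) for every word in a line.
    for word in line.split():
        yield word.lower(), 1


def total(key, values):
    # Hypothetical reducer: sum the counts collected for one word.
    return sum(values)


counts = map_reduce(["a b a", "b c"], count_words, total)
```

The caller hands over a mapper and a reducer together and never sees the intermediate pairs, which is the sense in which the two stages are connected.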

So a chord is not really a form of map-reduce; it's a distributed version of a barrier, and the name is taken directly from such a barrier in Cω.
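The barrier idea can be sketched in a single process with the standard library. This is an analogy under stated assumptions, not how Celery implements chord: `chord_like` is a hypothetical helper, and threads stand in for distributed workers. The body runs only after every header call has finished.

```python
from concurrent.futures import ThreadPoolExecutor


def chord_like(header_calls, body):
    """Run every (func, args) pair in header_calls, then call body(results).

    Hypothetical helper: body is not invoked until all header calls are done.
    """
    with ThreadPoolExecutor() as pool:
        futures = [pool.submit(func, *args) for func, args in header_calls]
        # The barrier: result() blocks until each header call has completed.
        results = [future.result() for future in futures]
    return body(results)


# Two header calls (2**3 and 3**2), then a body that sums their results.
total = chord_like([(pow, (2, 3)), (pow, (3, 2))], sum)
```

Nothing here groups values by key or streams data between stages; the only guarantee is the synchronization point before the body runs.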

group is not taken from anywhere, but the operation is the same as what is often called 'parallel map' in the concurrency literature, just in distributed form. Therefore, map is usually considered to be sequential, not parallel.

@ask ask closed this as completed Nov 6, 2014
@ask
Contributor

ask commented Nov 6, 2014

And they are useful because they let you decrease the granularity of an operation simply by using task.map(list) instead of group(task.s(i) for i in list).

@tim-schilling
Sponsor Contributor

Except that task.map(list) doesn't allow the tasks to run concurrently. All of the tasks run on the same worker one after another. If this shouldn't be the case I can open up a new issue.
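The difference discussed in the last two comments can be sketched with a local thread pool standing in for the workers. This is a stdlib analogy, not Celery's API: `square` and `run_all_in_one_task` are hypothetical names, and the pool size is arbitrary. One submission that loops over the items plays the role of task.map(list), while one submission per item plays the role of group(task.s(i) for i in list).

```python
from concurrent.futures import ThreadPoolExecutor
import threading


def square(x):
    # Return the result plus the name of the thread that computed it.
    return x * x, threading.current_thread().name


def run_all_in_one_task(func, items):
    # Analogue of task.map(list): a single unit of work that loops sequentially.
    return [func(i) for i in items]


items = [1, 2, 3, 4]

with ThreadPoolExecutor(max_workers=4) as pool:
    # One submission for the whole list: every item runs on the same worker.
    map_future = pool.submit(run_all_in_one_task, square, items)
    # One submission per item: the pool is free to run them concurrently.
    group_futures = [pool.submit(square, i) for i in items]
    map_results = map_future.result()
    group_results = [future.result() for future in group_futures]

map_values = [value for value, _ in map_results]
group_values = [value for value, _ in group_results]
map_workers = {name for _, name in map_results}
```

Both styles produce the same values. The map-style call always reports a single worker in `map_workers`, whereas the per-item submissions may be spread across several, at the cost of one submission per item.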
