[FR] concurrent map? #388

Open
NightMachinery opened this issue Sep 28, 2021 · 8 comments
Labels
discussion enhancement Suggestion to improve or extend existing behavior

Comments

@NightMachinery

Can we have some functional concurrency primitives, like a concurrent map that runs some function on a sequence concurrently?

Of course, parallelism would be best, but I think Emacs threads don't support that?

@basil-conto
Collaborator

basil-conto commented Sep 28, 2021

Given that Emacs threads are cooperative, what sort of API/semantics do you envision?

@basil-conto basil-conto added discussion enhancement Suggestion to improve or extend existing behavior labels Sep 28, 2021
@NightMachinery
Author

NightMachinery commented Sep 28, 2021

@basil-conto I am mostly a noob at this stuff, and I can't think of how we could make a cooperative thread that is running synchronous code yield. Python's threads are non-cooperative, no? Considering that Python also has a global interpreter lock, why couldn't the devs make the threads non-cooperative?

But the primitive I am envisioning does not need to touch the state, so can't we somehow just fork the current process? Something along these lines:

(defun f (n) (some-expensive-computation n))
(parallel-map #'f '(1 2 3))

The above code should fork and run (f 1), (f 2), and (f 3) in separate processes. Each process writes its output to a file path supplied by the main process. The main process polls those file paths, waits until all the processes have finished, and then uses the values in the files to build a new list.
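
A rough sketch of that file-based design in Emacs Lisp, under some loud assumptions: `my-parallel-map` is a hypothetical name, each element gets a fresh `emacs -Q --batch` child rather than a real fork, and the function is passed as a quoted lambda form because a child process knows nothing about functions defined in the parent session. The call blocks until every child has exited, and a child that fails leaves an empty file, which makes `read` signal an error.

```elisp
;;; -*- lexical-binding: t; -*-

(defun my-parallel-map (form-fn items)
  "Apply FORM-FN to each of ITEMS in its own batch Emacs; return the results.
FORM-FN is a quoted lambda form or a symbol naming a built-in function.
Hypothetical helper: illustrates the idea only, with no error handling."
  (let* ((emacs (expand-file-name invocation-name invocation-directory))
         (jobs
          (mapcar
           (lambda (item)
             ;; The parent chooses the output path and hands it to the child.
             (let ((out (make-temp-file "pmap-")))
               (cons out
                     (make-process
                      :name "pmap"
                      :connection-type 'pipe
                      :command
                      (list emacs "-Q" "--batch" "--eval"
                            (prin1-to-string
                             `(with-temp-file ,out
                                (prin1 (funcall (function ,form-fn) ',item)
                                       (current-buffer)))))))))
           items)))
    ;; All children are already running; wait for each one in turn,
    ;; then read its printed result back from the file.
    (mapcar
     (lambda (job)
       (while (process-live-p (cdr job))
         (accept-process-output (cdr job) 0.1))
       (prog1 (with-temp-buffer
                (insert-file-contents (car job))
                (read (current-buffer)))
         (delete-file (car job))))
     jobs)))

;; Hypothetical usage:
;; (my-parallel-map '(lambda (n) (* n n)) '(1 2 3))
```

Results must survive a `prin1`/`read` round trip, so this only suits plain data such as numbers, strings, and lists.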

Of course, this is a very "simple" architecture, and using ZeroMQ sockets etc. might be more efficient, but this is just the simplest thing I could think of.

The following API might even be better:

(defun f (n) (some-expensive-computation n))
(parallel-map-async #'f '(1 2 3)
  (lambda (new-list) ...))

(I.e., Emacs doesn't block while polling the file locations, and runs the supplied lambda with the result once it is available.)

@alphapapa

https://elpa.gnu.org/packages/async.html

Having said that, what kind of scenario would this be useful in? i.e. what kind of long-running, CPU-intensive activity would benefit from mapping one-process-per-element of a list?

@NightMachinery
Author

> https://elpa.gnu.org/packages/async.html
>
> Having said that, what kind of scenario would this be useful in? i.e. what kind of long-running, CPU-intensive activity would benefit from mapping one-process-per-element of a list?

I have a network-heavy function in mind. Essentially, I have a list of IDs, and I want to get a remote JSON object for each of them. Doing this sequentially takes forever.

The async package requires loading your Elisp functions in each subprocess, which is a serious drawback: loading everything takes so long that it makes the whole enterprise fruitless, and loading only the Elisp you need requires heavy refactoring and package management. (It's not a real fork; it just starts new Emacs processes from scratch.)

@alphapapa

alphapapa commented Sep 28, 2021

> I have a network-heavy function in mind. Essentially, I have a list of IDs, and I want to get a remote JSON object for each of them. Doing this sequentially takes forever.

For that case, there are solutions that don't involve multiple Emacs processes, like asynchronous network requests using url queues, request, etc. As with Python, for example, it's not necessary to run a new interpreter process, because the network part alone can already be done in parallel.
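
For the list-of-IDs case, a minimal sketch of that idea might look like the following. `my-map-async` and `my-fetch-json` are hypothetical helpers, not part of any library. The sketch assumes lexical binding and that each response body is JSON that `json-read` can parse. `url-queue-retrieve` runs the requests concurrently, up to `url-queue-parallel-processes` at a time, inside a single Emacs process.

```elisp
;;; -*- lexical-binding: t; -*-
(require 'cl-lib)
(require 'json)
(require 'url-queue)

(defvar url-http-end-of-headers)  ; buffer-local, set by url-http

(defun my-map-async (async-fn items done)
  "Call ASYNC-FN on each of ITEMS, then call DONE with the list of results.
ASYNC-FN takes an item and a one-argument callback, which it must call
exactly once with the result.  Results keep the order of ITEMS."
  (let* ((n (length items))
         (results (make-vector n nil))
         (pending n))
    (if (zerop n)
        (funcall done nil)
      (cl-loop for item in items
               for i from 0
               do (let ((i i))          ; fresh binding for each closure
                    (funcall async-fn item
                             (lambda (result)
                               (aset results i result)
                               ;; The last callback to arrive delivers the list.
                               (when (zerop (cl-decf pending))
                                 (funcall done (append results nil))))))))))

(defun my-fetch-json (url callback)
  "Fetch URL asynchronously; call CALLBACK with the parsed JSON, or nil on failure.
The response buffer is not killed here; real code would clean it up."
  (url-queue-retrieve
   url
   (lambda (status)
     (funcall callback
              (ignore-errors
                (unless (plist-get status :error)
                  (goto-char url-http-end-of-headers)
                  (json-read)))))
   nil t))

;; Hypothetical usage; the URL pattern is a placeholder.
;; (my-map-async #'my-fetch-json
;;               (mapcar (lambda (id)
;;                         (format "https://example.com/items/%s.json" id))
;;                       '(1 2 3))
;;               (lambda (objects)
;;                 (message "Got %d objects" (length objects))))
```

Because `my-map-async` only needs a function that accepts a callback, the same helper would work with any other asynchronous fetcher.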

FYI, I collect some info about writing Emacs packages here: https://github.com/alphapapa/emacs-package-dev-handbook

@basil-conto
Collaborator

basil-conto commented Sep 28, 2021

> like asynchronous network requests using url queues

BTW, asynchronous subprocess I/O like that is one of the well-defined times at which Emacs can switch thread context.

@basil-conto
Collaborator

> what kind of long-running, CPU-intensive activity would benefit from mapping one-process-per-element of a list?

More likely would be one-process-per-several-elements.

@NightMachinery
Author

Feel free to close the issue. I like my own "simple" API much better than learning about arcane library-specific APIs though. It's a general primitive, and easy to learn, and doesn't require rewriting other parts of the code.

Is it even possible to do it, or will forking Emacs mess with, I don't know, its open file handles or something? (Why is that async package doing "fake" forks otherwise? Windows compatibility perhaps?)

> More likely would be one-process-per-several-elements.

This can easily become a user-supplied variable, so it doesn't matter.
