Sort input extensions #2705

chrishalcrow · 2024-04-11T12:43:45Z

Automatically sorts a list of extensions, for computation, so that all parents are “on the left” of their children. Improvement suggested by @alejoe91 .

For example, the SortingAnalyzer extension waveforms depends on random_spikes. If you run

sorting_analyzer.compute(["waveforms", "random_spikes"])

this fails because the compute function scans left to right. It’s a bit annoying since the user has tried to include random_spikes. It’s also easy to imagine a user running sorting_analyzer.compute(["waveforms"]): si tells the user they need random_spikes, and they naturally add it on the right of the list. So: worth improving!

This change would automatically sort the inputted list so that all parents are on the left of their children. The function itself doesn’t check if the whole list is valid (i.e. it doesn’t return an error if random_spikes isn’t there when calculating waveforms), it only sorts it. These checks happen downstream.

The function I’ve written is a bit awkward, because the compute parameters are dictionaries (if you input a list, it gets converted to a dictionary internally, so kwargs can be handled easily). Dictionaries aren’t really meant to be reordered, so insertions at given indices are tricky. Here, I turn the dict into two lists, do the sorting, then zip. If anyone thinks of a cleaner algorithm, let me know.
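A minimal sketch of the "parents on the left" idea, not the code in this PR. It assumes a hypothetical `DEPENDS_ON` table (the real dependencies are defined by the SpikeInterface extension classes) and uses a depth-first visit rather than the two-lists-and-zip approach described above. The input dict is left untouched, and missing parents are not added, since validity is checked downstream.

```python
# Hypothetical dependency table, for illustration only; the real
# dependencies live in the SpikeInterface extension classes.
DEPENDS_ON = {
    "random_spikes": [],
    "waveforms": ["random_spikes"],
    "templates": ["waveforms"],
    "quality_metrics": ["templates"],
}


def sort_extensions(extensions, depends_on=DEPENDS_ON):
    """Return a new dict in which every parent precedes its children."""
    sorted_extensions = {}
    visited = set()

    def visit(name):
        if name in visited:
            return
        visited.add(name)
        # Walk up through all ancestors, even ones the user did not request,
        # so that a requested grandparent still ends up on the left.
        for parent in depends_on.get(name, []):
            visit(parent)
        # Only extensions the user actually asked for are kept.
        if name in extensions:
            sorted_extensions[name] = extensions[name]

    for name in extensions:
        visit(name)
    return sorted_extensions


params = {"waveforms": {"ms_before": 1.0}, "random_spikes": {}}
print(list(sort_extensions(params)))  # ['random_spikes', 'waveforms']
print(list(params))                   # ['waveforms', 'random_spikes'] (unchanged)
```

Building a fresh dict relies on insertion order being preserved, which sidesteps the need to insert at a given index.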

Also added tests.

h-mayorquin · 2024-04-11T22:40:44Z

> this fails as the compute function scans left to right. It’s a bit annoying since the user has tried to include random_spikes. It’s also easy to imagine a user running sorting_analyzer.compute(["waveforms"]). si tells the user they need to have random_spikes and they naturally add it on the right of the list. So: worth improving!

Isn't it easier to just tell the user to add the needed extension before the extension that generates the error? I think this is easier to maintain and document. I also like that it makes the dependency graph explicit in the list. All of this only matters if the dependency graph of the extensions is meant to become more complicated. If not, then either solution should work just fine.

If we go the way of parsing the dependency tree and reordering it as we do here, we should be careful not to alter the list that the user passes, as they might use it for something else and get an error because we modified their state. Concretely, we should copy at the beginning.
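To illustrate the concern about modifying caller state, here is a small sketch with a hypothetical `reorder` helper (not a SpikeInterface function). It copies the input before doing anything else, so the list the user passed keeps its original order.

```python
def reorder(extensions):
    """Return a reordered copy; the caller's list is never modified."""
    extensions = list(extensions)  # shallow copy at the very beginning
    # Stand-in for the real dependency sort: move "random_spikes" to the front.
    if "random_spikes" in extensions:
        extensions.remove("random_spikes")
        extensions.insert(0, "random_spikes")
    return extensions


user_list = ["waveforms", "random_spikes"]
result = reorder(user_list)
print(result)     # ['random_spikes', 'waveforms']
print(user_list)  # ['waveforms', 'random_spikes']
```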

chrishalcrow · 2024-04-12T08:25:12Z

> Isn't it easier to just tell the user to add the needed extension before the extension that generates the error? I think this is easier to maintain and document. I also like that it makes the dependency graph explicit in the list. All of this only matters if the dependency graph of the extensions is meant to become more complicated. If not, then either solution should work just fine.

You could be right. The error reporting also helps the user be aware of the parent/child structure which might help overall understanding. If we use an error reporting method, it could:

1. Just report "please put parents to the left of children"
2. Report the first error it finds: "please put random_spikes to the left of waveforms"
3. Find and report all the errors as a list! "please put random_spikes to the left of waveforms. Please put waveforms to the left of templates"

I like 3: helpful and educational. @alejoe91, opinions?
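A rough sketch of what the "report all the errors" option could look like. The `DEPENDS_ON` table and the `find_ordering_errors` name are assumptions made for illustration; nothing here is taken from the SpikeInterface code base.

```python
# Hypothetical dependency table, for illustration only.
DEPENDS_ON = {
    "waveforms": ["random_spikes"],
    "templates": ["waveforms"],
}


def find_ordering_errors(extensions, depends_on=DEPENDS_ON):
    """Return one message per parent that appears to the right of its child."""
    position = {name: index for index, name in enumerate(extensions)}
    errors = []
    for child in extensions:
        for parent in depends_on.get(child, []):
            # Missing parents are ignored here; only misplaced ones are reported.
            if parent in position and position[parent] > position[child]:
                errors.append(f"please put {parent} to the left of {child}")
    return errors


for message in find_ordering_errors(["templates", "waveforms", "random_spikes"]):
    print(message)
# please put waveforms to the left of templates
# please put random_spikes to the left of waveforms
```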

> If we go the way of parsing the dependency tree and reordering it as we do here, we should be careful not to alter the list that the user passes, as they might use it for something else and get an error because we modified their state. Concretely, we should copy at the beginning.

Yes, good point thanks! So I shouldn't have edited extensions - I'll change on the next update.

alejoe91 · 2024-04-12T08:28:42Z

I think it's not a mistake to pass ["templates", "random_spikes"] as input, so I don't think the technicality of what depends on what is so educational that it should trigger an error.

chrishalcrow · 2024-04-12T10:23:28Z

I've updated it so that extensions is not edited/updated. Thanks @h-mayorquin

So the other issue comes down to whether ['waveforms', 'random_spikes'] is an error or not. This decision is beyond my paygrade ;)

alejoe91 · 2024-04-12T10:52:47Z

> I've updated it so that extensions is not edited/updated. Thanks @h-mayorquin
>
> So the other issue comes down to whether ['waveforms', 'random_spikes'] is an error or not. This decision is beyond my paygrade ;)

I believe this should be allowed and it's not an error from the user! Especially when passing a dict as input to the compute, re-sorting the dict according to dependencies is unnecessarily complex

samuelgarcia · 2024-05-14T20:06:43Z

src/spikeinterface/core/tests/test_sortinganalyzer.py

+    assert sorted_extensions_4 == {"templates": {}, "random_spikes": {}, "waveforms": {}, "quality_metrics": {}}
+


samuelgarcia · 2024-05-14

We should have "random_spikes" in first place, no?

chrishalcrow · 2024-05-15

Hello. Yes, the tests were all useless since dict equality doesn't care about order. So, at the testing stage, I've made them OrderedDicts.

samuelgarcia · 2024-05-15

dict is now ordered by default.

chrishalcrow · 2024-05-15

Yes, but equality doesn't care about the ordering, for backward compatibility. The following

a = {1: 3, 2: 4}
b = {2: 4, 1: 3}
a == b

outputs True

samuelgarcia · 2024-05-15

Yes, of course. We should do
list(a.keys()) == list(b.keys())
no?

chrishalcrow · 2024-05-15

Sure, both seem to work, but if that's standard practice then let's go for keys. I've updated the tests, including getting rid of the "magic" sorted dictionaries.
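The point about equality can be checked with a short standalone example. The dictionaries below are made up for illustration and are not the ones used in the PR's tests.

```python
from collections import OrderedDict

expected = {"random_spikes": {}, "waveforms": {}}
wrong_order = {"waveforms": {}, "random_spikes": {}}

# Plain dict equality ignores insertion order, so this assertion passes
# even though the two orders differ.
assert expected == wrong_order

# Comparing the key lists makes the check order-sensitive.
assert list(expected.keys()) != list(wrong_order.keys())
assert list(expected.keys()) == ["random_spikes", "waveforms"]

# Equality between two OrderedDicts is also order-sensitive.
assert OrderedDict(expected) != OrderedDict(wrong_order)
```

Comparing key lists keeps the tests on plain dicts, which matches what `compute` receives in practice.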

src/spikeinterface/core/sortinganalyzer.py

samuelgarcia · 2024-05-14T20:11:43Z

Hi Chris.
I am reading this very very late!!
I am sorry.
I left a few comments.

samuelgarcia · 2024-05-21T13:29:48Z

Thanks!

chrishalcrow and others added 3 commits April 11, 2024 12:36

Automatically sort extensions input

fcea7c0

Edit docu

b5ed8d9

[pre-commit.ci] auto fixes from pre-commit.com hooks

87a3084

chrishalcrow added the enhancement New feature or request label Apr 11, 2024

Merge branch 'main' into sort_input_extensions

79fec51

Not long modify extensions

53fbc73

chrishalcrow and others added 4 commits April 12, 2024 12:49

Merge branch 'main' into sort_input_extensions

3a2deb3

change one sorted_extensions back to extensions

7e6b77e

Merge branch 'main' into sort_input_extensions

3e40884

[pre-commit.ci] auto fixes from pre-commit.com hooks

369fe59

samuelgarcia reviewed May 14, 2024

src/spikeinterface/core/sortinganalyzer.py

chrishalcrow and others added 4 commits May 15, 2024 08:14

Merge branch 'main' into sort_input_extensions

c2beac9

fix testing, and add | case

51659e3

[pre-commit.ci] auto fixes from pre-commit.com hooks

73fcbb0

Replace OrderedDict with getting keys

b0cc5ad

samuelgarcia approved these changes May 21, 2024

samuelgarcia merged commit 26c145c into SpikeInterface:main May 21, 2024
11 checks passed
