Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feature request: parameterize SortedList used in SortedDict through inheritance #204

Open
Serpens66 opened this issue Sep 4, 2022 · 5 comments

Comments

@Serpens66
Copy link

Serpens66 commented Sep 4, 2022

Hi, thanks for your work :)

would it be possible to add a new function to the SortedDict (and therefore also SortedList) which sets a key a new value and returns the index position in one go?
Eg.: index = sorteddict.setitem_returnindex(key,value)

Usecase:
I want to add a new key,value to the SortedDict and I want to know at which total index this key was sorted in. Since performance is important, I assume that such a combined method would be much faster than sorting it in and then again search the key to get the index.

I assume that when sorting the new key into the SortedDict, we already search for the correct index to insert it. So we should be able to simply return this index and then there is no need to search for it twice.
I already read the source code and tried to add it myself, but: 1) sortedcontainers is still frequently updated, I don't want to overwrite eg. the init of SortedDict to make it use a custom SortedList, because this might mess with future updates. 2) I did not fully understand the internal structure with lists and maxes yet :D And it seems bisect "insort" also does not return the index, which makes it more difficult =/ (I really don't understand why "insort" does not return the index, it would be so easy -.- )

@grantjenks
Copy link
Owner

See #195

@Serpens66
Copy link
Author

Serpens66 commented Sep 5, 2022

I created an issue:
python/cpython#96571
Maybe you can vouch for it, to make it more likely to happen.

In the meantime you still may adjust sortedcontainers I guess?
Just do sth like:
index = insort(...)
return index

This will be None if the python source code will not change, just like it currently is, but automatically gets the index as soon python changes it.
And in case your add(...) function does not use insort, you are already able to return the index, so this would already be an improvement if the index is 0 or the len().

@Serpens66
Copy link
Author

If you want to add sth to the cpython issue, please let us know.

If not, it seems the devs there convinced me that there is no urgent need to change "insort" to return the index, because doing bisect and then list.insert(...) one after the other would be the same speed...
If this is true, then as you wrote in issue 195, the user can add a custom "add" function on their own by inheriting from SortedList.
...
But to achieve the wanted result for SortedDict, I still need to exchange the internal SortedList with my custom SortedList. And this is currently only possible by overwriting either the init of SortedDict or replace the self._list after initialization.. while both of it may easily break if some code from SortedDict changes.
Do you know a proper solution?

@grantjenks
Copy link
Owner

In #195, I mentioned "The other problem ..." but the first problem still remains (and is the bigger issue):

"""Getting the index of the item requires the positional index to be built which add() specifically avoids. I think this is a fine example for simply inheriting and implementing the needed functionality in your own code."""

I still need to exchange the internal SortedList with my custom SortedList.

This is a good point. SortedDict is not parameterized on the SortedList type. I don't think I'd want to make it an argument to __init__, but I would consider making it a class var so the reference in __init__ became self.SortedList(...). Then inheriting would mean simply changing the class variable. Something like:

from .sortedlist import SortedList

class SortedDict:
    SortedList = SortedList

    def __init__...:
        self._list = self.SortedList(...)

and your code would do:

class MySortedDict(SortedDict):
    SortedList = MySortedList

@Serpens66
Copy link
Author

Thank you.
I never used this "define variables in classes outside of a method" yet, but if it works than this is a good solution, thanks :)

What do you think how much impact it will have to exchange the insort call with bisect_right and list.insert and returning the index in my custom sortedlist? Like 10% slower or more/less? Maybe you already made some tests regarding this and therefore decided against it.

@grantjenks grantjenks changed the title Feature request: SortedDict setitem and return index in one call Feature request: parameterize SortedList used in SortedDict through inheritance Oct 9, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants