Remove automated sorting of data #812

Open
danielhuppmann opened this issue Feb 14, 2024 · 2 comments

Comments

@danielhuppmann
Member

The pyam package currently sorts the _data series and the meta dataframe by their index automatically. This helps with consistency, keeps assert-frame-equal comparisons simple, and supports some operations like interpolation. But it can have unintended consequences in cases where the sorting step is forgotten, e.g. #811

Also, the repeated ordering is probably not very resource-efficient for large IamDataFrame instances.

For pyam 3.0, I suggest dropping the automated ordering on initialization and in the rename/aggregation/etc. methods, and instead providing a sort() method that can be called explicitly. We could also add a kwarg to all relevant methods to control whether to sort, but that may not be worth it on the effort-vs.-benefit trade-off.
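To illustrate the idea, here is a minimal sketch in plain pandas, not pyam code. The `sort` helper and the sample index values are hypothetical; the only assumption is that the data is a pandas Series with a multi-level index, as described above.

```python
import pandas as pd

# Hypothetical long-format data with a pyam-like index, deliberately unsorted.
index = pd.MultiIndex.from_tuples(
    [
        ("model_a", "scen_b", 2020),
        ("model_a", "scen_a", 2010),
        ("model_a", "scen_a", 2005),
    ],
    names=["model", "scenario", "year"],
)
data = pd.Series([3.0, 2.0, 1.0], index=index, name="value")


def sort(series):
    """Hypothetical explicit sort: return a new series ordered by its index."""
    return series.sort_index()


# Nothing is sorted on creation; the caller asks for it when needed.
sorted_data = sort(data)
print(sorted_data)
```

The original `data` keeps its insertion order, so the cost of sorting is only paid where the caller actually needs ordered data.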

@phackstock @gidden @znicholls, any thoughts?

@phackstock
Contributor

I like the idea of making sorting optional. Off the top of my head, I cannot think of a use case where I care about or depend on the order of the data.
For assert-frame-equal, we would then also need to introduce a keyword argument that switches whether order is considered when checking for equality.
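As a rough sketch of such a switch, pandas' own `assert_frame_equal` already has a `check_like` argument that ignores the order of index and columns. The `frames_equal` wrapper and its `check_order` keyword below are hypothetical names for illustration, not an existing pyam API.

```python
import pandas as pd
from pandas.testing import assert_frame_equal

# Two frames with identical content but different row order.
left = pd.DataFrame({"value": [1.0, 2.0]}, index=["scen_a", "scen_b"])
right = left.loc[["scen_b", "scen_a"]]


def frames_equal(a, b, check_order=True):
    """Hypothetical helper: compare frames, optionally ignoring row/column order."""
    try:
        # check_like=True makes pandas ignore the order of index and columns.
        assert_frame_equal(a, b, check_like=not check_order)
    except AssertionError:
        return False
    return True


print(frames_equal(left, right))                     # order matters
print(frames_equal(left, right, check_order=False))  # order ignored
```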

@danielhuppmann
Member Author

Reminder: not sorting the time column may cause confusion when working with the wide timeseries format (e.g., when writing to xlsx)
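A small pandas-only sketch of the concern, with made-up values: if the years end up as columns in insertion order, the wide table reads out of sequence unless the columns are sorted explicitly before export.

```python
import pandas as pd

# Hypothetical wide-format timeseries whose year columns are not in order.
wide = pd.DataFrame(
    {2030: [3.0], 2010: [1.0], 2020: [2.0]},
    index=pd.Index(["scen_a"], name="scenario"),
)

# Sort only the columns (axis=1) so the years read chronologically.
wide_sorted = wide.sort_index(axis=1)
print(list(wide.columns))
print(list(wide_sorted.columns))
```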
