implementation for fast categorize #819

Closed
wants to merge 5 commits into from
3 changes: 3 additions & 0 deletions RELEASE_NOTES.md
@@ -6,6 +6,9 @@ Bumped minimum version of pandas and numpy to fit **ixmp4**'s requirement.

 ## Individual updates

+- [#819](https://github.com/IAMconsortium/pyam/pull/819) Speed up `categorize()`
+  in line with `validate()` improvements in
+  [#804](https://github.com/IAMconsortium/pyam/pull/804)
 - [#832](https://github.com/IAMconsortium/pyam/pull/832) Improve the test-suite for the ixmp4 integration
 - [#827](https://github.com/IAMconsortium/pyam/pull/827) Migrate to poetry for project management
 - [#830](https://github.com/IAMconsortium/pyam/pull/830) Implement more consistent logging behavior with **ixmp4**
31 changes: 22 additions & 9 deletions pyam/core.py
@@ -33,9 +33,7 @@
     _group_and_agg,
 )
 from pyam.compute import IamComputeAccessor
-from pyam.filter import (
-    datetime_match,
-)
+from pyam.filter import datetime_match
 from pyam.index import (
     append_index_col,
     get_index_levels,
@@ -68,7 +66,7 @@
     to_list,
     write_sheet,
 )
-from pyam.validation import _apply_criteria, _exclude_on_fail, _validate
+from pyam.validation import _exclude_on_fail, _validate

 logger = logging.getLogger(__name__)

@@ -919,7 +917,15 @@
         self.set_meta(meta, name)

     def categorize(
-        self, name, value, criteria, color=None, marker=None, linestyle=None
+        self,
+        name,
+        value,
+        criteria: dict = None,
+        *,
+        color=None,
+        marker=None,
+        linestyle=None,
+        **kwargs,
     ):
         """Assign scenarios to a category according to specific criteria

@@ -940,18 +946,25 @@
             assign a linestyle to this category for plotting
         """
         # add plotting run control
+
         for kind, arg in [
             ("color", color),
             ("marker", marker),
             ("linestyle", linestyle),
         ]:
             if arg:
                 run_control().update({kind: {name: {value: arg}}})
-        # find all data that matches categorization
-        rows = _apply_criteria(self._data, criteria, in_range=True, return_test="all")
-        idx = make_index(rows, cols=self.index.names)

-        if len(idx) == 0:
+        # find all data that matches categorization
+        # TODO: if validate returned an empty index, this check would be easier
+        not_valid = self.validate(criteria=criteria, **kwargs)

Review comment (PR author, project member): @danielhuppmann, this is where the `validate()` keyword arguments such as `upper_bound` and the other filtering arguments are passed in.

+        if not_valid is None:
+            idx = self.index

Codecov (codecov/patch) check warning on pyam/core.py#L962: added line #L962 was not covered by tests.

+        elif len(not_valid) < len(self.index):
+            idx = self.index.difference(
+                not_valid.set_index(["model", "scenario"]).index.unique()
+            )
+        else:
             logger.info("No scenarios satisfy the criteria")
             return
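
Note on the selection logic in this hunk: the new code categorizes every scenario that does not appear in the result of `validate()`. The following is a minimal sketch of that idea in plain Python, without pandas or pyam. The function name `select_categorized` and the tuple-based inputs are hypothetical and only stand in for `self.index` and the `not_valid` dataframe. It also assumes that the comparison is made on unique failing (model, scenario) pairs, which differs from the row-count comparison in the diff.

```python
import logging

logger = logging.getLogger(__name__)


def select_categorized(index, not_valid):
    """Return the (model, scenario) pairs that satisfy the criteria.

    Hypothetical stand-in for the branch logic in the diff:
    - `index` is a list of (model, scenario) tuples (stands in for `self.index`)
    - `not_valid` is None if nothing failed, otherwise a list of
      (model, scenario) tuples, one per failing data row (repeats allowed)
    """
    if not_valid is None:
        # nothing failed validation, so every scenario is categorized
        return sorted(index)

    # assumption: compare unique failing pairs rather than raw row counts
    failing = set(not_valid)
    remaining = sorted(set(index) - failing)

    if not remaining:
        logger.info("No scenarios satisfy the criteria")
        return None
    return remaining


scenarios = [("model_a", "scen_1"), ("model_a", "scen_2"), ("model_b", "scen_1")]
failing_rows = [("model_a", "scen_2"), ("model_a", "scen_2")]  # two failing rows, one scenario
categorized = select_categorized(scenarios, failing_rows)
```

In this sketch, `categorized` holds the two scenarios that have no failing rows. Returning `None` when nothing remains mirrors the early `return` in the diff.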
