Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Enhance plot_parallel_coordinate() by eliminating redundant for-loops #5409

Open
c-bata opened this issue Apr 22, 2024 · 0 comments
Open

Enhance plot_parallel_coordinate() by eliminating redundant for-loops #5409

c-bata opened this issue Apr 22, 2024 · 0 comments
Labels
code-fix Change that does not change the behavior, such as code refactoring. contribution-welcome Issue that welcomes contribution.

Comments

@c-bata
Copy link
Member

c-bata commented Apr 22, 2024

Motivation

In parallel coordinate plot, _is_log_scale(), _is_categorical(), and _is_numerical() functions are invoked.

if _is_log_scale(trials, p_name):
values = [math.log10(v) for v in values]
min_value = min(values)
max_value = max(values)
tickvals: list[int | float] = list(
range(math.ceil(min_value), math.floor(max_value) + 1)
)
if min_value not in tickvals:
tickvals = [min_value] + tickvals
if max_value not in tickvals:
tickvals = tickvals + [max_value]
dim = _DimensionInfo(
label=_truncate_label(p_name),
values=tuple(values),
range=(min_value, max_value),
is_log=True,
is_cat=False,
tickvals=tickvals,
ticktext=["{:.3g}".format(math.pow(10, x)) for x in tickvals],
)
elif _is_categorical(trials, p_name):
vocab: defaultdict[int | str, int] = defaultdict(lambda: len(vocab))
ticktext: list[str]
if _is_numerical(trials, p_name):

Each one of these functions iterates over the trials, which is clearly redundant.

def _is_log_scale(trials: list[FrozenTrial], param: str) -> bool:
for trial in trials:
if param in trial.params:
dist = trial.distributions[param]
if isinstance(dist, (FloatDistribution, IntDistribution)):
if dist.log:
return True
return False
def _is_categorical(trials: list[FrozenTrial], param: str) -> bool:
return any(
isinstance(t.distributions[param], CategoricalDistribution)
for t in trials
if param in t.params
)
def _is_numerical(trials: list[FrozenTrial], param: str) -> bool:
return all(
(isinstance(t.params[param], int) or isinstance(t.params[param], float))
and not isinstance(t.params[param], bool)
for t in trials
if param in t.params
)

Suggestion

Making plot_parallel_coorinate() faster by removing wasted for-loop.

Additional context (optional)

No response

@c-bata c-bata added contribution-welcome Issue that welcomes contribution. code-fix Change that does not change the behavior, such as code refactoring. labels Apr 22, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
code-fix Change that does not change the behavior, such as code refactoring. contribution-welcome Issue that welcomes contribution.
Projects
None yet
Development

No branches or pull requests

1 participant