[UX] Use of AutomaticParameterWarning and `"auto"` parameters #245

eddiebergman · 2024-01-30T20:27:37Z

As things have developed, there's been a few instances where we can set some default in an automatic fashion, however it may not be clear to the user that we are making a hueristic choice, and it would be better if they specified this.

While it was not declared anywhere, I think it makes sense to make some rule set that essentially:

If you are making some "auto" choice for a user, raise a warning indicating what was done and how to silence it.
These should inherit from the AutomaticParameterWarning introduced in feat(sklearn): Provide a standard CVEvaluator #244.

# amltk.exceptions
class AutomaticParameterWarning(UserWarning): ...

class AutomaticBlubWarning(AutomaticParameterWarning):
    """We can deduce that we often want to 'blub' but this can backfire silently when x, y, z."""
    
    
# amltk.blub
from amltk.exceptions import AutomaticBlubWarning

def function_that_blubs(is_blub: bool | Literal["auto"] = "auto") -> BlubBatch:
	if is_blub == "auto":
        warnings.warn(
        	"blub blub blub. To silence this warning, please specify"
        	" function_that_blubs(is_blub=...)",
            AutomaticBlubWarning,
            stacklevel=2,
        )

This should remain an issue until:

The UserWarning and automatic parameters in AMLTK have been unified
- UserWarning derived from the use of "auto" should be replaced be a unique warning type.
Functions relying on the f(x: T | None = None) that make heuristic choices should be converted to use f(x: T | Literal["auto"] = "auto"). It's important to distinguish the absence of a value with the activation of an automatic heuristic. It might be good to give examples here:

# Here the `None` indicates the absence of a value and some default directory/name is fine.
def f(working_dir: Path | None = None):
	"""Uses timestamp if left as None"""
	...

# Here the `None` is indicating the absence of a choice.
# This applies a heuristic to make a choice and should be indicated as such
def run(display: bool | None = None):
	"""Display the progress. By default, this is true in a notebook, otherwise not."""
	...

# Prefer instead:
def run(display: bool | Literal["auto"] = "auto"): ...

The text was updated successfully, but these errors were encountered:

eddiebergman · 2024-01-30T20:28:03Z

@LennartPurucker Please let me know what you think of this when you have time. It's come up a few times now

LennartPurucker · 2024-01-31T08:58:23Z

I am actually not sure about the first case either:

# Here the `None` indicates the absence of a value and some default directory/name is fine.
def f(working_dir: Path | None = None):
	"""Uses timestamp if left as None"""
	...

If this is a private function, None is good. But if this is a public function, then I would also set this to "auto" or the option "timestamp". Right now, from the function definition, while using the function in an editor, I could not tell what happens by default without reading the docstring.

Sidenote: IMO, the typical behavior of defaulting to timestamp from a None input is very unintuitive sometimes and actually not fully thread/multi-processing safe! Thus, I would rather know about this as a user. If the function prints the paths anyway, an additional warning might not be needed.

LennartPurucker · 2024-01-31T08:59:15Z

Otherwise, this is a very good thing. Also, I am not yet fully sure about the code overhead of custom warnings. I guess custom warnings are better code quality and allow for catching/ignoring them, right?

eddiebergman · 2024-01-31T12:10:49Z

I am actually not sure about the first case either:
# Here the `None` indicates the absence of a value and some default directory/name is fine.
def f(working_dir: Path | None = None):
	"""Uses timestamp if left as None"""
	...
If this is a private function, None is good. But if this is a public function, then I would also set this to "auto" or the option "timestamp". Right now, from the function definition, while using the function in an editor, I could not tell what happens by default without reading the docstring.

Sidenote: IMO, the typical behavior of defaulting to timestamp from a None input is very unintuitive sometimes and actually not fully thread/multi-processing safe! Thus, I would rather know about this as a user. If the function prints the paths anyway, an additional warning might not be needed.

I agree in principle but as a matter of ergonomics, I would still call this None. I can't articulate why (which is why there is this issue, trying to figure it out) but it's something along the lines of a default parameter can be provided which works well in most cases, which otherwise would require a user to explicitly provide one, or the function would not work.

In contrast to the other example, we are using a hueristic to make a choice amongst a discrete set, which could otherwise cause a lot of hidden behavior to occur. Another example being the threadpoolctl heuristic.

The question is where to draw a line. An optional parameter where the default parameter value is None is different than an auto parameter.

I guess I would say that an optional with None as the default will be set to some default which is independant of the system you are on and the other parameters, i.e. it will always be a timestamped path. Where as an auto will apply some hueristic depending on the other parameters or the system in which the program is being run.

Objections, comments, feedback?

eddiebergman · 2024-01-31T12:15:26Z

Otherwise, this is a very good thing. Also, I am not yet fully sure about the code overhead of custom warnings. I guess custom warnings are better code quality and allow for catching/ignoring them, right?

I would consider it low overhead on the development side and yes, it allows for distuinguishing them more easily. Anything that wants to interact with that warning can do so by type rather than by content. This allows us to for example, change the wording in the warning without breaking any system that relies on the type for functionality.

This also applies to errors:

# Needed to make sure a certain exception occurs
# Will break if someone decides to make it a proper sentence and capitalize "Blub"
with pytest.raises(ValueError, match="blub should be provided"):
    ...

# Only breaks if the underlying exception type changes
with pytest.raises(BlueError):
    ...

You can see this in effect in one of the doc hooks, which disables warnings from the running examples in the doc strings:

amltk/docs/hooks/cleanup_log_output.py

Lines 21 to 31 in 0af8f7f

    
           @mkdocs.plugins.event_priority(-50) 
        
           def on_startup(**kwargs: Any): 
        
               # We get a load of deprecation warnings from SMAC 
        
               warnings.filterwarnings("ignore", category=DeprecationWarning) 
        
               # We ignore AutoWarnings as our example tend to rely on 
        
               # a lot of the `"auto"` parameters 
        
               warnings.filterwarnings("ignore", category=AutomaticParameterWarning) 
        
               # ConvergenceWarning from sklearn 
        
               warnings.filterwarnings("ignore", module="sklearn")

LennartPurucker · 2024-01-31T12:56:37Z

From our in-person discussion:

Cases:

Optional (e.g., sample weights)
Defaults (e.g., working dir)
Autos (e.g., visualization)

eddiebergman · 2024-02-02T12:44:28Z

So warnings can be done at Runtime and I don't want to automate this in anyway with concrete types... too much overhead. However I would assume most people would have access to some sort of LSP that will give them the type definitions. We can inject this information there at the very least.

But I thought at least from type definitions we can be a bit more explicit, for example:

It should have a near-zero runtime impact and is mainly for type checking, you can see a few variants here:

from __future__ import annotations

from pathlib import Path
from typing import TypeAlias, TypeVar

T = TypeVar("T")

# Basically says these types are either given a type `T` or are `None`
# The reason for `| None` is so that setting the default value to `None` works correctly
# in signatures
Auto: TypeAlias = type[T] | None
Default: TypeAlias = type[T] | None



# Usage 1
def f1(display: Auto[bool] = None) -> None:
    if display is None:
        # Make some automatic choice
        ...

# Or this? _Auto would be a singleton defined once
_Auto = None 

def f2(display: Auto[bool] = _Auto) -> None:
    if display is _Auto:
        # Make some automatic choice
        ...

# Same works for default
def h(path: Default[Path] = None) -> None:
    if path is None:
        # Create some default path
        ...

I'll be honest, I feel all of this will just make things more complicated and confusing. I would rather in the end just use the Path | None = None idiom.

Maybe could do the following to keep it as simple as possible

Auto = None
def f(p: Path | None = Auto):
	pass

The LSP still gives enough info and it doesn't deref to be None. (At least for pyright which is the core of vscode's pylance, so they should have similar type information displayed)

eddiebergman added decision A decision needs to be made ux Something to make user experience better labels Jan 30, 2024

eddiebergman self-assigned this Jan 30, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[UX] Use of AutomaticParameterWarning and `"auto"` parameters #245

[UX] Use of AutomaticParameterWarning and `"auto"` parameters #245

eddiebergman commented Jan 30, 2024 •

edited

eddiebergman commented Jan 30, 2024

LennartPurucker commented Jan 31, 2024

LennartPurucker commented Jan 31, 2024 •

edited

eddiebergman commented Jan 31, 2024

eddiebergman commented Jan 31, 2024

LennartPurucker commented Jan 31, 2024

eddiebergman commented Feb 2, 2024 •

edited

[UX] Use of AutomaticParameterWarning and "auto" parameters #245

[UX] Use of AutomaticParameterWarning and "auto" parameters #245

Comments

eddiebergman commented Jan 30, 2024 • edited

eddiebergman commented Jan 30, 2024

LennartPurucker commented Jan 31, 2024

LennartPurucker commented Jan 31, 2024 • edited

eddiebergman commented Jan 31, 2024

eddiebergman commented Jan 31, 2024

LennartPurucker commented Jan 31, 2024

eddiebergman commented Feb 2, 2024 • edited

[UX] Use of AutomaticParameterWarning and `"auto"` parameters #245

[UX] Use of AutomaticParameterWarning and `"auto"` parameters #245

eddiebergman commented Jan 30, 2024 •

edited

LennartPurucker commented Jan 31, 2024 •

edited

eddiebergman commented Feb 2, 2024 •

edited