Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Ultranest sets up logging like a standalone application, not a library #31

Open
jpl-jengelke opened this issue Jun 22, 2021 · 10 comments
Open

Comments

@jpl-jengelke
Copy link

  • UltraNest version: 3.2.1
  • Python version: 3.9.5
  • Operating System: Mac OS Catalina

Thank you for Ultranest as it is an incredible nested sampling library -- very fast and efficient. In the interest of improving the application for scientific users, this ticket is submitted to simplify logging configurations.

Description

We are running ultranest.ReactiveNestedSampler(...).run(..., show_status=self.verbose, viz_callback=self.verbose). We set show_statusandviz_callback` to False. We expected logging of every pointrun to stop. It didn't.

This is a request to simplify logging configuration to further the intent of science API users.

What I Did

I traced through Ultranest to witness how logging is set. In method create_logger(...) it is actually instantiating an entire logging setup. This is unusual for a Python application library. Usually, they are simply setting a placeholder per class (or file) to build off the importing application's log configuration by declaring, logger = logging.getLogger(__name__). However, Ultranest is requiring users to go through a complex procedural setup that includes adding a custom log handler just to override internals. To wit, something like this is required simply to change log levels (an additional log configuration to that for the importing application):

ultranest_logger = logging.getLogger("ultranest")
ultranest_handler = logging.StreamHandler(sys.stdout)
ultranest_handler.setLevel(logging.INFO)
ultranest_formatter = logging.Formatter("%(levelname)s:{}:%(message)s".format("ultranest"))
ultranest_handler.setFormatter(ultranest_formatter)
ultranest_logger.addHandler(ultranest_handler)
ultranest_logger.setLevel(logging.INFO)

(The current documented instructions have an error. NOTE the last line and the fourth line which differ from the Q&A. The last line is required to actually override the entire logger's level as setting the handler level directly did not have an effect. Also, the Formatter(...) clause in line four references module_name which is effectively an undefined variable and should be simply "ultranest".)

Many other API libraries simply require a single line bolted on to the importing application's root logging:

logging.getLogger("matplotlib.font_manager").setLevel("WARN")

Paste the command(s) you ran and the output.
If there was a crash, please include the traceback here.

When the importing application itself is misconfigured logging consists of a separate line added into logs for each and every iteration of the sampler. This results in extended and long log files with tens or hundreds of thousands (or more) extra lines of text. Such log files are very large and difficult to review for errors.

Please simplify the logging or at least make the show_status and viz_callback parameters to the run(...) class of methods suppress the extended logging as appropriately set.

@jpl-jengelke
Copy link
Author

jpl-jengelke commented Jun 22, 2021

Also, I should note that although the above setup is required, under certain circumstances there will be double logging. That is because the log statement can bubble up to the root while the (required) logging handler still logs. Of course, the level can be changed for the handler to logger.CRITICAL + 1. Another solution (YMMV):

ultranest_logger = logging.getLogger("ultranest")
ultranest_logger.addHandler(logging.NullHandler())
ultranest_logger.setLevel(logging.INFO)

@JohannesBuchner
Copy link
Owner

show_status and viz_callback control program feedback, which is different to logging.

Users may run ultranest in the following ways:

  • in a python script
  • in a jupyter notebook (which has its own weird root logger behaviour)
  • integrated within a larger program

There should be a way to have a good automatic behaviour for the first two cases, so that the user does not need to set up anything. When log_dir is set, it is nice to have a full debug.log file, especially if the run grinds to a halt.
In the third case, the application can adjust the logging behaviour and catch warnings.

What solution would you suggest?

@JohannesBuchner
Copy link
Owner

JohannesBuchner commented Jun 22, 2021

That is because the log statement can bubble up to the root while the (required) logging handler still logs

How does this happen and how can it be prevented?

@jpl-jengelke
Copy link
Author

That is because the log statement can bubble up to the root while the (required) logging handler still logs

How does this happen and how can it be prevented?

It might be an artifact of setting up logging using logging.basicConfig(...) and relying on STDOUT. Unfortunately I have not really deep dived into this.

Wouldn't the "in a python script" and "integrated within a larger program" scenarios mean Ultranest is just being imported as a dependency? Does it ever run standalone? I will check with my team on the Jupyter notebooks.

@JohannesBuchner
Copy link
Owner

JohannesBuchner commented Jun 22, 2021

Yes it is imported the same way. A casual user will likely want to see the warnings in stdout, and the log output in log_dir/debug.log (at least, it is useful for me when they send me that when issues arise), without setting up a log handler in every script. When packaged inside a larger application, the application may know which information is important to pass to the user, and which is noise, therefore may want to filter/direct the logging. At least that was my thinking when I made these design choices ...

@jpl-jengelke
Copy link
Author

Maybe there is a way to feed in a boolean to turn off log instantiation during class instantiation. In other words, log creation takes place through a method call in class __init__. Otherwise, all files can just declare a log handler. It's on by default unless a user sets a variable, native_logging, to False. Something like that?

@JohannesBuchner
Copy link
Owner

There is another issue with log_dir. I think if ReactiveNestedSampler is run on one directory, and then another ReactiveNestedSampler on another object, I think it resumes logging into the first directory?

@JohannesBuchner
Copy link
Owner

I finally found, read, and understood this page https://docs.python.org/3/howto/logging.html#configuring-logging-for-a-library

The latest code in master adds a NullHandler as well: https://github.com/JohannesBuchner/UltraNest/blob/master/ultranest/utils.py#L55
But I am not sure this is enough?

I appreciate the report and am willing to make improvements on this.

I corrected the docs following your comments.

I am confused however why this
logging.getLogger("ultranest").setLevel("WARN")
is not enough. I don't quite understand the interplay between the logger level and the handler level. I would have thought one filters the other, and then we are set.

@jpl-jengelke
Copy link
Author

jpl-jengelke commented Sep 9, 2022

I am confused however why this logging.getLogger("ultranest").setLevel("WARN") is not enough. I don't quite understand the interplay between the logger level and the handler level. I would have thought one filters the other, and then we are set.

Right, it should. I have to review my first comments and the code again before providing a response. It turns out that configuring logging.basicConfig(level=...) comes with its own side effects. (See comment on this response.) Also, in that case, every logger in the system is set to the level specified by basicConfig(...). That's OK in a sense but certain modules are extremely verbose so we set some of those to higher levels.

@jpl-jengelke
Copy link
Author

jpl-jengelke commented Aug 5, 2023

I think this StackOverflow response concisely sums up the mechanics of granular but non-configured logging:
https://stackoverflow.com/a/50751987/325452

I've also recently stumbled upon a somewhat similar logging controversy at Matplotlib: matplotlib/matplotlib#13264

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants