Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Changes proposed in #885. Don't register handlers by default. #889

Merged
merged 11 commits into from Jan 13, 2020

Conversation

PGijsbers
Copy link
Collaborator

@PGijsbers PGijsbers commented Nov 26, 2019

Fixes #885.

First implementation of changes suggested in #885:

  • don't register log handlers by default
  • don't set openml logger log level by default

A user can now specify the console and file log levels through set_console_log_level and set_file_log_level respectively. As input they expect the log level (both OpenML's 0/1/2 and Python log levels are accepted), and a bool save_as_default. Actually saving to the config file is right now not supported (and will raise a NotImplementedError). Calling these functions registers the respective handler if it wasn't yet, set/change it to the appropriate value, and ensures the openml logger is set to the right log level.

Consider this code:

import logging

if __name__ == '__main__':
    log = logging.getLogger('mypackage')
    log.setLevel(logging.DEBUG)

    log.info("Message one")
    # 1: just import OpenML
    import openml
    # 2: Specify console log
    # openml.config.set_console_log_level(logging.DEBUG)
    # 3: Specify file log
    # openml.config.set_file_log_level(logging.DEBUG)
    log.info("Message two")

    from openml.flows import get_flow
    f = get_flow(123)
    f.description = ''
    f._to_dict()  # logging.warning will be called due to empty description.

    from openml.datasets import get_dataset
    iris = get_dataset(61)  # since I have it in cache, a logging.debug statement is called.

The output we get, uncommenting respective lines for scenario '2' and '3':

1: Flow 'weka.LinearRegression's empty description to console only. This is default behavior for warning level messages. The debug message from the last line is ignored.
2:

[WARNING] [11:51:24:openml.flows.flow] Flow 'weka.LinearRegression's empty description
[DEBUG] [11:51:24:openml.datasets.dataset] Data pickle file already exists and is up to date.

to console only.

3 (the line specifying console output was commented out again):

[WARNING] [11:53:46:openml.flows.flow] Flow 'weka.LinearRegression's empty description
[DEBUG] [11:53:46:openml.datasets.dataset] Data pickle file already exists and is up to date.

written to file, no console output. Note that the warning message is now also not printed to console.


This does not include changes to where the config/cache directories are located.

Writing to the config file is not yet supported. I noticed we hack our way around the configparser module. If we want to support writing to the log file from the python module (which I'm for), we need to look at other options or implement our own (our log file definition is very simple). In fact,
I would further propose refactoring the openml.config module into smaller submodules as a lot of different things are going on within the same module now (loading, logging, test configuration).

I'm willing to do the above refactoring, but it might take a few weeks (I'm very busy until the holidays).

Copy link
Collaborator

@mfeurer mfeurer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me, but I think @mitar should have a look at this, too. Also, I don't understand what the code for storing something in the config is supposed to do?

@PGijsbers
Copy link
Collaborator Author

Also, I don't understand what the code for storing something in the config is supposed to do?

Nothing right now (it's not implemented). But the idea is that a user can adjust their log settings (or perhaps all settings, e.g. apikey) programmatically, and persist those settings in the configuration file. This way a user can make permanent configuration changes without having to manually find/edit the configuration file.

@mfeurer
Copy link
Collaborator

mfeurer commented Nov 28, 2019

This way a user can make permanent configuration changes without having to manually find/edit the configuration file.

Hm, but wouldn't a user put the configuration of logging in the startup of an application, and want it to be different for different applications using OpenML? Basically, I don't think this is something we should handle in OpenML.

@PGijsbers
Copy link
Collaborator Author

Then what is the point of including verbosity options in the configuration file?

@mfeurer
Copy link
Collaborator

mfeurer commented Nov 28, 2019

Then what is the point of including verbosity options in the configuration file?

Good point. We decided upon that once upon a time, but I'm not convinced that that was a good idea.

@mitar
Copy link
Member

mitar commented Dec 2, 2019

I think config file could control only when running inside a interactive interpreter, or Jupyter, or something like that. Something like this could work?

But yes, I am also not convinced that having that in a configuration is reasonable. Because even if you are using interactive interpreter, it might be that openml library is used as a 3rd party to some other library and you are not really expecting to get messages from it.

openml/config.py Show resolved Hide resolved
openml/config.py Outdated Show resolved Hide resolved
file_output_level: Optional[int] = None,
save_as_default: bool = False
):
""" Set file output to the desired level and register it with openml logger if needed. """
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think "register it with openml logger" is implementation detail. Maybe just saying that "configured logging to a file (if not already) and sets the desired level".

openml/config.py Show resolved Hide resolved
@PGijsbers
Copy link
Collaborator Author

Based on the feedback I would consider finishing the PR with no longer supporting the log level properties of the openml configuration file. This means that each time the user wants to configure openml logging, it either does so through conventional Python usage or by calling one of the helper functions in config.

@mfeurer
Copy link
Collaborator

mfeurer commented Dec 4, 2019

Based on the feedback I would consider finishing the PR with no longer supporting the log level properties of the openml configuration file.

Fine by me, maybe we should drop this from the OpenML cache format. @janvanrijn @giuseppec are you using the fact that there's a verbosity flag in the OpenML config file in the Java and R toolkit.

@codecov-io
Copy link

codecov-io commented Dec 18, 2019

Codecov Report

Merging #889 into develop will decrease coverage by 0.33%.
The diff coverage is 51.61%.

Impacted file tree graph

@@             Coverage Diff             @@
##           develop     #889      +/-   ##
===========================================
- Coverage     88.5%   88.17%   -0.34%     
===========================================
  Files           37       37              
  Lines         4324     4337      +13     
===========================================
- Hits          3827     3824       -3     
- Misses         497      513      +16
Impacted Files Coverage Δ
openml/config.py 78.99% <51.61%> (-11.58%) ⬇️
openml/runs/functions.py 82.83% <0%> (-0.28%) ⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update d5e46fe...ebc0a5f. Read the comment docs.

@PGijsbers
Copy link
Collaborator Author

@mfeurer do you know what is going on with these errors? It seems these errors stem from the develop branch.

@PGijsbers
Copy link
Collaborator Author

I'm thinking it is due to an expired ssl certificate (https://test.openml.org/ also gives off a warning). I'll ask Joaquin to update the certificate.

@PGijsbers
Copy link
Collaborator Author

There seems to be an issue with the scikit-learn downloaded in the Appveyor CI (though it reports it tried to install scikit-learn 0.20).

 @unittest.skipIf(LooseVersion(sklearn.__version__) < "0.20",
                     reason="columntransformer introduction in 0.20.0")
    def test_run_and_upload_column_transformer_pipeline(self):
        import sklearn.compose
>       import sklearn.impute
tests\test_runs\test_run_functions.py:535: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
c:\miniconda35-x64\lib\site-packages\sklearn\impute\__init__.py:4: in <module>
    from ._knn import KNNImputer
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
    import numpy as np
    
    from ._base import _BaseImputer
    from ..utils.validation import FLOAT_DTYPES
    from ..metrics import pairwise_distances
>   from ..metrics.pairwise import _NAN_METRICS
E   ImportError: cannot import name '_NAN_METRICS'
c:\miniconda35-x64\lib\site-packages\sklearn\impute\_knn.py:10: ImportError

@PGijsbers PGijsbers marked this pull request as ready for review January 8, 2020 14:10
@mfeurer
Copy link
Collaborator

mfeurer commented Jan 8, 2020

There seems to be an issue with the scikit-learn downloaded in the Appveyor CI (though it reports it tried to install scikit-learn 0.20).

Hm, would increasing the version help here?

@mitar
Copy link
Member

mitar commented Jan 8, 2020

Use Let's Encrypt and stop worrying about SSL. :-)

@PGijsbers
Copy link
Collaborator Author

For Appveyor, scikit-learn now installed through pip instead of conda, and that seems to avoid the issues. Also bumped the scikit-learn version installed with Appveyor to 0.21 (no particular reason as that is still left over from trying to see if changing the version avoided the issue). If you want me to revert it to 0.20, or bump it to 0.22 instead I can do that too.

Copy link
Collaborator

@mfeurer mfeurer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm fine with this. @mitar do you have any further comments?

Copy link
Member

@mitar mitar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me! Thanks!

@PGijsbers PGijsbers merged commit b37b261 into develop Jan 13, 2020
@PGijsbers PGijsbers deleted the Fix885 branch January 13, 2020 10:36
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Package should not configure logging without being asked to do so
4 participants