Changes proposed in #885. Don't register handlers by default. #889

PGijsbers · 2019-11-26T11:02:51Z

Fixes #885.

First implementation of changes suggested in #885:

don't register log handlers by default
don't set openml logger log level by default

A user can now specify the console and file log levels through set_console_log_level and set_file_log_level respectively. As input they expect the log level (both OpenML's 0/1/2 and Python log levels are accepted), and a bool save_as_default. Actually saving to the config file is right now not supported (and will raise a NotImplementedError). Calling these functions registers the respective handler if it wasn't yet, set/change it to the appropriate value, and ensures the openml logger is set to the right log level.

Consider this code:

import logging

if __name__ == '__main__':
    log = logging.getLogger('mypackage')
    log.setLevel(logging.DEBUG)

    log.info("Message one")
    # 1: just import OpenML
    import openml
    # 2: Specify console log
    # openml.config.set_console_log_level(logging.DEBUG)
    # 3: Specify file log
    # openml.config.set_file_log_level(logging.DEBUG)
    log.info("Message two")

    from openml.flows import get_flow
    f = get_flow(123)
    f.description = ''
    f._to_dict()  # logging.warning will be called due to empty description.

    from openml.datasets import get_dataset
    iris = get_dataset(61)  # since I have it in cache, a logging.debug statement is called.

The output we get, uncommenting respective lines for scenario '2' and '3':

1: Flow 'weka.LinearRegression's empty description to console only. This is default behavior for warning level messages. The debug message from the last line is ignored.
2:

[WARNING] [11:51:24:openml.flows.flow] Flow 'weka.LinearRegression's empty description
[DEBUG] [11:51:24:openml.datasets.dataset] Data pickle file already exists and is up to date.

to console only.

3 (the line specifying console output was commented out again):

[WARNING] [11:53:46:openml.flows.flow] Flow 'weka.LinearRegression's empty description
[DEBUG] [11:53:46:openml.datasets.dataset] Data pickle file already exists and is up to date.

written to file, no console output. Note that the warning message is now also not printed to console.

This does not include changes to where the config/cache directories are located.

Writing to the config file is not yet supported. I noticed we hack our way around the configparser module. If we want to support writing to the log file from the python module (which I'm for), we need to look at other options or implement our own (our log file definition is very simple). In fact,
I would further propose refactoring the openml.config module into smaller submodules as a lot of different things are going on within the same module now (loading, logging, test configuration).

I'm willing to do the above refactoring, but it might take a few weeks (I'm very busy until the holidays).

mfeurer

Looks good to me, but I think @mitar should have a look at this, too. Also, I don't understand what the code for storing something in the config is supposed to do?

PGijsbers · 2019-11-28T09:22:20Z

Also, I don't understand what the code for storing something in the config is supposed to do?

Nothing right now (it's not implemented). But the idea is that a user can adjust their log settings (or perhaps all settings, e.g. apikey) programmatically, and persist those settings in the configuration file. This way a user can make permanent configuration changes without having to manually find/edit the configuration file.

mfeurer · 2019-11-28T14:27:37Z

This way a user can make permanent configuration changes without having to manually find/edit the configuration file.

Hm, but wouldn't a user put the configuration of logging in the startup of an application, and want it to be different for different applications using OpenML? Basically, I don't think this is something we should handle in OpenML.

PGijsbers · 2019-11-28T14:36:27Z

Then what is the point of including verbosity options in the configuration file?

mfeurer · 2019-11-28T16:53:32Z

Then what is the point of including verbosity options in the configuration file?

Good point. We decided upon that once upon a time, but I'm not convinced that that was a good idea.

mitar · 2019-12-02T02:06:51Z

I think config file could control only when running inside a interactive interpreter, or Jupyter, or something like that. Something like this could work?

But yes, I am also not convinced that having that in a configuration is reasonable. Because even if you are using interactive interpreter, it might be that openml library is used as a 3rd party to some other library and you are not really expecting to get messages from it.

openml/config.py

mitar · 2019-12-02T02:12:12Z

openml/config.py

+        file_output_level: Optional[int] = None,
+        save_as_default: bool = False
+):
+    """ Set file output to the desired level and register it with openml logger if needed. """


I think "register it with openml logger" is implementation detail. Maybe just saying that "configured logging to a file (if not already) and sets the desired level".

openml/config.py

PGijsbers · 2019-12-03T09:02:17Z

Based on the feedback I would consider finishing the PR with no longer supporting the log level properties of the openml configuration file. This means that each time the user wants to configure openml logging, it either does so through conventional Python usage or by calling one of the helper functions in config.

mfeurer · 2019-12-04T09:23:43Z

Based on the feedback I would consider finishing the PR with no longer supporting the log level properties of the openml configuration file.

Fine by me, maybe we should drop this from the OpenML cache format. @janvanrijn @giuseppec are you using the fact that there's a verbosity flag in the OpenML config file in the Java and R toolkit.

codecov-io · 2019-12-18T14:33:37Z

Codecov Report

Merging #889 into develop will decrease coverage by 0.33%.
The diff coverage is 51.61%.

@@             Coverage Diff             @@
##           develop     #889      +/-   ##
===========================================
- Coverage     88.5%   88.17%   -0.34%     
===========================================
  Files           37       37              
  Lines         4324     4337      +13     
===========================================
- Hits          3827     3824       -3     
- Misses         497      513      +16

Impacted Files	Coverage Δ
openml/config.py	`78.99% <51.61%> (-11.58%)`	⬇️
openml/runs/functions.py	`82.83% <0%> (-0.28%)`	⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update d5e46fe...ebc0a5f. Read the comment docs.

PGijsbers · 2020-01-08T09:20:35Z

@mfeurer do you know what is going on with these errors? It seems these errors stem from the develop branch.

PGijsbers · 2020-01-08T11:00:35Z

I'm thinking it is due to an expired ssl certificate (https://test.openml.org/ also gives off a warning). I'll ask Joaquin to update the certificate.

PGijsbers · 2020-01-08T14:09:42Z

There seems to be an issue with the scikit-learn downloaded in the Appveyor CI (though it reports it tried to install scikit-learn 0.20).

 @unittest.skipIf(LooseVersion(sklearn.__version__) < "0.20",
                     reason="columntransformer introduction in 0.20.0")
    def test_run_and_upload_column_transformer_pipeline(self):
        import sklearn.compose
>       import sklearn.impute
tests\test_runs\test_run_functions.py:535: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
c:\miniconda35-x64\lib\site-packages\sklearn\impute\__init__.py:4: in <module>
    from ._knn import KNNImputer
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
    import numpy as np
    
    from ._base import _BaseImputer
    from ..utils.validation import FLOAT_DTYPES
    from ..metrics import pairwise_distances
>   from ..metrics.pairwise import _NAN_METRICS
E   ImportError: cannot import name '_NAN_METRICS'
c:\miniconda35-x64\lib\site-packages\sklearn\impute\_knn.py:10: ImportError

mfeurer · 2020-01-08T15:04:33Z

There seems to be an issue with the scikit-learn downloaded in the Appveyor CI (though it reports it tried to install scikit-learn 0.20).

Hm, would increasing the version help here?

mitar · 2020-01-08T16:44:37Z

Use Let's Encrypt and stop worrying about SSL. :-)

PGijsbers · 2020-01-10T10:55:24Z

For Appveyor, scikit-learn now installed through pip instead of conda, and that seems to avoid the issues. Also bumped the scikit-learn version installed with Appveyor to 0.21 (no particular reason as that is still left over from trying to see if changing the version avoided the issue). If you want me to revert it to 0.20, or bump it to 0.22 instead I can do that too.

mfeurer

I'm fine with this. @mitar do you have any further comments?

mitar

Looks good to me! Thanks!

PGijsbers added 2 commits November 26, 2019 11:34

Changes proposed in #885. Don't register handlers by default.

81fa74d

Delay file creation until log emit. Correctly read from config.

3572841

PGijsbers requested review from mfeurer and mitar November 26, 2019 11:03

mfeurer reviewed Nov 27, 2019

View reviewed changes

mitar reviewed Dec 2, 2019

View reviewed changes

Remove loading/storing log level references.

de6a9e6

PGijsbers added 4 commits December 18, 2019 15:38

_create_log_handlers now returns early if called a second time

e3ade22

Fix type errors.

6a4ca29

Update changelog.

6424073

Test remove register file log handler to see if CI works.

4def357

Undo last change. test server ssl works agian.

69fbe50

PGijsbers marked this pull request as ready for review January 8, 2020 14:10

PGijsbers added 3 commits January 9, 2020 10:24

Bump scikit-learn version to 0.22

719edec

Scikit-learn 0.22 does not install properly.

6f2b844

Install scikit-learn through pip instead.

ebc0a5f

mfeurer approved these changes Jan 10, 2020

View reviewed changes

mitar approved these changes Jan 11, 2020

View reviewed changes

PGijsbers merged commit b37b261 into develop Jan 13, 2020

PGijsbers deleted the Fix885 branch January 13, 2020 10:36

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Changes proposed in #885. Don't register handlers by default. #889

Changes proposed in #885. Don't register handlers by default. #889

PGijsbers commented Nov 26, 2019 •

edited

mfeurer left a comment

PGijsbers commented Nov 28, 2019

mfeurer commented Nov 28, 2019

PGijsbers commented Nov 28, 2019

mfeurer commented Nov 28, 2019

mitar commented Dec 2, 2019

mitar Dec 2, 2019

PGijsbers commented Dec 3, 2019

mfeurer commented Dec 4, 2019

codecov-io commented Dec 18, 2019 •

edited

PGijsbers commented Jan 8, 2020

PGijsbers commented Jan 8, 2020

PGijsbers commented Jan 8, 2020

mfeurer commented Jan 8, 2020

mitar commented Jan 8, 2020

PGijsbers commented Jan 10, 2020

mfeurer left a comment

mitar left a comment

Changes proposed in #885. Don't register handlers by default. #889

Changes proposed in #885. Don't register handlers by default. #889

Conversation

PGijsbers commented Nov 26, 2019 • edited

mfeurer left a comment

Choose a reason for hiding this comment

PGijsbers commented Nov 28, 2019

mfeurer commented Nov 28, 2019

PGijsbers commented Nov 28, 2019

mfeurer commented Nov 28, 2019

mitar commented Dec 2, 2019

mitar Dec 2, 2019

Choose a reason for hiding this comment

PGijsbers commented Dec 3, 2019

mfeurer commented Dec 4, 2019

codecov-io commented Dec 18, 2019 • edited

Codecov Report

PGijsbers commented Jan 8, 2020

PGijsbers commented Jan 8, 2020

PGijsbers commented Jan 8, 2020

mfeurer commented Jan 8, 2020

mitar commented Jan 8, 2020

PGijsbers commented Jan 10, 2020

mfeurer left a comment

Choose a reason for hiding this comment

mitar left a comment

Choose a reason for hiding this comment

PGijsbers commented Nov 26, 2019 •

edited

codecov-io commented Dec 18, 2019 •

edited