Warning messages in unit test: "invalid value encountered in double_scalars" and others #311

Closed · dsherry opened this issue Jan 8, 2020 · 3 comments · Fixed by #640
Labels: bug (Issues tracking problems with existing features.)

dsherry (Contributor) commented Jan 8, 2020

Problem
Running locally on Python 3.8 (I've seen similar on other Python versions):

(featurelabs) ➜  evalml git:(master) pytest -v evalml/tests/automl_tests/test_autobase.py::test_pipeline_limits
====================================================================== test session starts ======================================================================
platform darwin -- Python 3.8.0, pytest-4.4.1, py-1.8.0, pluggy-0.13.1 -- /Users/dsherry/anaconda/envs/featurelabs/bin/python
cachedir: .pytest_cache
rootdir: /Users/dsherry/development/aie/featurelabs/evalml, inifile: setup.cfg
plugins: xdist-1.26.1, cov-2.6.1, nbval-0.9.3, forked-1.1.3
collected 1 item

evalml/tests/automl_tests/test_autobase.py::test_pipeline_limits PASSED       [100%]

======================================================================= warnings summary ========================================================================
evalml/tests/automl_tests/test_autobase.py::test_pipeline_limits
evalml/tests/automl_tests/test_autobase.py::test_pipeline_limits
...
  /Users/dsherry/anaconda/envs/featurelabs/lib/python3.8/site-packages/sklearn/metrics/classification.py:1436: UndefinedMetricWarning:

  Precision is ill-defined and being set to 0.0 due to no predicted samples.
...
  /Users/dsherry/anaconda/envs/featurelabs/lib/python3.8/site-packages/sklearn/metrics/classification.py:1436: UndefinedMetricWarning:

  F-score is ill-defined and being set to 0.0 due to no predicted samples.
...
  /Users/dsherry/anaconda/envs/featurelabs/lib/python3.8/site-packages/sklearn/metrics/classification.py:872: RuntimeWarning:

  invalid value encountered in double_scalars
...
  /Users/dsherry/development/aie/featurelabs/evalml/evalml/automl/auto_base.py:307: RuntimeWarning:

  invalid value encountered in double_scalars

Three of these warnings come from sklearn and one comes from our code. I seem to get a slightly different combination and order of warnings every time I run the test.
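For context, sklearn raises UndefinedMetricWarning when a metric is computed over a class the model never predicted. A minimal sketch (hypothetical data, not the test's actual fixture) that reproduces the first two warnings:

import numpy as np
from sklearn.metrics import precision_score

# A classifier that only ever predicts the negative class has no predicted
# positives, so precision (and F1) are ill-defined; sklearn warns and
# returns 0.0.
y_true = np.array([0, 1, 0, 1])
y_pred = np.zeros(4, dtype=int)
precision_score(y_true, y_pred)  # UndefinedMetricWarning, result is 0.0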

More info
Here's the line the last warning is coming from, in AutoBase._add_result:
high_variance_cv = (scores.std() / scores.mean()) > .2

I suspect scores is empty or all 0, which would make the mean 0 and the division undefined. But why? That's the next thing to look into. Perhaps we're scoring the model on empty test data?
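To illustrate that suspicion (a sketch, not confirmed against the actual test data): if every cross-validation score is 0.0, the mean is 0.0 and the expression becomes 0.0 / 0.0, which is exactly the scalar division NumPy flags:

import numpy as np

# All-zero scores make the mean 0.0; 0.0 / 0.0 triggers the RuntimeWarning
# "invalid value encountered in double_scalars" (older NumPy wording; newer
# versions say "divide") and yields nan, so the comparison is silently False.
scores = np.array([0.0, 0.0, 0.0])
high_variance_cv = (scores.std() / scores.mean()) > .2  # RuntimeWarning; False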

My suspicion is that this dataset is too small or too uniform, and that the models trained on it are predicting all the same value or something like that. If I'm right, this reinforces that we need guard rails to detect this problem when the user uploads their data, and that mocking in the unit tests to avoid actual fitting (#275 ) is important (even if this particular test isn't mockable).

I encountered these warnings while debugging bug #167 , so it's possible this is related to that.

Next steps
We should determine why these warnings are showing up. If it's a problem with the test setup, let's change the test to fix or avoid it. Otherwise, it could be a bug. We shouldn't be emitting warnings like this under normal use anyway.

@dsherry dsherry added the bug Issues tracking problems with existing features. label Jan 8, 2020
@dsherry dsherry changed the title Warning messages in unit test: invalid value encountered in double_scalars and others Warning messages in unit test: "invalid value encountered in double_scalars" and others Jan 8, 2020
@jeremyliweishih jeremyliweishih self-assigned this Apr 13, 2020
jeremyliweishih (Contributor) commented

This doesn't seem to show up in master after #445 was merged. The test run for Python 3.8 can be seen here. @dsherry I'm not sure why it disappeared once objectives were merged in, but should I close this for now?

dsherry (Contributor, Author) commented Apr 14, 2020

@jeremyliweishih hm, weird! Yeah, I don't see that particular warning about double_scalars. Perhaps #445 shuffled the unit tests around in just the right way.

I do see this in the circleci job you linked to:

=============================== warnings summary ===============================
evalml/utils/gen_utils.py:98
  /home/circleci/evalml/evalml/utils/gen_utils.py:98: RuntimeWarning: invalid value encountered in true_divide
    conf_mat = conf_mat.astype('float') / conf_mat.sum(axis=0)

test_python/lib/python3.8/site-packages/numpy/core/_methods.py:38
  /home/circleci/evalml/test_python/lib/python3.8/site-packages/numpy/core/_methods.py:38: ComplexWarning: Casting complex values to real discards the imaginary part
    return umr_sum(a, axis, dtype, out, keepdims, initial, where)

Let's get rid of those, yeah? Could be covering up bugs.
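To illustrate the first one (a sketch with made-up numbers, not the evalml test itself): normalizing a confusion matrix by its column sums divides by zero for any class that was never predicted, which NumPy reports as "invalid value encountered in true_divide" and fills with nan:

import numpy as np

# The second column sums to 0, so its entries become 0/0 = nan and NumPy
# emits the true_divide RuntimeWarning.
conf_mat = np.array([[3, 0],
                     [2, 0]])
conf_mat.astype('float') / conf_mat.sum(axis=0)  # RuntimeWarning; nan column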

Suggestion for the first: wrap evalml/evalml/utils/gen_utils.py:98 in "try ... except RuntimeWarning as e: assert False, e", run that on circleci, and see where it breaks. For the second one, I'm not sure. Maybe there's a way to have unit tests fail if they throw warnings?
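Note the try/except would only trigger if warnings are escalated to errors first, since warnings aren't exceptions by default. One way to make the suite fail on these (a sketch against our setup.cfg, not something we've merged):

# setup.cfg (sketch): escalate RuntimeWarning so the offending test fails
# with a traceback pointing at the exact line.
[tool:pytest]
filterwarnings =
    error::RuntimeWarning

pytest also supports per-test escalation with @pytest.mark.filterwarnings("error::RuntimeWarning") if we only want it on specific tests.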

angela97lin (Contributor) commented Apr 14, 2020

@dsherry @jeremyliweishih I talked to Jeremy about this, but I think the PR I'm currently working on takes care of the second warning! :) (#638)
