Warning messages in unit test: "invalid value encountered in double_scalars" and others #311

Closed · dsherry opened this issue Jan 8, 2020 · 3 comments · Fixed by #640
Labels: bug (Issues tracking problems with existing features.)

dsherry (Contributor) commented Jan 8, 2020

Problem
Running locally on Python 3.8 (I've seen similar on other Python versions):

(featurelabs) ➜  evalml git:(master) pytest -v evalml/tests/automl_tests/test_autobase.py::test_pipeline_limits
====================================================================== test session starts ======================================================================
platform darwin -- Python 3.8.0, pytest-4.4.1, py-1.8.0, pluggy-0.13.1 -- /Users/dsherry/anaconda/envs/featurelabs/bin/python
cachedir: .pytest_cache
rootdir: /Users/dsherry/development/aie/featurelabs/evalml, inifile: setup.cfg
plugins: xdist-1.26.1, cov-2.6.1, nbval-0.9.3, forked-1.1.3
collected 1 item

evalml/tests/automl_tests/test_autobase.py::test_pipeline_limits PASSED       [100%]

======================================================================= warnings summary ========================================================================
evalml/tests/automl_tests/test_autobase.py::test_pipeline_limits
evalml/tests/automl_tests/test_autobase.py::test_pipeline_limits
...
  /Users/dsherry/anaconda/envs/featurelabs/lib/python3.8/site-packages/sklearn/metrics/classification.py:1436: UndefinedMetricWarning:

  Precision is ill-defined and being set to 0.0 due to no predicted samples.
...
  /Users/dsherry/anaconda/envs/featurelabs/lib/python3.8/site-packages/sklearn/metrics/classification.py:1436: UndefinedMetricWarning:

  F-score is ill-defined and being set to 0.0 due to no predicted samples.
...
  /Users/dsherry/anaconda/envs/featurelabs/lib/python3.8/site-packages/sklearn/metrics/classification.py:872: RuntimeWarning:

  invalid value encountered in double_scalars
...
  /Users/dsherry/development/aie/featurelabs/evalml/evalml/automl/auto_base.py:307: RuntimeWarning:

  invalid value encountered in double_scalars

Three of these warnings come from sklearn and one comes from our code. I seem to get a slightly different combination and order of warnings every time I run the test.
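For context, sklearn raises UndefinedMetricWarning when a metric is computed over a class the model never predicted. A minimal sketch (hypothetical data, not the test's actual fixture) that reproduces the first two warnings:

import numpy as np
from sklearn.metrics import precision_score

# A classifier that only ever predicts the negative class has no predicted
# positives, so precision (and F1) are ill-defined; sklearn warns and
# returns 0.0.
y_true = np.array([0, 1, 0, 1])
y_pred = np.zeros(4, dtype=int)
precision_score(y_true, y_pred)  # UndefinedMetricWarning, result is 0.0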

More info
Here's the line the last warning is coming from, in AutoBase._add_result:
high_variance_cv = (scores.std() / scores.mean()) > .2

I suspect scores is empty or all 0, which would make the mean 0 and the division undefined. But why? That's the next thing to look into. Perhaps we're scoring the model on empty test data?
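To illustrate that suspicion (a sketch, not confirmed against the actual test data): if every cross-validation score is 0.0, the mean is 0.0 and the expression becomes 0.0 / 0.0, which is exactly the scalar division NumPy flags:

import numpy as np

# All-zero scores make the mean 0.0; 0.0 / 0.0 triggers the RuntimeWarning
# "invalid value encountered in double_scalars" (older NumPy wording; newer
# versions say "divide") and yields nan, so the comparison is silently False.
scores = np.array([0.0, 0.0, 0.0])
high_variance_cv = (scores.std() / scores.mean()) > .2  # RuntimeWarning; False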

My suspicion is that this dataset is too small or too uniform, and that the models trained on it are predicting all the same value or something like that. If I'm right, this reinforces that we need guard rails to detect this problem when the user uploads their data, and that mocking in the unit tests to avoid actual fitting (#275 ) is important (even if this particular test isn't mockable).

I encountered these warnings while debugging bug #167 , so it's possible this is related to that.

Next steps
We should determine why these warnings are showing up. If it's a problem with the test setup, let's change the test to fix or avoid it. Otherwise, it could be a bug. We shouldn't be emitting warnings like this under normal use anyway.

@dsherry dsherry added the bug Issues tracking problems with existing features. label Jan 8, 2020
@dsherry dsherry changed the title Warning messages in unit test: invalid value encountered in double_scalars and others Warning messages in unit test: "invalid value encountered in double_scalars" and others Jan 8, 2020
@jeremyliweishih jeremyliweishih self-assigned this Apr 13, 2020
jeremyliweishih (Contributor) commented

This doesn't seem to show up in master after #445 was merged. The test run for Python 3.8 can be seen here. @dsherry I'm not sure why it disappeared once objectives were merged in, but should I close this for now?

dsherry (Contributor, Author) commented Apr 14, 2020

@jeremyliweishih hm, weird! Yeah, I don't see that particular warning about double_scalars. Perhaps #445 shuffled the unit tests around in just the right way.

I do see this in the circleci job you linked to:

=============================== warnings summary ===============================
evalml/utils/gen_utils.py:98
  /home/circleci/evalml/evalml/utils/gen_utils.py:98: RuntimeWarning: invalid value encountered in true_divide
    conf_mat = conf_mat.astype('float') / conf_mat.sum(axis=0)

test_python/lib/python3.8/site-packages/numpy/core/_methods.py:38
  /home/circleci/evalml/test_python/lib/python3.8/site-packages/numpy/core/_methods.py:38: ComplexWarning: Casting complex values to real discards the imaginary part
    return umr_sum(a, axis, dtype, out, keepdims, initial, where)

Let's get rid of those, yeah? Could be covering up bugs.
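To illustrate the first one (a sketch with made-up numbers, not the evalml test itself): normalizing a confusion matrix by its column sums divides by zero for any class that was never predicted, which NumPy reports as "invalid value encountered in true_divide" and fills with nan:

import numpy as np

# The second column sums to 0, so its entries become 0/0 = nan and NumPy
# emits the true_divide RuntimeWarning.
conf_mat = np.array([[3, 0],
                     [2, 0]])
conf_mat.astype('float') / conf_mat.sum(axis=0)  # RuntimeWarning; nan column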

Suggestion for the first: wrap evalml/evalml/utils/gen_utils.py:98 in "try ... except RuntimeWarning as e: assert False, e", run that on circleci, and see where it breaks. For the second one, I'm not sure. Maybe there's a way to have unit tests fail if they throw warnings?
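Note the try/except would only trigger if warnings are escalated to errors first, since warnings aren't exceptions by default. One way to make the suite fail on these (a sketch against our setup.cfg, not something we've merged):

# setup.cfg (sketch): escalate RuntimeWarning so the offending test fails
# with a traceback pointing at the exact line.
[tool:pytest]
filterwarnings =
    error::RuntimeWarning

pytest also supports per-test escalation with @pytest.mark.filterwarnings("error::RuntimeWarning") if we only want it on specific tests.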

angela97lin (Contributor) commented Apr 14, 2020

@dsherry @jeremyliweishih I talked to Jeremy about this, but I think the PR I'm currently working on takes care of the second warning! :) (#638)
