Fixed adapting tolerance value according to data type in manifold MDS #11167

urvang96 · 2018-05-30T18:53:46Z

#### Reference Issues
Fixes #11159.
check_symmetric: adapt tolerance to data type

#### What does this implement/fix? Explain your changes.
In manifold MDS, set the tolerance passed to the check_symmetric function according to the data type of the input.

#### Any other comments?
Added a test.

amueller · 2018-06-01T17:00:23Z

sklearn/manifold/mds.py

@@ -67,7 +67,15 @@ def _smacof_single(dissimilarities, metric=True, n_components=2, init=None,
    n_iter : int
        The number of iterations corresponding to the best stress.
    """
-    dissimilarities = check_symmetric(dissimilarities, raise_exception=True)
+    if dissimilarities.dtype == np.float16:


Why these values? Why not dtype.eps? And input validation should probably not be this far down the stack.

urvang96 · 2018-06-01

@Celelibi suggested these values, so I tried them.

amueller · 2018-06-01

I think using the built-in eps or some factor of that would probably be more principled?

urvang96 · 2018-06-01

Okay! I will try using eps rather than hard-coding these values.
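A minimal sketch of the eps-based approach discussed in this thread, not the code from the PR. The helper name `symmetry_tolerance`, its `factor` parameter, and the `1e-10` fallback for non-float input are assumptions made for illustration; only `np.finfo` is an actual NumPy API.

```python
import numpy as np


def symmetry_tolerance(dtype, factor=1.0, default=1e-10):
    """Hypothetical helper: derive a tolerance from the dtype's machine epsilon."""
    dtype = np.dtype(dtype)
    if np.issubdtype(dtype, np.floating):
        # eps is the gap between 1.0 and the next representable value.
        return factor * float(np.finfo(dtype).eps)
    # Integer and other non-float dtypes have no eps; use a fixed fallback.
    return default


# Print the tolerance each float dtype would get with factor=1.
for dt in (np.float16, np.float32, np.float64):
    print(np.dtype(dt).name, symmetry_tolerance(dt))
```

A `factor` greater than 1 leaves room for rounding error accumulated upstream of the check, which a bare eps does not.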

jnothman · 2018-06-04T00:54:39Z

Tests are still failing. Let us know if you need help.

urvang96 · 2018-06-04T06:36:47Z

@jnothman I am facing two issues in this PR.

1. When I use eps as the tolerance, tol is about 1.19e-7 for float32, and the check_symmetric function raises an error in that case. I tried different values, and I think eps*100 (about 1.19e-5) will solve the issue.
2. When I pass a float16 input to the fit_transform function, the dissimilarity matrix produced by the Euclidean distance calculation is float64, so I am not able to cover the float16 test case.
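The first issue can be illustrated with a small sketch. The function `is_symmetric_within` is a hypothetical stand-in for a symmetry check, not scikit-learn's `check_symmetric`, and the matrix is made up so that its off-diagonal entries differ by roughly 1e-6.

```python
import numpy as np


def is_symmetric_within(matrix, tol):
    """Hypothetical symmetry check: largest |A - A.T| entry must not exceed tol."""
    matrix = np.asarray(matrix)
    return bool(np.max(np.abs(matrix - matrix.T)) <= tol)


eps32 = float(np.finfo(np.float32).eps)

# Made-up float32 dissimilarities; the two off-diagonal entries differ by about 1e-6.
d = np.array([[0.0, 0.5],
              [0.500001, 0.0]], dtype=np.float32)

strict = is_symmetric_within(d, tol=eps32)         # tolerance equal to eps
relaxed = is_symmetric_within(d, tol=100 * eps32)  # tolerance equal to 100 * eps
print(strict, relaxed)
```

An asymmetry of this size is rejected at eps and accepted at 100 * eps. Whether a factor of 100 is enough in general depends on how much error the distance computation introduces.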

jnothman · 2018-06-04T12:30:58Z

Yes, don't worry about float16 for now. I would've expected eps may be too small, but I've not looked into the context.

jnothman · 2018-06-04T12:41:31Z

Indeed, 1e-5 might make sense for float32 if 1e-10 is used for float64. But I suspect that what we're actually seeing here is a bug in euclidean_distances when working with float32, not merely an issue with the tol of check_symmetric. Really, we need to fix #9354 before we can work on this PR with euclidean distance as a test case.

Celelibi · 2018-06-05T12:12:21Z

> Indeed, 1e-5 might make sense for float32 if 1e-10 is used for float64.

In my tests, 1e-5 might be too large with float32 in some cases with the current implementation of euclidean_distances. Here is a test-case showing an error greater than 2e-4.

>>> import numpy as np
>>> from sklearn.metrics.pairwise import euclidean_distances
>>> a = np.array([[0.1015854999423027, 0.9949246048927307], [0.1018471047282219, 0.9947648644447327]], dtype=np.float32)
>>> euclidean_distances(a)
array([[0.        , 0.00024414],
       [0.        , 0.        ]], dtype=float32)
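A NumPy-only sketch of one plausible explanation for this output. It assumes the distance is computed through the expanded form ||x||^2 + ||y||^2 - 2 x.y, which is an assumption about the implementation at the time rather than something shown in this thread. Both squared norms are close to 1, where neighbouring float32 values are about 1.19e-7 apart, while the true squared distance is smaller than that.

```python
import numpy as np

a = np.array([[0.1015854999423027, 0.9949246048927307],
              [0.1018471047282219, 0.9947648644447327]], dtype=np.float32)
x, y = a[0], a[1]

# Reference: squared distance from the direct difference, in float64.
diff64 = x.astype(np.float64) - y.astype(np.float64)
sq_dist64 = float(diff64 @ diff64)

# Gap between float32 values just above 1.0, the magnitude of the squared norms.
spacing_near_one = float(np.spacing(np.float32(1.0)))

# Direct difference kept in float32: subtracting nearby values loses little here.
diff32 = x - y
dist32_direct = float(np.sqrt(diff32 @ diff32))

# Expanded form kept in float32: the small result is the difference of values near 1.
sq_expanded32 = (x @ x) + (y @ y) - np.float32(2.0) * (x @ y)

print("float64 squared distance:", sq_dist64)
print("float32 spacing near 1.0:", spacing_near_one)
print("float32 direct distance :", dist32_direct)
print("float32 expanded squared:", float(sq_expanded32))
```

Because the squared distance is below the float32 spacing near 1, the expanded form can only land on a coarse grid of values, and the two orderings (x, y) and (y, x) may round differently. The exact expanded result depends on the order of operations, so no specific value is claimed here.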

jnothman · 2018-06-05T23:48:27Z

Yes, let's put this on hold, as I suspect the issue is not here but in euclidean_distances, which was naively adapted to 32-bit for the latest release.

amueller · 2019-08-05T18:49:17Z

Fixed by #13554, I think. We can reopen if the issue persists.

adapting tolerance value according to data type in mainfold MDS

ee8fe88

urvang96 changed the title ~~Fixed adapting tolerance value according to data type in mainfold MDS~~ Fixed adapting tolerance value according to data type in manifold MDS May 30, 2018

Coverage Changes

3f54c58

amueller reviewed Jun 1, 2018

urvang96 added 4 commits June 2, 2018 10:29

Set tolerance value to the eps of dtype

69f8570

Set tolerance value to the eps of dtype

a8aca7c

Set tolerance value to the eps of float dtype

aeb95a0

Fixes test

37965f0

amueller closed this Aug 5, 2019
