Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fixed adapting tolerance value according to data type in manifold MDS #11167

Closed
wants to merge 6 commits into from

Conversation

urvang96
Copy link
Contributor

#### Reference Issues
Fixes #11159 .
check_symmetric: adapt tolerance to data type

#### What does this implement/fix? Explain your changes.
In manifold MDS set the tolerance values for the check_symmetric function as per the data type of input.

#### Any other comments?
Added a test.

@urvang96 urvang96 changed the title Fixed adapting tolerance value according to data type in mainfold MDS Fixed adapting tolerance value according to data type in manifold MDS May 30, 2018
@@ -67,7 +67,15 @@ def _smacof_single(dissimilarities, metric=True, n_components=2, init=None,
n_iter : int
The number of iterations corresponding to the best stress.
"""
dissimilarities = check_symmetric(dissimilarities, raise_exception=True)
if dissimilarities.dtype == np.float16:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why these values? Why not dtype.eps? And input validation should probably not be this far down the stack.

Copy link
Contributor Author

@urvang96 urvang96 Jun 1, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Celelibi suggested using these values, so I tried using these values.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think using the build-in eps or some factor of that would probably be more principled?

Copy link
Contributor Author

@urvang96 urvang96 Jun 1, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay! I will try using eps rather than hard coding these values.

@jnothman
Copy link
Member

jnothman commented Jun 4, 2018

Tests are still failing. Let us know if you need help.

@urvang96
Copy link
Contributor Author

urvang96 commented Jun 4, 2018

@jnothman I am facing two issues in this PR.

  1. When I am using eps values it gives a tol value of 1.9E-7 for float32, but check_symmetric function gives error in that case. I tried different values and I think (eps*100) that is 1.9E-5 will solve the issue.

  2. When I am giving a float16 input to fit_transform function, the dissimilarity matrix given after Euclidean distance calculation is float64. So I am not able to cover float16 test case.

@jnothman
Copy link
Member

jnothman commented Jun 4, 2018 via email

@jnothman
Copy link
Member

jnothman commented Jun 4, 2018 via email

@Celelibi
Copy link

Celelibi commented Jun 5, 2018

Indeed, 1e-5 might make sense for float32 if 1e-10 is used for float64.

In my tests, 1e-5 might be too large with float32 in some cases with the current implementation of euclidean_distances. Here is a test-case showing an error greater than 2e-4.

>>> a = np.array([[0.1015854999423027, 0.9949246048927307], [0.1018471047282219, 0.9947648644447327]], dtype=np.float32)
>>> euclidean_distances(a)
array([[0.        , 0.00024414],
       [0.        , 0.        ]], dtype=float32)

@jnothman
Copy link
Member

jnothman commented Jun 5, 2018 via email

@amueller
Copy link
Member

amueller commented Aug 5, 2019

Fixed by #13554, I think. We can reopen if the issue persists.

@amueller amueller closed this Aug 5, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

check_symmetric: adapt tolerance to data type
4 participants