[MRG] Fix LocalOutlierFactor's output for data with duplicated samples #28773
Open
HenriqueProj wants to merge 8 commits into scikit-learn:main from HenriqueProj:fix_lof_duplicate_samples
Commits on Apr 5, 2024
Fix scikit-learn#27839: Adjust LocalOutlierFactor for data with duplicated samples

Previously, when a value was repeated more times than the algorithm's number of neighbors, the outliers were miscalculated. Because the distance between duplicated samples is 0, their local reachability density equals 1e10. As a result, samples close to the duplicated values get a very low negative outlier factor (under -1e7) and are labeled as outliers.

This fix checks whether the minimum negative outlier factor is under -1e7 and, if so, raises the number of neighbors to the number of occurrences of the most frequent value + 1 and issues a warning.

Notes: Added a handle_duplicates variable, which lets developers handle duplicate values manually if desired. Also added a memory_limit variable to avoid memory errors on very large datasets; developers can change it manually as well.

Commit e754830
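The failure mode described in this commit message can be reproduced with a small pure-Python sketch. This is not scikit-learn's code: the function names, the 1-D data, and the 1e-10 smoothing term are assumptions, the last one chosen so that zero distances give the 1e10 density mentioned above. The second run uses 6 neighbors, which matches "occurrences of the most frequent value + 1" for this data.

```python
def nearest_neighbors(points, i, k):
    """Return the k nearest (distance, index) pairs for points[i], excluding i."""
    candidates = sorted(
        (abs(points[i] - p), j) for j, p in enumerate(points) if j != i
    )
    return candidates[:k]


def lof_scores(points, k):
    """Return (local reachability densities, negative outlier factors)."""
    n = len(points)
    neighbors = [nearest_neighbors(points, i, k) for i in range(n)]
    # k-distance: distance to the k-th nearest neighbor of each point.
    k_dist = [nb[-1][0] for nb in neighbors]

    lrd = []
    for i in range(n):
        # Reachability distance to neighbor j is max(d(i, j), k_dist[j]).
        reach = [max(d, k_dist[j]) for d, j in neighbors[i]]
        # Assumed smoothing term 1e-10 avoids division by zero.
        lrd.append(1.0 / (sum(reach) / k + 1e-10))

    nof = []
    for i in range(n):
        ratios = [lrd[j] / lrd[i] for _, j in neighbors[i]]
        nof.append(-sum(ratios) / k)
    return lrd, nof


# Five copies of 1.0 (more than k=3 neighbors) plus three distinct values.
points = [1.0] * 5 + [1.5, 3.0, 6.0]

lrd_k3, nof_k3 = lof_scores(points, 3)
lrd_k6, nof_k6 = lof_scores(points, 6)

print("k=3 min score:", min(nof_k3))
print("k=6 min score:", min(nof_k6))
```

With 3 neighbors, every duplicate sees only other duplicates at distance 0, so its density saturates and the three distinct points all score below -1e7. With 6 neighbors, each duplicate has at least one neighbor at a nonzero distance and no score falls below that threshold.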
Commits on Apr 8, 2024
Commit 19cb411
Commit bc069b6
Commit c6470c6
Commits on Apr 10, 2024
Fix: Changed approach according to review

Removed the automatic change to the number of neighbors and changed the warning. Also changed the associated test to catch the warning.

Commit 909b25c
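A warn-only check of the kind this commit describes could look roughly like the sketch below. The helper names and the warning text are hypothetical and are not taken from the pull request; only the -1e7 threshold comes from the first commit message. The second function shows one way a test can catch the warning using the standard `warnings` module.

```python
import warnings

# Threshold from the first commit message; everything else here is assumed.
EXTREME_SCORE = -1e7


def warn_on_extreme_scores(negative_outlier_factors, n_neighbors):
    """Warn when any score is below the threshold; n_neighbors is left unchanged."""
    if min(negative_outlier_factors) < EXTREME_SCORE:
        warnings.warn(
            "Some samples have extremely low outlier scores, possibly because "
            f"a value is duplicated more than n_neighbors={n_neighbors} times. "
            "Consider increasing n_neighbors.",
            UserWarning,
        )
        return True
    return False


def triggers_warning(scores, n_neighbors):
    """Return True if the check emits a UserWarning for these scores."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        warn_on_extreme_scores(scores, n_neighbors)
    return any(issubclass(w.category, UserWarning) for w in caught)


print(triggers_warning([-1.0, -5e9], n_neighbors=3))
print(triggers_warning([-1.0, -2.5], n_neighbors=3))
```

Leaving the neighbor count alone keeps the estimator's behavior predictable and leaves the choice of a larger value to the user.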
Commits on Apr 22, 2024
Update sklearn/neighbors/_lof.py

Changed comment according to review.

Co-authored-by: Tim Head <betatim@gmail.com>

Commit de442f0
Commits on May 20, 2024
Commit 50eb839
Commits on May 27, 2024
Commit b2f79c5