Random subsampling for feature selection #47 #650

Open · wants to merge 2 commits into master
Conversation

@xliu833 commented Dec 21, 2019

Description

Related issues or pull requests

Fixes #47

Pull Request Checklist

  • Added a note about the modification or contribution to the ./docs/sources/CHANGELOG.md file (if applicable)
  • Added appropriate unit test functions in the ./mlxtend/*/tests directories (if applicable)
  • Modify documentation in the corresponding Jupyter Notebook under mlxtend/docs/sources/ (if applicable)
  • Ran PYTHONPATH='.' pytest ./mlxtend -sv and make sure that all unit tests pass (for small modifications, it might be sufficient to only run the specific test file, e.g., PYTHONPATH='.' pytest ./mlxtend/classifier/tests/test_stacking_cv_classifier.py -sv)
  • Checked for style issues by running flake8 ./mlxtend

@pep8speaks commented Dec 21, 2019

Hello @lareinaxy! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found:

There are currently no PEP 8 issues detected in this Pull Request. Cheers! 🍻

Comment last updated at 2019-12-21 22:28:28 UTC

@xliu833 changed the title from "15th attempt" to "Random subsampling for feature selection #47" on Dec 21, 2019
@@ -382,7 +382,8 @@ def fit(self, X, y, custom_feature_names=None, groups=None, **fit_params):

         else:
             select_in_range = False
-            k_to_select = self.k_features
+            k_to_select = int(len(X[1])**.5)
+            np.take(X, np.random.permutation(X.shape[1]), axis=1, out=X)
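
For context, here is a minimal standalone sketch of what the two added lines do, outside of the fit() method (the array X is made-up illustrative data): len(X[1]) is the length of the second row, which for a 2-D NumPy array equals the number of columns, so k_to_select becomes the floor of the square root of the number of features; the np.take call then shuffles the feature columns of X in place.

    import numpy as np

    X = np.arange(20).reshape(4, 5)     # 4 samples, 5 features (illustrative data)

    # floor of the square root of the number of features;
    # len(X[1]) == X.shape[1] for a 2-D array
    k_to_select = int(len(X[1]) ** .5)  # -> 2

    # shuffle the feature columns of X in place
    np.take(X, np.random.permutation(X.shape[1]), axis=1, out=X)

    print(k_to_select)                  # 2
    print(X.shape)                      # (4, 5): same shape, columns permuted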
Owner commented:

Hi Lareina, thanks for the PR. I don't quite understand how the value k_to_select = int(len(X[1])**.5) is chosen. Does this mean that the random subset is always the square root of the original number of features?

There are two problems with that.

  1. We want to add the random feature subset size as an optional setting
  2. I think we should allow the user to choose the subset size

There could be a new parameter

use_random_feature_subset for the SequentialFeatureSelector class, which accepts either a function like f = lambda x: int(np.sqrt(x)) or None.

Let me know if you have questions.
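
A rough sketch of how such a parameter could work is below. It assumes the parameter name use_random_feature_subset from the comment above; the helper function select_feature_subset and its integration point are hypothetical and not part of mlxtend's actual API.

    import numpy as np

    def select_feature_subset(X, use_random_feature_subset=None, random_state=None):
        """Hypothetical helper: return the column indices to consider at a step.

        use_random_feature_subset is either None (use all features) or a
        callable mapping the total number of features to the subset size,
        e.g. lambda x: int(np.sqrt(x)).
        """
        n_features = X.shape[1]
        if use_random_feature_subset is None:
            return np.arange(n_features)
        subset_size = int(use_random_feature_subset(n_features))
        rng = np.random.RandomState(random_state)
        return rng.choice(n_features, size=subset_size, replace=False)

    # Example usage:
    X = np.random.rand(10, 16)
    cols = select_feature_subset(
        X, use_random_feature_subset=lambda x: int(np.sqrt(x)), random_state=0)
    print(cols)  # 4 randomly chosen feature indices out of 16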


Successfully merging this pull request may close these issues.

Feature request for sequential feature selector: random subsets at each step
3 participants