TODO #4

Open
6 of 28 tasks
gdkrmr opened this issue Jan 27, 2017 · 8 comments

Comments

@gdkrmr
Owner

gdkrmr commented Jan 27, 2017

Non-Method Features to add:

  • Rotation Matrix for PCA and ICA
  • mean and std vectors from PCA, ICA and PCA L1
  • Eigenvalues
  • kernel Matrices for kPCA
  • update documentation for .mute in embed
  • add TravisCI script
  • the possibility to use distance or kernel matrices as input.
  • the possibility to use (semi) supervised methods
  • a simple way to register new methods
  • ...

Methods to add:

(Semi) supervised methods:

  • tukey data depth
  • cca
  • opls
  • pls
  • kcca
  • kopls
  • kpls

Please propose more

@topepo
Contributor

topepo commented Jan 27, 2017

Those are the main ones that I use (plus MDS) and I'm not familiar with some of the others.

You might want to add (or link to existing) functions to visualize these results.

Any plans on adding non-negative matrix factorizations? The equivalent of their rotations would be nice.

This may be a bit out of scope, but adding some sort of aggregate measures of predictor contribution would be good. For PCA, I've weighted the absolute values of the loadings by their variance contribution (but there are likely to be more theoretically justified methods out there).
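The weighting described above could be sketched roughly as follows. This is only an illustration in plain Python, not anything taken from the package: the function name `predictor_contribution`, the loading matrix, and the variances are all made-up assumptions, and each component's weight is assumed to be its share of the total explained variance.

```python
def predictor_contribution(loadings, explained_variance):
    """Aggregate per-variable contribution across components.

    loadings[i][k] is the loading of variable i on component k;
    explained_variance[k] is the variance explained by component k.
    Both inputs are assumed to come from an already fitted PCA.
    """
    total = sum(explained_variance)
    weights = [v / total for v in explained_variance]
    # Weight the absolute loadings by each component's variance share.
    return [
        sum(abs(loading) * w for loading, w in zip(row, weights))
        for row in loadings
    ]


# Hypothetical example: 3 variables, 2 components (made-up numbers).
loadings = [
    [0.8, 0.0],
    [0.6, 0.0],
    [0.0, 1.0],
]
explained_variance = [3.0, 1.0]
scores = predictor_contribution(loadings, explained_variance)
```

Here the first variable ends up with the largest aggregate score, because it loads strongly on the component that explains most of the variance.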

@gdkrmr
Owner Author

gdkrmr commented Jan 27, 2017

I missed some techniques, e.g. autoencoders, non-negative matrix factorization, Local Tangent Space Alignment, SNE and its derivatives, ... I will implement them when I find some time.

Spontaneous idea: for the quality measures, keep only the axis of interest in the original matrix and all axes in the reduced dimensions, then compare the outcome between axes of interest. No idea if this works or if it has a sound theoretical basis.

For linear techniques, just look at the values in the rotation matrix: a larger absolute value in rotation[Var1, PCA1] means a higher contribution of variable Var1 to axis PCA1.
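As a rough illustration of reading a rotation matrix this way, the sketch below ranks variables by their loading on one axis. The matrix values and the helper name `rank_variables` are made up for the example, and the ranking uses absolute values on the assumption that a strongly negative loading also indicates a large contribution.

```python
# Hypothetical rotation matrix: variable name -> loading on each axis.
rotation = {
    "Var1": {"PCA1": 0.6, "PCA2": 0.0},
    "Var2": {"PCA1": -0.8, "PCA2": 0.0},
    "Var3": {"PCA1": 0.0, "PCA2": 1.0},
}


def rank_variables(rotation, axis):
    """Return variable names sorted by decreasing |loading| on `axis`."""
    return sorted(rotation, key=lambda var: abs(rotation[var][axis]), reverse=True)


pca1_ranking = rank_variables(rotation, "PCA1")
```

With these made-up numbers, `Var2` ranks first on `PCA1` even though its loading is negative.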

@gdkrmr gdkrmr changed the title from "convenience functions to extract interesting information from dimRedResult objects" to "TODO" on Feb 1, 2017
@gdkrmr
Owner Author

gdkrmr commented Oct 11, 2017

First commit for the autoencoder: ed64756

@gdkrmr
Owner Author

gdkrmr commented May 26, 2018

There is a really cool new one, UMAP:
https://github.com/ropenscilabs/umapr
https://arxiv.org/abs/1802.03426
umapr is currently only a wrapper, via reticulate, around a Python implementation.
There is also another package: https://github.com/tkonopka/umap
Neither is on CRAN yet.

@gdkrmr
Owner Author

gdkrmr commented May 28, 2018

There is also largeVis, which is already on CRAN.
EDIT: largeVis got archived.

@gdkrmr
Owner Author

gdkrmr commented Oct 1, 2018

Add methods KECA and OKECA.

@khughitt
Contributor

khughitt commented Feb 5, 2019

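Some other possible methods to consider:

* Sparse PCA
  * [sparsepca](https://cran.r-project.org/web/packages/sparsepca/index.html)
  * [nsprcomp](https://cran.r-project.org/web/packages/nsprcomp/nsprcomp.pdf)
* Robust PCA
  * [FastHCS](https://cran.r-project.org/web/packages/FastHCS/FastHCS.pdf)
* Robust Sparse PCA
  * [rospca](https://cran.r-project.org/web/packages/rospca/index.html)
  * [rpca](https://cran.r-project.org/web/packages/rpca/index.html)
* Other
  * [whitening](https://cran.r-project.org/web/packages/whitening/index.html)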

Incidentally, there is another UMAP implementation that is pretty well-documented and appears to do a good job mirroring the original Python API: https://github.com/jlmelville/uwot

@gdkrmr
Owner Author

gdkrmr commented Feb 5, 2019

> Some other possible methods to consider:
>
> * Sparse PCA
>   * [sparsepca](https://cran.r-project.org/web/packages/sparsepca/index.html)
>   * [nsprcomp](https://cran.r-project.org/web/packages/nsprcomp/nsprcomp.pdf)
> * Robust PCA
>   * [FastHCS](https://cran.r-project.org/web/packages/FastHCS/FastHCS.pdf)
> * Robust Sparse PCA
>   * [rospca](https://cran.r-project.org/web/packages/rospca/index.html)
>   * [rpca](https://cran.r-project.org/web/packages/rpca/index.html)
> * Other
>   * [whitening](https://cran.r-project.org/web/packages/whitening/index.html)

Sounds great! To avoid duplicating code, there should probably be a single PCA class with different backends, and the PCAL1 class should become part of it.
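One way such a single class with swappable backends might look is sketched below, in plain Python purely for illustration. Everything here is an assumption rather than the package's design: the registry `PCA_BACKENDS`, the decorator `register_backend`, the `PCA` class, and the toy `"power"` backend, which only finds the first principal axis by power iteration on the sample covariance matrix. Sparse, robust, or L1 variants would hypothetically register under their own names in the same way.

```python
import math

# Hypothetical registry: backend name -> function(rows) -> first principal axis.
PCA_BACKENDS = {}


def register_backend(name):
    """Decorator that registers a fitting function under `name`."""
    def decorator(func):
        PCA_BACKENDS[name] = func
        return func
    return decorator


def _covariance(rows):
    """Sample covariance matrix of a list of equal-length rows."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    return [
        [sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
        for i in range(d)
    ]


@register_backend("power")
def power_iteration_backend(rows, iterations=200):
    """Toy backend: dominant eigenvector of the covariance matrix."""
    cov = _covariance(rows)
    d = len(cov)
    vec = [1.0] * d
    for _ in range(iterations):
        vec = [sum(cov[i][j] * vec[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(v * v for v in vec))
        vec = [v / norm for v in vec]
    return vec


class PCA:
    """Single entry point; the backend is chosen by name at construction."""

    def __init__(self, backend="power"):
        if backend not in PCA_BACKENDS:
            raise ValueError(f"unknown PCA backend: {backend!r}")
        self._fit = PCA_BACKENDS[backend]

    def first_axis(self, rows):
        return self._fit(rows)


# Made-up data lying roughly along the line y = 2x.
data = [[1.0, 2.0], [2.0, 4.1], [3.0, 5.9], [4.0, 8.0]]
axis = PCA(backend="power").first_axis(data)
```

The point of the sketch is the dispatch: callers only ever see `PCA`, and adding a new variant means registering one more function instead of writing a new class.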

> Incidentally, there is another UMAP implementation that is pretty well-documented and appears to do a good job mirroring the original Python API: https://github.com/jlmelville/uwot

This one sounds really promising. Python dependencies always cause trouble, so swapping them for a native R package is always welcome! The package still has to be released on CRAN.

3 participants