Kernel Conditional Independence Tests #226

Open
rflperry opened this issue Oct 26, 2021 · 5 comments

@rflperry (Member) commented Oct 26, 2021

Testing for conditional independence, X ⊥ Y | Z, is a common problem in causal discovery and feature selection. The following two kernel-based methods can perform this test under relatively weak assumptions; a third, regression-based test (FIT) is also listed for comparison.

Kernel Conditional Independence (KCI) Test [paper][matlab code]

  • Well known and, to my understanding, commonly used in practice.
  • Computes kernel matrices on each of the variables X, Y, Z and combines them into a test statistic (see the sketch after this list).
  • Approximates the null distribution analytically with a Gamma distribution; no permutation test is available.
  • [Edit] Python code in this package
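
To make the KCI construction more concrete, here is a rough numpy/scipy sketch of the statistic and its Gamma approximation as I read them from the paper. The median-heuristic bandwidth, the ridge parameter `eps`, and the moment-matching expressions are illustrative choices on my part, not the authors' exact implementation:

```python
import numpy as np
from scipy.stats import gamma
from scipy.spatial.distance import pdist, squareform


def rbf_kernel(a, sigma=None):
    """Gaussian (RBF) kernel matrix with a median-heuristic bandwidth."""
    a = a.reshape(len(a), -1)
    d = squareform(pdist(a))
    if sigma is None:
        sigma = np.median(d[d > 0])
    return np.exp(-d ** 2 / (2 * sigma ** 2))


def kci_test(x, y, z, eps=1e-3):
    """Sketch of the KCI statistic with a Gamma null approximation.

    Kernel matrices are computed on (X, Z), Y, and Z, the Z-dependence is
    "regressed out" in the RKHS with a ridge-style projection, and the null
    of the trace statistic is approximated by moment-matching a Gamma.
    """
    n = len(x)
    H = np.eye(n) - np.ones((n, n)) / n  # centering matrix

    Kx = H @ rbf_kernel(np.column_stack([x, z])) @ H
    Ky = H @ rbf_kernel(y) @ H
    Kz = H @ rbf_kernel(z) @ H

    # Projection that removes the part of each kernel explained by Z.
    Rz = eps * np.linalg.inv(Kz + eps * np.eye(n))
    Kx_z = Rz @ Kx @ Rz
    Ky_z = Rz @ Ky @ Rz

    stat = np.trace(Kx_z @ Ky_z)

    # Gamma approximation: match mean and variance of the statistic under the
    # null (simple approximations here; the paper derives the moments carefully).
    mean_approx = np.trace(Kx_z) * np.trace(Ky_z) / n
    var_approx = 2.0 * np.sum(Kx_z ** 2) * np.sum(Ky_z ** 2) / n ** 2
    shape, scale = mean_approx ** 2 / var_approx, var_approx / mean_approx
    pvalue = 1 - gamma.cdf(stat, a=shape, scale=scale)
    return stat, pvalue
```

For example, on data generated as x = z + noise and y = z + noise (so X ⊥ Y | Z holds), `kci_test(x, y, z)` should tend to return a large p-value.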

A Permutation-Based Kernel Conditional Independence (KCIP) Test [paper][matlab code]

  • Potentially an improvement over KCI, but not as widely used or known, partly due to speed constraints.
  • Computes kernel matrices on each of the variables X, Y, Z.
  • Also provides a two-layer bootstrap permutation test by:
    • Finding a permutation Y' of Y that minimally disturbs the pairwise Z distances.
    • Performing a two-sample test (MMD) between the original (X, Y, Z) and permuted (X, Y', Z) samples (see the sketch after this list).
  • Improves upon KCI when its null is not well specified (complex, higher-dimensional Z), or when Z can be clustered well or is discrete.
  • Also provides an analytic approximation of the null distribution using a Gamma distribution.
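
A very loose sketch of the permutation scheme in those bullets, just to show the shape of the procedure. I'm substituting a k-means clustering of Z with within-cluster permutations for the paper's distance-minimizing permutation, and a naive permuted-vs-permuted second layer for its bootstrap, so treat this purely as an illustration:

```python
import numpy as np
from sklearn.cluster import KMeans


def _rbf_mmd2(a, b, sigma=1.0):
    """Biased squared MMD between samples a and b with an RBF kernel."""
    def k(u, v):
        d2 = ((u[:, None, :] - v[None, :, :]) ** 2).sum(-1)
        return np.exp(-d2 / (2 * sigma ** 2))
    return k(a, a).mean() + k(b, b).mean() - 2 * k(a, b).mean()


def kcip_test(x, y, z, n_perms=200, n_clusters=10, seed=0):
    """Sketch of the two-layer permutation idea: permute Y within groups of
    similar Z, compare original vs. permuted joint samples with MMD, and
    calibrate against permuted-vs-permuted MMDs."""
    rng = np.random.default_rng(seed)
    labels = KMeans(n_clusters=n_clusters, n_init=10,
                    random_state=seed).fit_predict(z.reshape(len(z), -1))

    def z_preserving_permutation():
        # Permute Y only within Z-clusters so Z distances are roughly preserved.
        perm = np.arange(len(y))
        for c in np.unique(labels):
            idx = np.where(labels == c)[0]
            perm[idx] = rng.permutation(idx)
        return perm

    def joint(yy):
        return np.column_stack([x, yy, z])

    # First layer: MMD between the observed sample and one Z-preserving permutation.
    observed = _rbf_mmd2(joint(y), joint(y[z_preserving_permutation()]))

    # Second layer: null distribution from permuted-vs-permuted comparisons.
    null = np.array([
        _rbf_mmd2(joint(y[z_preserving_permutation()]),
                  joint(y[z_preserving_permutation()]))
        for _ in range(n_perms)
    ])
    return observed, (1 + np.sum(null >= observed)) / (1 + n_perms)
```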

A nonparametric test based on regression error (FIT) [paper] [python code]

  • A bit more fringe than KCI/KCIP, but the paper provides good simulation comparisons among all three methods and more.
  • Uses a nonparametric regression (in their case, a decision tree) to examine the change in predictive power when some variables are included versus excluded.
  • Uses the mean squared error as a test statistic and an analytic Gaussian/t-test approach to compute a p-value (see the sketch after this list).
  • Seemingly efficient for large sample sizes compared to other kernel-based approaches.
  • Interesting connections, in that trees/forests are adaptive kernel methods, with possible extensions to forests, honesty, and leaf permutations.
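
A small sketch of how I read the FIT idea: compare the held-out squared error of predicting Y from Z alone versus from (X, Z), then run a paired one-sided t-test on the per-sample errors. The tree depth, the 50/50 split, and the exact error comparison are my placeholder choices, not the paper's:

```python
import numpy as np
from scipy.stats import ttest_rel
from sklearn.tree import DecisionTreeRegressor
from sklearn.model_selection import train_test_split


def fit_test(x, y, z, seed=0):
    """Sketch of a regression-error conditional independence test.

    If X adds no predictive power for Y beyond Z, the per-sample squared
    errors of the two regressions should not differ; a paired one-sided
    t-test on those errors gives an approximate p-value.
    """
    x2, z2 = x.reshape(len(x), -1), z.reshape(len(z), -1)
    xz = np.hstack([x2, z2])

    # Hold out a test split so the MSE comparison is out-of-sample.
    xz_tr, xz_te, z_tr, z_te, y_tr, y_te = train_test_split(
        xz, z2, y, test_size=0.5, random_state=seed)

    err_z = (DecisionTreeRegressor(max_depth=5, random_state=seed)
             .fit(z_tr, y_tr).predict(z_te) - y_te) ** 2
    err_xz = (DecisionTreeRegressor(max_depth=5, random_state=seed)
              .fit(xz_tr, y_tr).predict(xz_te) - y_te) ** 2

    # One-sided paired t-test: does adding X reduce the squared error?
    tstat, p_two_sided = ttest_rel(err_z, err_xz)
    pvalue = p_two_sided / 2 if tstat > 0 else 1 - p_two_sided / 2
    return tstat, pvalue
```
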
rflperry added the enhancement (New feature or request) label on Oct 26, 2021
@zdbzdb123123 (Contributor)

Interested

@sampan501 (Member)

@zdbzdb123123 which one? Once you have decided, please make a new issue with the description and link to this issue

@zdbzdb123123 (Contributor)

KCI, and will do

@rflperry (Member, Author) commented Feb 4, 2022

I also discovered a package with Python code and MATLAB wrappers.

  1. KCI code
  2. KCIP code

The package has some other things, including a small notebook with simulations to test the tools.

@MatthewZhao26 (Contributor)

Interested in FIT

sampan501 changed the title from "Kernel Conditional Independence Test" to "Kernel Conditional Independence Tests" on Feb 14, 2022
sampan501 added the ndd (Issues for NeuroData Design) label on Feb 15, 2022