I could add cluster background knowledge #123

Open
JanMarcoRuizdeVargas opened this issue Jul 10, 2023 · 2 comments
Comments

@JanMarcoRuizdeVargas
Contributor

Hi all,
For my master's thesis I am working on causal discovery with cluster DAGs (C-DAGs, https://arxiv.org/abs/2202.12263). To that end I am implementing versions of PC and FCI that accept cluster background knowledge to aid causal discovery.
My package is at https://github.com/JanMarcoRuizdeVargas/clustercausal
I am implementing it on top of your causal-learn architecture.
My question: would you be interested in incorporating my code into causal-learn? I am writing tests, adding example notebooks, and using the black formatter to keep code quality high.
Let me know what you think.
Best,
Jan Marco

@adam2392
Collaborator

@JanMarcoRuizdeVargas To my knowledge, causal discovery of clusters is a difficult and open problem. Is there a paper that claims to do causal discovery of clusters and in which the characterization of the equivalence class is a valid object?

@JanMarcoRuizdeVargas
Contributor Author

JanMarcoRuizdeVargas commented Aug 1, 2023

Sorry, my original message wasn't entirely clear.

I am using clusters as a form of background knowledge, as described in the paper above. No cluster discovery is needed, because the clusters are a priori background knowledge. I am working to show that cluster background knowledge is similar or equivalent to previously used forms of background knowledge, such as tiered (https://proceedings.mlr.press/v108/andrews20a.html), pairwise (https://arxiv.org/abs/2207.05067), and typed (https://proceedings.mlr.press/v177/brouillard22a/brouillard22a.pdf). Pairwise and tiered are also the types of background knowledge currently supported in causal-learn.
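
To make the link to pairwise background knowledge concrete, here is a minimal sketch, not the clustercausal implementation. It assumes one reading of a C-DAG: a directed edge between two variables in different clusters is allowed only if the C-DAG contains the corresponding cluster edge. The function name `cdag_to_forbidden_edges` and the toy clusters are hypothetical.

```python
from itertools import product


def cdag_to_forbidden_edges(clusters, cluster_edges):
    """Translate a C-DAG into pairwise forbidden directed edges.

    clusters: dict mapping cluster name -> list of variable names
    cluster_edges: set of (source_cluster, target_cluster) tuples
    Returns the set of (a, b) pairs for which a -> b is ruled out.
    Assumption: a -> b across clusters needs a matching cluster edge.
    """
    forbidden = set()
    for ci, cj in product(clusters, repeat=2):
        if ci == cj:
            continue  # edges inside a cluster are left unconstrained
        if (ci, cj) not in cluster_edges:
            # No cluster edge ci -> cj, so no member of ci may point into cj.
            forbidden.update(product(clusters[ci], clusters[cj]))
    return forbidden


# Toy C-DAG: C1 -> C2 -> C3
clusters = {"C1": ["x1", "x2"], "C2": ["x3"], "C3": ["x4"]}
cluster_edges = {("C1", "C2"), ("C2", "C3")}
forbidden = cdag_to_forbidden_edges(clusters, cluster_edges)
```

Each pair in `forbidden` could then be handed to a pairwise background knowledge mechanism; defining three clusters and two cluster edges is shorter than listing the pairs by hand.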

So why cluster background knowledge? The C-DAG makes background knowledge much easier to visualize and therefore less cumbersome to define. It is also compatible with other forms of background knowledge. In addition, my algorithm uses the C-DAG directly during skeleton discovery, which speeds up the process.
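
As a rough illustration of where a speed-up in skeleton discovery can come from, the sketch below builds a pruned starting skeleton instead of the complete graph that standard PC starts from. It assumes that only variable pairs in the same cluster or in adjacent clusters need to be tested; the function name `initial_skeleton` is hypothetical and the actual algorithm may differ.

```python
from itertools import combinations


def initial_skeleton(clusters, cluster_edges):
    """Return candidate undirected edges allowed by the C-DAG.

    A pair (a, b) is kept if both variables share a cluster or if
    their clusters are joined by a cluster edge in either direction.
    """
    cluster_of = {v: c for c, members in clusters.items() for v in members}
    adjacent = {frozenset(edge) for edge in cluster_edges}
    edges = set()
    for a, b in combinations(sorted(cluster_of), 2):
        ca, cb = cluster_of[a], cluster_of[b]
        if ca == cb or frozenset((ca, cb)) in adjacent:
            edges.add((a, b))
    return edges


# Toy C-DAG: C1 -> C2 -> C3
clusters = {"C1": ["x1", "x2"], "C2": ["x3"], "C3": ["x4"]}
cluster_edges = {("C1", "C2"), ("C2", "C3")}
skeleton = initial_skeleton(clusters, cluster_edges)
```

In this toy case the pairs between C1 and C3 are dropped before any conditional independence test is run, so fewer tests are needed than when starting from the complete graph.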

Regarding causal discovery of clusters, that is indeed a difficult and open problem. To my knowledge, there is this paper (https://arxiv.org/abs/2202.12263) on learning C-DAGs from data, but there are no theoretical guarantees or criteria for good or correct clusterings. In their paper, they aim to cluster variables with synergistic effects, using a mutual information criterion. There is also a paper on learning something similar to C-DAGs when the DAG is known (https://www.sciencedirect.com/science/article/pii/S0888613X17303134). In the future, I would also be open to collaborating on implementing causal discovery of clusters.

Please let me know what you think. My thesis is also linked in the GitHub README, but since it is a work in progress, it does not yet present what I am doing clearly and concisely.
