True graph skeleton as optional algorithm input #60

Open
dmachlanski opened this issue Jun 28, 2023 · 1 comment
Labels
enhancement New feature or request

Comments

@dmachlanski
Contributor

In other words, create an undirected graph from the true DAG (or any graph), and pass it to an algorithm as input. Sampled data should be passed as input as well (as per usual). It's important to note that the passed skeleton is the true undirected graph, not an estimate.
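As a rough illustration of the first step, here is a minimal sketch of deriving the undirected skeleton from a true DAG. It assumes the graph is stored as a 0/1 adjacency matrix (nested lists here); the function name `skeleton_from_adjmat` and the example matrix are hypothetical and not part of the project.

```python
def skeleton_from_adjmat(adjmat):
    """Return the symmetric 0/1 adjacency matrix of the undirected skeleton.

    Assumes adjmat[i][j] != 0 means there is an edge i -> j.
    """
    n = len(adjmat)
    return [
        [1 if i != j and (adjmat[i][j] or adjmat[j][i]) else 0 for j in range(n)]
        for i in range(n)
    ]


# Hypothetical true DAG: 0 -> 1, 0 -> 2, 1 -> 2
true_dag = [
    [0, 1, 1],
    [0, 0, 1],
    [0, 0, 0],
]

skeleton = skeleton_from_adjmat(true_dag)
```

The skeleton keeps every adjacency of the true graph but drops the direction, so it can be handed to an algorithm alongside the sampled data.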

The reason is to be able to test pairwise algorithms, such as these.

Usually, pairwise methods are tested only on (X, Y) datasets. But testing them on bigger graphs (nodes > 2) is arguably more interesting and challenging. This, however, requires providing the algorithms with a starting point in the form of the graph's skeleton. The task then boils down to orienting the edges. The final product is a fully oriented graph (which can have cycles), so most, if not all, of the existing metrics can be used without issues.
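To make the idea concrete, here is a minimal sketch of orienting a given skeleton with a pairwise method. Everything in it is an assumption for illustration: `orient_skeleton` is a hypothetical helper, `pairwise_score(x, y)` stands in for any pairwise method that returns a non-negative value when it prefers x -> y, and `toy_score` is a dummy placeholder with no causal meaning.

```python
def orient_skeleton(skeleton, columns, pairwise_score):
    """Orient every skeleton edge independently using a pairwise score.

    skeleton: symmetric 0/1 adjacency matrix (nested lists).
    columns: columns[i] holds the samples of variable i.
    pairwise_score: hypothetical callable; >= 0 means "first -> second".
    """
    n = len(skeleton)
    oriented = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if skeleton[i][j] or skeleton[j][i]:
                if pairwise_score(columns[i], columns[j]) >= 0:
                    oriented[i][j] = 1  # i -> j
                else:
                    oriented[j][i] = 1  # j -> i
    return oriented


def toy_score(x, y):
    # Dummy placeholder only; a real pairwise method would go here.
    return sum(y) - sum(x)


skeleton = [[0, 1, 1], [1, 0, 1], [1, 1, 0]]
columns = [[1, 2, 3], [2, 4, 6], [0, 0, 1]]
oriented = orient_skeleton(skeleton, columns, toy_score)
```

Because each edge is decided on its own, nothing prevents the result from containing cycles, which matches the note above.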

For an example, see [1] section 5 (5.2 and 5.4 specifically).

[1] O. Goudet, D. Kalainathan, P. Caillou, D. Lopez-Paz, I. Guyon, and M. Sebag, ‘Learning Functional Causal Models with Generative Neural Networks’, Springer International Publishing, 2018. doi: 10.1007/978-3-319-98131-4.

@felixleopoldo
Owner

felixleopoldo commented Jun 30, 2023

That sounds like a good idea! The true graph is already part of the output.adjmat, using the function

def alg_output_adjmat_path(algorithm):
The {data} wildcard pattern matches everything that was used to generate the data, i.e. the graph, parameters, and data modules. However, this should be split, as is done in, e.g., this function
def time_path(algorithm):
At least the {data} wildcard will have to be renamed so that it doesn't clash with the current one. Once the {adjmat} field is accessible as a wildcard, you can use it to create a new input variable in the format used in, e.g.,
adjmat = "{output_dir}/adjmat/" + pattern_strings["pcalg_randdag"] + "/seed={replicate}.csv"
where pattern_strings["pcalg_randdag"] is replaced by {adjmat} (I think that is better than taking the raw wildcard {adjmat} string in the script).
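A minimal sketch of what the resulting input pattern might look like, assuming an {adjmat} wildcard has been split out of {data} as described above. The variable name `adjmat_input` and the values passed to `str.format` are hypothetical; in the actual workflow the wildcards would be resolved by the workflow engine, and `str.format` is used here only to show how the pattern expands.

```python
# Assumption: {adjmat} is available as its own wildcard and takes the
# place of pattern_strings["pcalg_randdag"] in the path shown above.
adjmat_input = "{output_dir}/adjmat/{adjmat}/seed={replicate}.csv"

# Illustrative expansion with made-up values.
example = adjmat_input.format(
    output_dir="results",
    adjmat="some_graph_pattern",
    replicate=1,
)
```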

@felixleopoldo felixleopoldo added the enhancement New feature or request label Jul 21, 2023