Update to DisGeNet 5? #22

Open
tonigi opened this issue Dec 11, 2017 · 2 comments
Comments

tonigi commented Dec 11, 2017

I am under the impression that the DGN database included with the package corresponds to DisGeNet release 4. Would it be possible to upgrade it to version 5? There is now a distribution in sqlite format which may make the transition easier.

A more complex task would be to keep the "Evidence index" or "Source" annotations, in order to filter out weak/negative associations.

Thanks!

tonigi commented Dec 11, 2017

Actually, I was able to regenerate them with minor changes to your script. Specifically, I downloaded the "ALL gene-disease associations" file and changed inst/extdata/build_DGN_Anno.R as follows:

x <- read.delim("all_gene_disease_associations.tsv.gz", comment.char = "#",
                stringsAsFactors = FALSE, fileEncoding = "ISO-8859-1")
d2n <- unique(x[, c(3, 4)])  # New columns - using names would be better
d2g <- unique(x[, c(3, 1)])
# No need to special-case non-ASCII chars any more

Still needs checking; e.g., somewhat oddly, the total number of annotations is now 17074 (it was 17381 before).
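
As the comment in the snippet suggests, selecting columns by name is more robust than selecting by position. Below is a minimal sketch of that idea, not the package's actual build script. The column names `geneId`, `diseaseId`, `diseaseName` and `score` are assumptions about the DisGeNET 5 file layout and should be checked against `names(x)`. The helper `build_dgn_maps` and its `min_score` argument are hypothetical, included only to show where a filter on weak associations could go. The toy data frame stands in for the real download.

```r
# Sketch only: column names are ASSUMED, verify them with names(x) on the real file.
# build_dgn_maps() and min_score are hypothetical, not part of the package.
build_dgn_maps <- function(x, min_score = 0) {
  needed <- c("geneId", "diseaseId", "diseaseName", "score")
  missing_cols <- setdiff(needed, names(x))
  if (length(missing_cols) > 0) {
    # Fail loudly if a future release renames or drops a column
    stop("Missing expected columns: ", paste(missing_cols, collapse = ", "))
  }
  # Optionally drop associations below a score threshold
  x <- x[x$score >= min_score, , drop = FALSE]
  list(
    d2n = unique(x[, c("diseaseId", "diseaseName")]),  # disease ID -> name
    d2g = unique(x[, c("diseaseId", "geneId")])        # disease ID -> gene
  )
}

# Toy stand-in for the real table (made-up IDs, names and scores)
toy <- data.frame(
  geneId      = c(1L, 1L, 2L, 2L),
  diseaseId   = c("D1", "D2", "D1", "D1"),
  diseaseName = c("Disease one", "Disease two", "Disease one", "Disease one"),
  score       = c(0.30, 0.05, 0.10, 0.10),
  stringsAsFactors = FALSE
)

maps <- build_dgn_maps(toy)
filtered <- build_dgn_maps(toy, min_score = 0.1)
```

Comparing `nrow(maps$d2g)` with `nrow(filtered$d2g)` on the real data would show how many annotations a given threshold removes, which may also help explain differences in totals between releases.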

@GuangchuangYu
Member

If you figure it out, a PR is welcome.

GuangchuangYu pushed a commit that referenced this issue Dec 10, 2023