[ENH] exclude dbgap samples when using search #112

Open
Maarten-vd-Sande opened this issue Feb 3, 2021 · 4 comments
Labels
enhancement New feature or request

Comments

@Maarten-vd-Sande
Contributor

Maarten-vd-Sande commented Feb 3, 2021

Is your feature request related to a problem? Please describe.
Not sure if it is already implemented, but would it be possible to exclude private/dbgap samples when using the search functionality?

If it is not yet supported, it would be very useful functionality for me :)

@Maarten-vd-Sande Maarten-vd-Sande added the enhancement New feature or request label Feb 3, 2021
@saketkc
Owner

saketkc commented Feb 4, 2021

This is a good idea! Paging @bscrow in case he is willing to help; otherwise I will add it to my to-do list.

@bscrow
Collaborator

bscrow commented Feb 5, 2021

Based on the current implementation, we can exclude private samples from the search by adding "AND Public[Access]" to the search text. For example, "bacterium AND nanopore" -> "bacterium AND nanopore AND Public[Access]". dbgap samples can be excluded by appending "NOT cluster dbgap[Properties]".

If this is a frequently used feature, maybe we can implement a separate flag for it.
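Until such a flag exists, the appending described above could be wrapped in a small helper. This is only a sketch: `add_access_filters` and its `public_only` / `exclude_dbgap` parameters are hypothetical names, not part of pysradb, and the two filter strings are copied as written in this comment.

```python
# Filter strings taken verbatim from the comment above.
PUBLIC_FILTER = "AND Public[Access]"
DBGAP_FILTER = "NOT cluster dbgap[Properties]"


def add_access_filters(query, public_only=False, exclude_dbgap=False):
    """Append the requested access filters to an SRA search string.

    Hypothetical helper (not a pysradb function): it only does string
    concatenation and never contacts SRA.
    """
    parts = [query.strip()]
    if public_only:
        parts.append(PUBLIC_FILTER)
    if exclude_dbgap:
        parts.append(DBGAP_FILTER)
    return " ".join(parts)


print(add_access_filters("bacterium AND nanopore", public_only=True))
```

The returned string would then be used as the search text in place of the original query. Setting both options appends both filters, public-only first.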

@Maarten-vd-Sande
Contributor Author

Maarten-vd-Sande commented Feb 5, 2021

My guess would be that this is something many people would be interested in. Does this then already work with the Python API? Something like:

from pysradb.search import SraSearch

instance = SraSearch(query="h3k27ac AND Public[Access]", selection="chip", publication_date="01-01-2015:01-01-2021", platform="illumina", organism="Homo sapiens", verbosity=2, return_max=1_000_000)
instance.search()

When I use this query I get fewer experiments, which sort of indicates it works?

P.S. So far the search functionality works really nicely, thanks for the feature @bscrow

@bscrow
Collaborator

bscrow commented Feb 5, 2021

Yep, that will work: it keeps only entries with publicly available data. From what I can see on SRA, that is slightly different from excluding dbgap samples, as some dbgap samples are classified as public.

I'm really glad that you like the search functionality! I'll try to see if I can implement the option to exclude dbgap samples from a search.
