Control output value of orthologs function #24

Open
janstrauss1 opened this issue Sep 16, 2020 · 1 comment

janstrauss1 commented Sep 16, 2020

Hi @HajkD ,

I'm using the orthologs function with ortho_detection = "RBH" to detect orthologs for a query_file containing 8 protein fasta sequences in multiple subject_files.

For easier downstream parsing, I would like the orthologs function to output a row for every query_id, including queries that did not produce any hits (i.e. fill those rows of the result table with NA).

I would appreciate any help on how to achieve this.

Many thanks in advance!

Jan

Member

HajkD commented Sep 16, 2020

Hi Jan,

Many thanks for contacting me about this.

Would it be possible to create a small example with 3 query sequences and 2 times 5 subject_sequences,
so that I can be sure that I understood your request correctly?

So you would like to retain query_ids in a data.frame even if they didn't produce subject hits (encoded by NA lines)?

If yes, I assume that simply doing a dplyr::full_join() by query_id between the initial input query_ids (stored as a data.frame) and the result table generated by orthologr::orthologs() is not sufficient? If it is not, could you specify what you had in mind?
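
For illustration, the join described above could look roughly like the following. This is only a sketch under assumptions: the `hits` table is a hand-made stand-in for the output of orthologr::orthologs(), and the column names `query_id`, `subject_id` and `evalue` as well as the IDs are hypothetical, so they would need to be adapted to the real result table.

```r
library(dplyr)

# Hypothetical query IDs; in practice these would be taken from the query FASTA file.
queries <- data.frame(
  query_id = c("q1", "q2", "q3"),
  stringsAsFactors = FALSE
)

# Hand-made stand-in for the table returned by orthologr::orthologs();
# the real column names and contents may differ (assumption).
hits <- data.frame(
  query_id   = c("q1", "q3"),
  subject_id = c("s4", "s2"),
  evalue     = c(1e-50, 3e-20),
  stringsAsFactors = FALSE
)

# Keep every query ID; queries without a hit get NA in the result columns.
result <- dplyr::full_join(queries, hits, by = "query_id")
print(result)
```

Here `q2` has no hit and therefore ends up as a row with NA in `subject_id` and `evalue`. With several subject files, the same join could be repeated once per result table.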

I hope this helps and goes in the direction you had in mind.

Cheers,
Hajk
