Add frequency to PomBase gene to phenotype transform #647
I don't think it is worth including the fission yeast penetrance and specificity extensions in Monarch. These are probably only really useful to fission yeast researchers working on these genes.

I misunderstood what the frequencies referred to. I thought that multiple annotations to the same phenotype were going to be collapsed and a "frequency" assigned like a "tally". For instance, in cdc2 there are 387 phenotypes, but many of the annotations are identical (from different sources).

Is the "frequency" column intended to represent the frequency in a population? If so, it might be better to call it penetrance to be unambiguous.

The extensions in column 17 with the "assayed using" qualifier might be more useful because these link to the other gene entities that the mutant affects (making connections between other entities in the knowledge graph). A biological example might be that gene A, when mutated, affects the localization, transcript level, or modification of gene B. These could be useful for networks because >70% of fission yeast genes have human orthologs.

I once sent an e-mail describing the aspects of fission yeast phenotype data that would be most useful for informing human biology and hence for display in Monarch. I will see if I can find it. I'm happy to meet up and discuss what might be most useful for Monarch with you @cmungall @monicacecilia

Sorry, this ticket is now about multiple things!
Anyway, if you do decide to use penetrance, these 3 will be fixed in tomorrow's export file: high,20 (fixed to 20 (%)
Column 15 in the phaf format is described as:
(the numbers preceding values below are counts)
The mapping to FYPO_EXT looks fairly clear here for these qualifier names:
Less clear for these:
The FYPO_EXT definitions themselves don't give frequency ranges. For HPO frequency qualifiers, our sorting function takes the low value of the defined ranges, but I'm not sure how I would map these to numeric values for sorting.
There are numerical ranges defined as well, some examples:
For consistency with HPO range qualifier behavior, I assume these would sort on the low value.
For sorting approximate frequencies, I would probably just strip the `~` and continue sorting on the low value (~7580 looks like it's meant to be ~75-80?).
Finally, there are greater than and less than. I assume for the sake of sorting, we would just want to strip the `>` or `<` and alter the value slightly so that ">80" would sort above "80".

cc: @ValWood
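
As a rough sketch of the sorting rules proposed above (sort ranges on the low value, strip `~`, and nudge `>` / `<` values slightly), something like the following might work. The function name `frequency_sort_key` and the `EPSILON` offset are hypothetical and not part of the existing transform; the input strings are illustrative only, and named qualifiers such as "high" are deliberately not handled here.

```python
import re

# Hypothetical nudge so that ">80" sorts just above "80" and "<80" just below.
EPSILON = 0.01


def frequency_sort_key(value: str) -> float:
    """Return a numeric sort key for a percentage-style frequency string."""
    text = value.strip().rstrip("%").strip()
    offset = 0.0
    if text.startswith("~"):
        # Approximate values: drop the marker and sort as usual.
        text = text[1:]
    elif text.startswith(">"):
        text = text[1:]
        offset = EPSILON
    elif text.startswith("<"):
        text = text[1:]
        offset = -EPSILON
    # For a range such as "75-80", sort on the low value.
    low = re.split(r"\s*-\s*", text.strip(), maxsplit=1)[0]
    # Raises ValueError for non-numeric input (e.g. a named qualifier).
    return float(low) + offset


values = ["80", ">80", "~75-80", "<5", "10-20"]
print(sorted(values, key=frequency_sort_key))
```

A malformed value like "~7580" would parse as 7580 under this sketch, so it would still need to be corrected in the export rather than in the sort.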