New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Probabilities greater than 1 for empirical taxon sampling with dnEpisodicBirthDeath #382
Comments
Thanks for the analysis. I am not familiar with this code, but I can see how it could be unstable. You could get an even better idea what is going on by printing the value of the variables when something goes wrong. For example:
|
For the issue that you mentioned, one improvement could be to replace:
with
I think there is a but in lnProbSurvival where it does You would then make the equivalent change for |
You could then replace |
Thanks for the advice, I was optimistic that it might work but it doesn't seem to be enough.
I get the following values:
|
Hmm... so I'm guessing that
If that is true, then we could write something like:
It might then be possible to express Also, I suspect that we need to have What do you think? If that checks out, then I have an idea. If it does not check out, then I have a different idea. |
I tried the trick you suggest but the decomposition only seems to work for |
Drat. I guess that means that the probability of surviving from last_event_time to present_time is not independent of the event of surviving from time 0 to last_event_time. OK, so if that didn't work, one possibility is that you could say that if the difference between p_0_t and p_0_T is really small then it should be truncated to 0, since the difference could be numerical error. So, something like if ( log_F_t > 0.0 )
{
if ( log_F_t < 1.0e-11 )
{
p_0_T = p_0_t;
log_F_t = 0;
}
else
throw RbException("Problem in computing the probability of missing species in BDP.");
} That is obviously kind of a hack. Really, someone like @hoehna should probably take a look at this. |
Yes, I've tried doing something like that, it runs a bit longer but it always ends up with an error at some point. Sometimes |
looking at the code, it seems we add |
Wow, that's unexpected but it seems to work, thanks Joëlle!
|
I guess that is correct, although it was took a while to see that. I guess But I'm not sure that this is right:
The original criterion throws an error if
Probably someone should read this paper to figure out what is going on here? It would be helpful to know the meaning of |
The formula appears in this paper: https://journals.plos.org/plosone/article/file?id=10.1371/journal.pone.0084184&type=printable (eq:6), and is briefly explained at the end of this tutorial: https://revbayes.github.io/tutorials/divrate/sampling.html. The quantity Nevertheless, I agree that this solution risks masking an error/approximation in the calculation of |
Describe the bug
The code for empirical taxon sampling (clade-specific rho) with
dnEpisodicBirthDeath
throws an unexpected exception: "Problem in computing the probability of missing species in BDP."To Reproduce
From the tutorial Diversification Rate Estimation with Missing Taxa, download mcmc_EBD.Rev and primates_tree.nex.
Then, follow the tutorial instruction in the "Empirical Taxon Sampling" section to modify
mcmc_EBD.Rev
(btw there is a typo, "Varecia_variegata_variegata" should be "Varecia_variegata"), or directly use mcmc_EBD_CSrho.Rev.txt.Run
mcmc_EBD_CSrho.Rev
.Expected behavior
Running the MCMC.
Computer info
MacOS Ventura 13.5.2 (Apple M1 Pro)
RevBayes version (1.2.1), and I also tried with the
developer
branch.Additional context
The error arises in the
computeLnProbabilityTimes
function (inBirthDeathProcess.cpp
).I've observed that sometimes pSurvival values can be very low (down to e-300) and exp(rateIntegral) very high (up to exp(600)), so I suppose this can lead to numerical instabilities when p_0_T and p_0_t are expected to be very close.
The calculations are based on "Inferring Speciation and Extinction Rates under Different Sampling Schemes" and Likelihood Inference of Non-Constant Diversification Rates with Incomplete Taxon Sampling (@hoehna).
The text was updated successfully, but these errors were encountered: