Error in twisst_data_smooth #12

Wei-Gao-CAS · 2020-04-28T02:23:20Z

twisst_data_smooth <- smooth.twisst(twisst_data, span_bp = 1000000, spacing = 50000)
Error in seq.default(twisst_object$pos[[i]][1], tail(twisst_object$pos[[i]], :
wrong sign in 'by' argument
Calls: smooth.twisst -> seq -> seq.default
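
For reference, base R's `seq()` raises this message whenever `from` is greater than `to` while `by` is positive. The sketch below reproduces it with made-up positions (the values are illustrative, not taken from any real data set), assuming the smoothing grid is built from the first and last position, as the truncated call in the traceback suggests.

```r
# Illustrative window positions that are NOT in increasing order
# (hypothetical values, chosen only to trigger the error).
pos <- c(1005000, 105000, 10045000, 55000)

# Build a grid from the first to the last position.
# Here from = 1005000 and to = 55000, so a positive step cannot reach 'to'.
msg <- tryCatch(
  {
    seq(pos[1], tail(pos, 1), by = 50000)
    "no error"
  },
  error = function(e) conditionMessage(e)
)
print(msg)

# With sorted positions the same call succeeds.
sorted_pos <- sort(pos)
grid <- seq(sorted_pos[1], tail(sorted_pos, 1), by = 50000)
print(length(grid))
```

If the positions within a scaffold are sorted, `from` is always the smallest value and the step has the right sign.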

simonhmartin · 2020-05-06T13:47:46Z

I have not encountered this error before. I could look into it if you share the weights file.

giyany · 2022-01-04T13:08:31Z

Hi Simon,
I experience this error too, and have not been able to plot some of the scaffolds in my data-set. Others plot fine.

I'm attaching the weights and the window data: Scaffold_15 plots fine, Scaffold_2199 does not.

Thanks
[output.run3.weights.csv](https://github.com/simonhmartin/twisst/files/7807756/output.run3.weights.csv)
output.run4.data.tsv.gz

simonhmartin · 2022-01-06T13:30:30Z

Hi giyany,
The problem appears to be that your window start and end positions do not increase consistently in the data file:

scaffold        start       end
Scaffold_1174   1000000     1010000
Scaffold_1174   100000      110000
Scaffold_1174   10040000    10050000

The R script expects each window to have a larger start and end position than the one before it. I'm not sure how this happened in your data, but you will need to correct the input files before using plot_twisst.R. If you think the results are correct and simply unordered, you could probably reorder the files quite easily in R: first infer the correct order with the order function, then use that order to reorder the rows in each file.
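
The reordering described above might look something like the following minimal sketch. The two data frames are small stand-ins for the window data and weights files (the column names `topo1` and `topo2` and all weight values are made up). It assumes both files have the same number of rows in the same order, and that windows should be sorted by `start` within each scaffold.

```r
# Toy window data in the unordered form shown above.
window_data <- data.frame(
  scaffold = rep("Scaffold_1174", 3),
  start    = c(1000000, 100000, 10040000),
  end      = c(1010000, 110000, 10050000)
)

# Toy weights with one row per window (hypothetical values).
weights <- data.frame(topo1 = c(10, 20, 30), topo2 = c(1, 2, 3))

# The two tables must line up row for row before reordering.
stopifnot(nrow(window_data) == nrow(weights))

# Infer the correct order once, by scaffold and then by start position...
ord <- order(window_data$scaffold, window_data$start)

# ...and apply the same order to both tables so rows stay matched.
window_data_sorted <- window_data[ord, ]
weights_sorted     <- weights[ord, ]

print(window_data_sorted)
print(weights_sorted)
```

Using a single index vector for both tables is what keeps each weight row attached to its window.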

Simon

simonhmartin · 2022-01-06T13:40:03Z

Actually I just realised that I've already implemented a fix for this. You can include reorder_by_start=TRUE in the import.twisst command and it will correct the data.
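
A call along these lines might look like the sketch below. It assumes `plot_twisst.R` from the twisst repository is in the working directory and reuses file names mentioned in this thread. The `weights_files` and `window_data_files` argument names are assumptions here and should be checked against your copy of the script.

```r
# File names taken from this thread; adjust to your own data.
weights_file     <- "output.run4.weights.csv"
window_data_file <- "output.run4.data.tsv.gz"

# Only run if the script and both input files are actually present.
needed <- c("plot_twisst.R", weights_file, window_data_file)
if (all(file.exists(needed))) {
  source("plot_twisst.R")
  twisst_data <- import.twisst(
    weights_files     = weights_file,      # assumed argument name
    window_data_files = window_data_file,  # assumed argument name
    reorder_by_start  = TRUE               # sort windows by start on import
  )
} else {
  message("Missing: ", paste(needed[!file.exists(needed)], collapse = ", "))
}
```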

giyany · 2022-01-06T13:58:45Z

Thanks, that makes a lot of sense. The issue was, when using reorder_by_start=TRUE, I got this error:

Error in predLoess(object$y, object$x, newx = if (is.null(newdata)) object$x else if (is.data.frame(newdata)) as.matrix(model.frame(delete.response(terms(object)), :
NA/NaN/Inf in foreign function call (arg 5)

In addition, the result of plot.twisst also seemed off, although it's hard to say for certain, so I assumed reorder_by_start may not be doing what I expected.

Now I sort the data beforehand. It was in that order because I pulled out the coordinates in the order the files appeared on the command line. There are probably better ways to do it, but apparently I'm not the only one.

simonhmartin · 2022-01-06T14:05:12Z

I see. I will look into this. Perhaps there is still a bug in how I am doing the reordering.

simonhmartin · 2022-01-06T14:06:35Z

Would you please attach the weights file (output.run3.weights.csv) again? I couldn't download it for some reason.

giyany · 2022-01-06T14:08:23Z

Happily:

output.run3.weights.csv.gz

giyany · 2022-01-06T15:13:14Z

Another note: it seems to be a function of span_bp. Maybe the data are simply too skewed, or have too many NA values, for the span, and the error is not related to the order.

simonhmartin · 2022-01-07T07:59:16Z

I couldn't recreate your error, but I got different errors due to the files having different numbers of lines (possibly because the weights are from run3 and the window data from run4?).
Anyway, I think you're right that the span needs to be set much larger for your data, probably at least 10 times the window size. You've used broad windows of 10 kb for your trees. This is not recommended for most organisms, because in most species the span of distinct genealogies across the genome will be much less than 10 kb. I think this is reflected in the fact that your data are strongly skewed toward one topology. In the original paper, we showed that this would happen if the tree spans are too large.
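
As a rough sanity check on that rule of thumb, one could compare the smoothing span against the window size taken from the window data. The sketch below uses the three example rows quoted earlier in this thread. The helper name `span_is_wide_enough` and the two test spans are hypothetical, and the factor of 10 is the suggestion above rather than a hard limit.

```r
# Example windows quoted earlier in this thread (10 kb each).
window_data <- data.frame(
  start = c(100000, 1000000, 10040000),
  end   = c(110000, 1010000, 10050000)
)

# Hypothetical helper: is span_bp at least `factor` times the typical window size?
span_is_wide_enough <- function(span_bp, window_data, factor = 10) {
  window_size <- median(window_data$end - window_data$start)
  span_bp >= factor * window_size
}

# Two illustrative spans: one narrower and one wider than 10 windows.
narrow_ok <- span_is_wide_enough(50000, window_data)
wide_ok   <- span_is_wide_enough(1000000, window_data)

print(c(narrow = narrow_ok, wide = wide_ok))
```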

giyany · 2022-01-07T09:57:56Z

output.run4.weights.csv

Yes, I attached the wrong file - sorry about that.
If you still want to look, here is the correct file.

Thanks a lot for this useful input, I'll re-do this considering just SNP numbers like the paper recommends.

simonhmartin · 2022-01-07T10:50:53Z

Great. Yes, it seems to work fine with run4 after increasing the smoothing span.

giyany linked a pull request Jan 5, 2022 that will close this issue

take min/max values instead of first/last of seq function #27

Open

simonhmartin closed this as completed Jan 6, 2022

simonhmartin reopened this Jan 6, 2022
