Q: Question regarding low peak numbers #76

AntonioAhn · 2020-10-11T21:24:58Z

Thank you for HMMRATAC. I would love to use this tool but currently, I am having issues with low peak numbers (1262 peaks). This is low in comparison to peaks found with MACS2 (83610 peaks). I have paired-end ATACseq data derived from a cancer cell line.

I ran the tool with:
java -jar HMMRATAC.jar -b ATAC.sorted.bam -i ATAC.sorted.bam.bai -g genome.info -o out_HMM

The model.log shows that the open state (state 2) has the highest emission parameters for all 4 signal components, the nucleosome state (state 1) has the second-highest emission for all signal components and the background state (state 0) has the lowest. I was wondering if my model shows something else wrong that is giving me low peak numbers?

I have also run HMMRATAC with a more stringent setting for creating the model: by increasing -u and -l however this does not change the peak numbers. For example, running with -u 40 -l 20 gave me 1254 peaks.

Would you be able to give some advice on what the issue could be and also how to correct the problem?
I have also included a figure showing the fragment size distribution in case this may help.

The model log with default settings shows:

Fragment Expectation Maximum Done
Mean 50.0 StdDevs 20.0
Mean 177.86633552152415 StdDevs 51.64088623246107
Mean 385.458734684104 StdDevs 75.7992181115244
Mean 652.0928097547281 StdDevs 75.62727395159729
ScalingFactor 21.994759
Training Regions found and Zscore regions for exclusion found
Training Fragment Pileup completed
Kmeans Model:
HMM with 3 state(s)

State 0
Pi: 0.3333333333333333
Aij: 0.333 0.333 0.333
Opdf: Multi-variate Gaussian distribution --- Mean: [ 0.021 0.03 0.006 0.011 ]

State 1
Pi: 0.3333333333333333
Aij: 0.333 0.333 0.333
Opdf: Multi-variate Gaussian distribution --- Mean: [ 0.06 0.13 0.203 0.052 ]

State 2
Pi: 0.3333333333333333
Aij: 0.333 0.333 0.333
Opdf: Multi-variate Gaussian distribution --- Mean: [ 0.206 0.381 0.424 0.169 ]

Model created and refined. See test.model
Model:
HMM with 3 state(s)

State 0
Pi: 0.3333333333333333
Aij: 0.968 0.015 0.017
Opdf: Multi-variate Gaussian distribution --- Mean: [ 0 0 0 0 ]

State 1
Pi: 0.3333333333333333
Aij: 0.017 0.955 0.028
Opdf: Multi-variate Gaussian distribution --- Mean: [ 0.024 0.085 0.192 0.059 ]

State 2
Pi: 0.3333333333333333
Aij: 0.022 0.032 0.946
Opdf: Multi-variate Gaussian distribution --- Mean: [ 0.144 0.232 0.187 0.066 ]

The text was updated successfully, but these errors were encountered:

EvanTarbell · 2020-11-30T14:32:35Z

It seems like the issue is with the third signal track. In state 2, it is 0.187 while in state 1 it is 0.192.
I would try even more stringent -u and -l, for instance -u 50 and -l 25.
You can also lower the value for --threshold, from 30 to something smaller to increase the peak number - to say 20 or 10. But you will really want to find a set of -u and -l values that correct that issue with the third signal track. That is the likely culprit

AntonioAhn added the General Question label Oct 11, 2020

AntonioAhn changed the title Q: Q: Question regarding low peak numbers Oct 11, 2020

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Q: Question regarding low peak numbers #76

Q: Question regarding low peak numbers #76

AntonioAhn commented Oct 11, 2020

EvanTarbell commented Nov 30, 2020

Q: Question regarding low peak numbers #76

Q: Question regarding low peak numbers #76

Comments

AntonioAhn commented Oct 11, 2020

EvanTarbell commented Nov 30, 2020