Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Q: Peaks are too long and peak number is low #85

Open
WJH58 opened this issue Apr 27, 2021 · 3 comments
Open

Q: Peaks are too long and peak number is low #85

WJH58 opened this issue Apr 27, 2021 · 3 comments

Comments

@WJH58
Copy link

WJH58 commented Apr 27, 2021

Describe the problem
Hi thanks for this tool!
I got very low number (~200) of peaks for each sample by hmmratac compared with the number of those peaks (~2000) called by macs2.

What solutions I tried
I tried more stringent -u and -l (e.g. l=25, u=40; l=25, u=50; l=25, u=60; l=25, u=100) however I still only get roughly 200 peaks. When -u exceeds 50, the training set will fail. Therefore, I sticked to l=25, u=40 and l=25, u=50.

This is one of the command line I tried:

HMMRATAC --bedgraph true -Xmx46G -b temp/sort/KO1.sorted.bam -i temp/sort/KO1.sorted.bam.bai -g genome.info         -o KO1 -l 25 -u 40 --window 25000 -z 60  

This is one of my log file:

Arguments Used:
--bedgraph      true
-b      temp/sort/KO1.sorted.bam
-i      temp/sort/KO1.sorted.bam.bai
-g      genome.info
-o      KO1
-l      25
-u      40
--window        25000
Fragment Expectation Maximum Done
Mean    50.0    StdDevs 20.0
Mean    198.1872335887453       StdDevs 35.03023353787695
Mean    355.31477875971484      StdDevs 48.17917744848093
Mean    627.1925932441502       StdDevs 75.85766273729656
ScalingFactor   18.001187
Training Regions found and Zscore regions for exclusion found
Training Fragment Pileup completed
Kmeans Model:
HMM with 3 state(s)

State 0
  Pi: 0.3333333333333333
  Aij: 0.333 0.333 0.333
  Opdf: Multi-variate Gaussian distribution --- Mean: [ 0.016 0.057 0.056 0.005 ]

State 1
  Pi: 0.3333333333333333
  Aij: 0.333 0.333 0.333
  Opdf: Multi-variate Gaussian distribution --- Mean: [ 0.139 0.521 0.513 0.068 ]

State 2
  Pi: 0.3333333333333333
  Aij: 0.333 0.333 0.333
  Opdf: Multi-variate Gaussian distribution --- Mean: [ 0.518 1.908 2.03 0.335 ]

Model created and refined. See KO1.model
Model:
HMM with 3 state(s)

State 0
  Pi: 0.3333333333333333
  Aij: 0.987 0.008 0.005
  Opdf: Multi-variate Gaussian distribution --- Mean: [ 0 0 0 0 ]

State 1
  Pi: 0.3333333333333333
  Aij: 0.036 0.948 0.016
  Opdf: Multi-variate Gaussian distribution --- Mean: [ 0.063 0.262 0.049 0 ]

State 2
  Pi: 0.3333333333333333
  Aij: 0.01 0.007 0.983
  Opdf: Multi-variate Gaussian distribution --- Mean: [ 0.078 0.273 0.371 0.044 ]

I think Baum-Welch algorithm worked because Aij are not 0.333, and signals of state 2 are higher than state 1. However, I only got 197 peaks (including 85 high coverage peaks). Some peaks are very long, consist of many small peaks.
This is one of the peak called by hmmratac:
KO1_peak

Could you please help me solve this problem?

@EvanTarbell
Copy link
Collaborator

This seems like a threshold problem. It might be that for this dataset, the default thresholds for reporting peaks might be too stringent (for both HMMRATAC and MACS). Try lowering the threshold using the -threshold option. The current default is 30, but you might want to try 20 or 10. This should increase the number of peaks called. The model seems fine, as you pointed out, so you wont need to play with -u / -l anymore.

@WJH58
Copy link
Author

WJH58 commented Apr 27, 2021

Thank you for your quick response! I set -q to 10 and I got 335 peaks for KO1 and 281 peaks for KO2. Compared with previous results, these peak number increased a bit but still low. I wonder why is one peak so long?
Screenshot 2021-04-27 at 21 29 02

@SergioRodLla
Copy link

Hi @WJH58,
I wonder if you have gained more insights regarding this. How did you generate the bigwig file you used for IGV? Is this better than using the gappedPeak file that HMMRATAC outputs?
Thank you

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants