"IndexError: list index out of range" while using --pcatype fithic #64

Open
Lucas446 opened this issue Mar 31, 2023 · 11 comments
Labels
possible bug Further information is requested about possible bug.

Comments

@Lucas446

Hi,

I am running into an error while running the dcHiC fithic step:

`Rscript /Users/tlucas/dcHiC/dchicf.r --file input.St10_St14.txt --pcatype fithic --dirovwt T --diffdir catSt10_vs_catSt14 --maxd 10e6 --fithicpath '/Users/tlucas/.pyenv/shims/fithic' --pythonpath '/Users/tlucas/.pyenv/shims/python3'
Finding significant loops from intra sample  St14 St10  replicates
[1] "folder exists"
Fithic file already exists for  NB_St14_20Kb , skipping
[1] "folder exists"
Fithic file already exists for  NB_St10_20Kb , skipping
fithic -i DifferentialResult/catSt10_vs_catSt14/fithic_run/NB_St14_20Kb_fithic/interactions.txt.gz -f DifferentialResult/catSt10_vs_catSt14/fithic_run/NB_St14_20Kb_fithic/fragments.txt.gz -t /Users/tlucas/dcHiC/RESULTS/biases/NB_St14_20Kb.biases.gz -U 10000000 -o DifferentialResult/catSt10_vs_catSt14/fithic_run/NB_St14_20Kb_fithic/fithic_result -r 20000 


GIVEN FIT-HI-C ARGUMENTS
=========================
Reading fragments file from: DifferentialResult/catSt10_vs_catSt14/fithic_run/NB_St14_20Kb_fithic/fragments.txt.gz
Reading interactions file from: DifferentialResult/catSt10_vs_catSt14/fithic_run/NB_St14_20Kb_fithic/interactions.txt.gz
Output path being used from DifferentialResult/catSt10_vs_catSt14/fithic_run/NB_St14_20Kb_fithic/fithic_result
Fixed size option detected... Fast version of FitHiC will be used
Resolution is 20.0 kb
Reading bias file from: /Users/tlucas/dcHiC/RESULTS/biases/NB_St14_20Kb.biases.gz
The number of spline passes is 1
The number of bins is 100
The number of reads required to consider an interaction is 1
The name of the library for outputted files will be FitHiC
Upper Distance threshold is 10000000
Lower Distance threshold is 0
Only intra-chromosomal regions will be analyzed
Lower bound of bias values is 0.5
Upper bound of bias values is 2
All arguments processed. Running FitHiC now...
=========================


Reading the contact counts file to generate bins...
Interactions file read. Time took 2.4002461433410645
Fragments file read. Time took 0.014471769332885742
Traceback (most recent call last):
  File "/Users/tlucas/.pyenv/versions/3.7.3/bin/fithic", line 11, in <module>
    load_entry_point('fithic==2.0.8', 'console_scripts', 'fithic')()
  File "/Users/tlucas/.pyenv/versions/3.7.3/lib/python3.7/site-packages/fithic/fithic.py", line 327, in main
    biasDic = read_biases(biasFile)
  File "/Users/tlucas/.pyenv/versions/3.7.3/lib/python3.7/site-packages/fithic/fithic.py", line 808, in read_biases
    chrom=words[0]; midPoint=int(words[1]); bias=float(words[2])
IndexError: list index out of range
fithic -i DifferentialResult/catSt10_vs_catSt14/fithic_run/NB_St10_20Kb_fithic/interactions.txt.gz -f DifferentialResult/catSt10_vs_catSt14/fithic_run/NB_St10_20Kb_fithic/fragments.txt.gz -t /Users/tlucas/dcHiC/RESULTS/biases/NB_St10_20Kb.biases.gz -U 10000000 -o DifferentialResult/catSt10_vs_catSt14/fithic_run/NB_St10_20Kb_fithic/fithic_result -r 20000 

` 

At some point the code tries to access a list index that is out of range. Do you have any idea where this could be coming from?

Thanks a lot,
Best,

@ay-lab
Owner

ay-lab commented Mar 31, 2023

There seems to be an issue with the bias file NB_St14_20Kb.biases.gz.
I want to check whether it contains any unusual entries. Can you share the file with us?

@Lucas446
Author

Lucas446 commented Apr 4, 2023

These are the bias files I use:

NB_St10_20Kb.biases.gz
NB_St14_20Kb.biases.gz

Thank you

@ay-lab
Owner

ay-lab commented Apr 5, 2023

After looking at the issue more carefully, I found that the error is related to fithic.
Please have a look at this issue from the fithic repository and try the solution described there, or at least let me know whether your file has the empty lines:
ay-lab/fithic#54
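
The traceback above fails on `chrom=words[0]; midPoint=int(words[1]); bias=float(words[2])`, which needs three fields on every line of the bias file. A minimal sketch of a pre-check follows. It assumes `words` comes from a whitespace split of each line (the split itself is not shown in the traceback), and `find_malformed_bias_lines` is a hypothetical helper name, not part of fithic or dcHiC.

```python
def find_malformed_bias_lines(lines):
    """Return (line_number, text) for every line with fewer than three fields.

    Assumption: the parser splits each line on whitespace and then reads
    fields 0, 1 and 2, as the traceback suggests.
    """
    bad = []
    for number, line in enumerate(lines, start=1):
        text = line.rstrip("\n")
        words = text.split()
        if len(words) < 3:
            # An empty line gives [], a one-column line gives one word;
            # both would raise IndexError in the statement from the traceback.
            bad.append((number, text))
    return bad


# Illustrative input: two valid rows, one empty line, one single-column row.
sample = [
    "2L\t10000\t0.7075591964989174\n",
    "\n",
    "2L\t30000\t1.2062730976455918\n",
    "7.779782869173684778e-01\n",
]
malformed = find_malformed_bias_lines(sample)
print(malformed)
```

For a real file, the same function could be fed the lines of `gzip.open(path, "rt")`; an empty result would suggest the cause lies elsewhere.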

@Lucas446
Author

Lucas446 commented Apr 7, 2023

Hi,

I looked at ay-lab/fithic/issues/54 and I think my biases file doesn't have the right format.

Format from ay-lab/fithic/issues/54 :

zcat fat_5000.fithic.bias.gz|head
NC_052532.1	2500	0.447834
NC_052532.1	7500	0.098977
NC_052532.1	12500	0.150248
NC_052532.1	17500	0.374007
NC_052532.1	22500	0.563625

My biases format (from HiC-Pro ICE output):

gzcat /Users/tlucas/dcHiC/RESULTS/biases/NB_St10_20Kb.biases.gz | head
7.779782869173684778e-01
1.727070717027215929e+00
9.843715269101076526e-01
1.428104165677709148e+00
1.546911243345023834e+00

Do I have to convert the HiC-Pro biases format to the fithic format using hicpro2fithic.py?

Thanks! :)

@ay-lab
Owner

ay-lab commented Apr 7, 2023

Yes!
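
For illustration only, the difference between the two formats shown above can be sketched as pairing each one-column bias value with the chromosome and midpoint of its bin. This is not the code of hicpro2fithic.py. It assumes the bias values are listed in the same order as the bins, and the function name `biases_to_fithic` and the in-memory bin list are hypothetical.

```python
def biases_to_fithic(bins, bias_values):
    """Build three-column rows (chrom, midpoint, bias) from per-bin biases.

    bins        -- list of (chrom, start, end) tuples, one per bin
    bias_values -- list of floats, assumed to be in the same order as bins
    """
    if len(bins) != len(bias_values):
        raise ValueError(
            f"{len(bins)} bins but {len(bias_values)} bias values; "
            "the two inputs must describe the same bins"
        )
    rows = []
    for (chrom, start, end), bias in zip(bins, bias_values):
        midpoint = (start + end) // 2
        rows.append(f"{chrom}\t{midpoint}\t{bias}")
    return rows


# Two 20 kb bins on chromosome 2L with made-up example values.
example_bins = [("2L", 0, 20000), ("2L", 20000, 40000)]
example_biases = [0.7075591964989174, 1.2062730976455918]
converted = biases_to_fithic(example_bins, example_biases)
print("\n".join(converted))
```

The length check is there because a silent mismatch between bins and bias values would shift every bias onto the wrong bin.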

@Lucas446
Author

OK, thanks. I get errors when using hicpro2fithic.py; I will post them on the HiC-Pro GitHub.

@ay-lab
Owner

ay-lab commented Apr 10, 2023

Please post here as well. We developed that script.

@ay-lab
Owner

ay-lab commented Apr 10, 2023

Please post to the fithic github repository actually.

@Lucas446
Author

Here is my fithic error:

python3 fithic -i DifferentialResult/diff_analysis/fithic_run/NB_St14_r1_20Kb_fithic/interactions.txt.gz -f DifferentialResult/diff_analysis/fithic_run/NB_St14_r1_20Kb_fithic/fragments.txt.gz -t /Users/tlucas/dcHiC/RESULTS/biases/NB_St14_r1_20Kb.biases.gz -U 10000000 -o DifferentialResult/diff_analysis/fithic_run/NB_St14_r1_20Kb_fithic/fithic_result -r 20000 
  File "/Users/tlucas/.pyenv/shims/fithic", line 3
    [ -n "$PYENV_DEBUG" ] && set -x
                      ^
SyntaxError: invalid syntax

inputs:

biases

dyn-129-236-163-31:RESULTS tlucas$ gzcat /Users/tlucas/dcHiC/RESULTS/biases/NB_St14_r1_20Kb.biases.gz | head
2L	10000	0.7075591964989174

2L	30000	1.2062730976455918

2L	50000	0.9499451013956518

2L	70000	1.216953376795569

2L	90000	1.3021749178375797


fragments

dyn-129-236-163-31:RESULTS tlucas$ gzcat /Users/tlucas/dcHiC/RESULTS/DifferentialResult/diff_analysis/fithic_run/NB_St14_r1_20Kb_fithic/fragments.txt.gz | head
2L	0	10000	371	1
2L	0	30000	734	1
2L	0	50000	629	1
2L	0	70000	813	1
2L	0	90000	941	1
2L	0	110000	888	1
2L	0	130000	807	1
2L	0	150000	847	1
2L	0	170000	624	1
2L	0	190000	663	1

interactions

dyn-129-236-163-31:RESULTS tlucas$ gzcat /Users/tlucas/dcHiC/RESULTS/DifferentialResult/diff_analysis/fithic_run/NB_St14_r1_20Kb_fithic/interactions.txt.gz | head
2L	10000	2L	10000	48
2L	10000	2L	30000	68
2L	10000	2L	50000	21
2L	10000	2L	70000	17
2L	10000	2L	90000	17
2L	10000	2L	110000	8
2L	10000	2L	130000	7
2L	10000	2L	150000	5
2L	10000	2L	170000	2
2L	10000	2L	190000	2

Thanks a lot!
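
The bias listing above appears to show an empty line after each record. If those empty lines are really in the file (and not an artifact of pasting), one possible workaround is to write a cleaned copy before running fithic. The sketch below uses only the Python standard library; `drop_blank_lines` and the temporary file names are hypothetical.

```python
import gzip
import os
import tempfile


def drop_blank_lines(src_path, dst_path):
    """Copy a gzipped text file, skipping empty or whitespace-only lines.

    Returns the number of lines kept.
    """
    kept = 0
    with gzip.open(src_path, "rt") as src, gzip.open(dst_path, "wt") as dst:
        for line in src:
            if line.strip():
                # Make sure every kept record ends with exactly one newline.
                dst.write(line.rstrip("\n") + "\n")
                kept += 1
    return kept


# Small self-contained demonstration on a temporary file.
with tempfile.TemporaryDirectory() as tmp:
    raw = os.path.join(tmp, "raw.biases.gz")
    clean = os.path.join(tmp, "clean.biases.gz")
    with gzip.open(raw, "wt") as handle:
        handle.write("2L\t10000\t0.7075591964989174\n\n")
        handle.write("2L\t30000\t1.2062730976455918\n\n")
    kept_count = drop_blank_lines(raw, clean)
    with gzip.open(clean, "rt") as handle:
        cleaned_lines = handle.read().splitlines()

print(kept_count, cleaned_lines)
```

Writing to a new path rather than overwriting the input keeps the original file available for comparison.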

@Lucas446
Author

OK, I managed to fix the syntax error by replacing the python3 path with "bash" at line 1495 of the dcHIC.r script.

previous:
cmd <- paste0(python_path," ",fithic_path," -i ",folder,"/interactions.txt.gz -f ",folder,"/fragments.txt.gz -t ",bias," -U ",as.integer(u)," -o ",folder,"/fithic_result -r ",as.integer(resolution))

fix:
cmd <- paste0("bash"," ",fithic_path," -i ",folder,"/interactions.txt.gz -f ",folder,"/fragments.txt.gz -t ",bias," -U ",as.integer(u)," -o ",folder,"/fithic_result -r ",as.integer(resolution))
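
The SyntaxError reported earlier points at `[ -n "$PYENV_DEBUG" ] && set -x`, which is shell syntax, so the file passed to python3 is presumably a shell script and not a Python script. A hedged sketch of how the interpreter could be checked from the first line of a launcher file is shown below. `interpreter_from_shebang` is a hypothetical helper, and the temporary file only imitates such a launcher.

```python
import os
import tempfile


def interpreter_from_shebang(path):
    """Return the interpreter name from a file's '#!' line, or None."""
    with open(path, "rb") as handle:
        first = handle.readline().decode("utf-8", errors="replace").strip()
    if not first.startswith("#!"):
        return None
    parts = first[2:].split()
    if not parts:
        return None
    # "#!/usr/bin/env bash" names the interpreter in the second word,
    # "#!/bin/bash" names it in the first.
    if os.path.basename(parts[0]) == "env" and len(parts) > 1:
        return os.path.basename(parts[1])
    return os.path.basename(parts[0])


# Imitation of a shell launcher, written to a temporary file for the demo.
with tempfile.TemporaryDirectory() as tmp:
    launcher = os.path.join(tmp, "fithic")
    with open(launcher, "w") as handle:
        handle.write('#!/usr/bin/env bash\nset -e\n[ -n "$PYENV_DEBUG" ] && set -x\n')
    detected = interpreter_from_shebang(launcher)

print(detected)
```

If the result is a shell, prefixing the command with a Python interpreter will fail in the way shown in the error above.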

Now I still have the out-of-range issue, even when using the output of hicpro2fithic:

bash /Users/tlucas/.pyenv/shims/fithic -i DifferentialResult/diff_analysis/fithic_run/NB_esc_20Kb_fithic/interactions.txt.gz -f DifferentialResult/diff_analysis/fithic_run/NB_esc_20Kb_fithic/fragments.txt.gz -t /Users/tlucas/dcHiC/RESULTS/biases/NB_esc_20Kb.biases.gz -U 10000000 -o DifferentialResult/diff_analysis/fithic_run/NB_esc_20Kb_fithic/fithic_result -r 20000 


GIVEN FIT-HI-C ARGUMENTS
=========================
Reading fragments file from: DifferentialResult/diff_analysis/fithic_run/NB_esc_20Kb_fithic/fragments.txt.gz
Reading interactions file from: DifferentialResult/diff_analysis/fithic_run/NB_esc_20Kb_fithic/interactions.txt.gz
Output path created DifferentialResult/diff_analysis/fithic_run/NB_esc_20Kb_fithic/fithic_result
Fixed size option detected... Fast version of FitHiC will be used
Resolution is 20.0 kb
Reading bias file from: /Users/tlucas/dcHiC/RESULTS/biases/NB_esc_20Kb.biases.gz
The number of spline passes is 1
The number of bins is 100
The number of reads required to consider an interaction is 1
The name of the library for outputted files will be FitHiC
Upper Distance threshold is 10000000
Lower Distance threshold is 0
Only intra-chromosomal regions will be analyzed
Lower bound of bias values is 0.5
Upper bound of bias values is 2
All arguments processed. Running FitHiC now...
=========================


Reading the contact counts file to generate bins...
Interactions file read. Time took 4.374690055847168
Fragments file read. Time took 0.01379704475402832
Traceback (most recent call last):
  File "/Users/tlucas/.pyenv/versions/3.7.3/bin/fithic", line 11, in <module>
    load_entry_point('fithic==2.0.8', 'console_scripts', 'fithic')()
  File "/Users/tlucas/.pyenv/versions/3.7.3/lib/python3.7/site-packages/fithic/fithic.py", line 327, in main
    biasDic = read_biases(biasFile)
  File "/Users/tlucas/.pyenv/versions/3.7.3/lib/python3.7/site-packages/fithic/fithic.py", line 808, in read_biases
    chrom=words[0]; midPoint=int(words[1]); bias=float(words[2])
IndexError: list index out of range
[1] 1
Taking input= as a system command ('gzip -dc DifferentialResult/diff_analysis/fithic_run/NB_St14_r1_20Kb_fithic/fithic_result/FitHiC.spline_pass1.res20000.significances.txt.gz') and a variable has been used in the expression passed to `input=`. Please use fread(cmd=...). There is a security concern if you are creating an app, and the app could have a malicious user, and the app is not running in a secure environment; e.g. the app is running as root. Please read item 5 in the NEWS file for v1.11.6 for more information and for the option to suppress this message.
gzip: can't stat: DifferentialResult/diff_analysis/fithic_run/NB_St14_r1_20Kb_fithic/fithic_result/FitHiC.spline_pass1.res20000.significances.txt.gz (DifferentialResult/diff_analysis/fithic_run/NB_St14_r1_20Kb_fithic/fithic_result/FitHiC.spline_pass1.res20000.significances.txt.gz.gz): No such file or directory
Error in setnames(x, value) : 
  Can't assign 7 names to a 0 column data.table
Calls: fithicformat ... colnames<- -> names<- -> names<-.data.table -> setnames
In addition: Warning message:
In data.table::fread(paste0("gzip -dc ", diffdir, "/fithic_run/",  :
  File '/var/folders/0h/h1zqy6251n1dq_nw_050nmdc0000gn/T//RtmpuHmiEu/filee5fa48801a2c' has size 0. Returning a NULL data.table.
Execution halted
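
In the log above, the data.table error appears only after fithic has already failed, because the significance file it was supposed to write does not exist. A minimal sketch of a guard that stops at the first failure is given below. It is written in Python for illustration, although the pipeline itself is an R script, and `run_and_require` is a hypothetical helper.

```python
import os
import subprocess
import sys


def run_and_require(cmd, expected_output):
    """Run cmd and fail early unless it succeeds and writes a non-empty file."""
    result = subprocess.run(cmd, capture_output=True, text=True)
    if result.returncode != 0:
        raise RuntimeError(
            f"command failed with exit status {result.returncode}: "
            f"{result.stderr.strip()}"
        )
    if not os.path.isfile(expected_output) or os.path.getsize(expected_output) == 0:
        raise RuntimeError(f"expected output is missing or empty: {expected_output}")
    return expected_output


# Demonstration with a command that always exits with status 1.
failing_cmd = [sys.executable, "-c", "import sys; sys.exit(1)"]
try:
    run_and_require(failing_cmd, "does_not_exist.txt.gz")
    message = "no error"
except RuntimeError as error:
    message = str(error)

print(message)
```

Reporting the failing command directly makes the root cause (here, the IndexError inside fithic) easier to spot than the later error about column names.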

@abhijitcbio

I see you posted this issue in the fithic repository too.
I will wait for their comments.

@ay-lab ay-lab added the possible bug Further information is requested about possible bug. label Aug 10, 2023