Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Error in if (sum(idx) == 0) { : missing value where TRUE/FALSE needed #8

Open
zhuangfjnu opened this issue Jan 7, 2018 · 7 comments

Comments

@zhuangfjnu
Copy link

Hi,
When I run the centipede_data , the result show :
Error in if (sum(idx) == 0) { : missing value where TRUE/FALSE needed.
Maybe it is some problem with my bam file?

Thanks
Zhen

@slowkow
Copy link
Owner

slowkow commented Jan 8, 2018

Hi Zhen,

Thanks for opening an issue! If you could share a minimal example of code and subset of your data that reproduces this error, I can try to run it on my laptop and fix the problem.

@slowkow
Copy link
Owner

slowkow commented Mar 16, 2018

Please feel free to reopen this issue if you can provide a minimal example.

@slowkow slowkow closed this as completed Mar 16, 2018
@rdbcasillas
Copy link

Hey slowkow!

I get the same error as Zhen. Here is the snippet of code and the corresponding error:


>cen_P0_rep1 <- centipede_data(bam_file = "call-bowtie2/shard-0/execution/P0_rep1_R1.merged.bam", fimo_file = "centipede-files/P0_rep1.fimo.txt.gz", pval= 1e-4, flank_size = 100)


Parsed with column specification:
cols(
  motif_id = col_character(),
  motif_alt_id = col_character(),
  sequence_name = col_character(),
  start = col_double(),
  stop = col_double(),
  strand = col_character(),
  score = col_double(),
  `p-value` = col_double(),
  `q-value` = col_logical(),
  matched_sequence = col_character()
)

|=========================================================================| 100%  824 MB


Error in if (sum(idx) == 0) { : missing value where TRUE/FALSE needed

The code ran for couple of hours before throwing this error. So seems like something happened near the end of execution. I get the same error even if I remove the parameters "flank size" and/or "pval".

Let me know if I can provide any more information or if you want me to open a new issue.

Thanks!

@slowkow
Copy link
Owner

slowkow commented Jan 8, 2019

Hi Vatsal, thanks for sharing your code and error. That helps.

If your session is still live, it would help if you run traceback() so we can see which line in which file caused the error.

The error might come from this code (line 163):

# The reads overlapping this region.
region <- parse_region(names(bam_list)[i])
region_len <- abs(region$end - region$start) + 1
# Exclude reads with start positions outside the region.
item <- bam_list[[i]]
# take care of negative reads starting at end position
neg_shift <- item$qwidth * as.numeric(item$strand == "-")
item$pos <- item$pos + neg_shift
idx <- item$pos >= region$start & item$pos <= region$end
if (sum(idx) == 0) {
return(rep(0, 2 * region_len))
}

If you can share data and code that I can run on laptop, then I might be able to help.

Otherwise, I would suggest that you try running the code from functions.R by hand -- line by line in the R console -- and try to find out what is going wrong.

Good luck! Let me know if you figure it out!

@slowkow slowkow reopened this Jan 8, 2019
@rdbcasillas
Copy link

Thanks for responding so quickly Kamil!

I did run the traceback command. Here's the output:

> traceback()
4: FUN(X[[i]], ...)
3: lapply(seq(1, length(bam_list)), function(i) {
       region <- parse_region(names(bam_list)[i])
       region_len <- abs(region$end - region$start) + 1
       item <- bam_list[[i]]
       neg_shift <- item$qwidth * as.numeric(item$strand == "-")
       item$pos <- item$pos + neg_shift
       idx <- item$pos >= region$start & item$pos <= region$end
       if (sum(idx) == 0) {
           return(rep(0, 2 * region_len))
       }
       strand <- item$strand[idx]
       position <- item$pos[idx]
       is.neg <- as.numeric(strand == "-")
       j <- 1 + position - region$start + (region_len * is.neg)
       as.numeric(table(factor(j, levels = seq(1, 2 * region_len))))
   })
2: do.call(rbind, lapply(seq(1, length(bam_list)), function(i) {
       region <- parse_region(names(bam_list)[i])
       region_len <- abs(region$end - region$start) + 1
       item <- bam_list[[i]]
       neg_shift <- item$qwidth * as.numeric(item$strand == "-")
       item$pos <- item$pos + neg_shift
       idx <- item$pos >= region$start & item$pos <= region$end
       if (sum(idx) == 0) {
           return(rep(0, 2 * region_len))
       }
       strand <- item$strand[idx]
       position <- item$pos[idx]
       is.neg <- as.numeric(strand == "-")
       j <- 1 + position - region$start + (region_len * is.neg)
       as.numeric(table(factor(j, levels = seq(1, 2 * region_len))))
   }))
1: centipede_data(bam_file = "call-bowtie2/shard-0/execution/P0_rep1_R1.merged.bam", 
       fimo_file = "centipede-files/P0_rep1.fimo.txt.gz", pval = 1e-04, 
       flank_size = 100)

I doubt this is very helpful for you. Since the input files are massive, would you like a trimmed down version of those files? I worry that these smaller files may disallow you to duplicate the exact error.

Otherwise, I would suggest that you try running the code from functions.R by hand -- line by line in the R console -- and try to find out what is going wrong.

I still haven't tried this so will let you know if I find something interesting.

@rdbcasillas
Copy link

So it turns out there was an issue with the BAM file. When I used the "deduped" BAM file, the problem seems to disappear. May be it is important for CENTIPEDE that the BAM file is without duplicates? I am not sure but at the moment that has fixed the issue.

@shokohirosue
Copy link

Hi all,

I have the same problem.
I tried removing duplicates but my data is a paired-end I have not been able to succeed in doing so.
Has anyone looked into this issue? How did you remove duplicates?

Thank you.

Shoko

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants