Errors with chromosome_matrix() and load_contacts() function~ #351

Open
Datacond opened this issue Apr 9, 2024 · 2 comments

@Datacond

Datacond commented Apr 9, 2024

Hi, @teunbrand

Sorry to bother you. I'm new to GENOVA, and I'd like to use it because it is lightweight and has a nice colour palette. Recently I've tried almost everything to work through the steps below, but I keep running into errors that I can't fix. All of the errors below occur with GENOVA v1.0.0 on Windows 10.

My species has no centromere annotation. I also couldn't find these errors in previous issues.

  1. Occurs when running chromosome_matrix() on the test data: Error in `[.data.frame`(idx, , list(N = .N), by = "V1") : unused argument (by = "V1")
library(GENOVA)
exp <- readRDS("test_150k.rds")
head(exp$MAT,4)
     V1    V2         V3
1 18619 18619 20734.3955
2 18619 18620  8296.8487
3 18619 18621   956.5607
4 18619 18622   124.3649
head(exp$IDX,4)
     V1     V2     V3    V4
1 chr21      0 150000 18557
2 chr21 150000 300000 18558
3 chr21 300000 450000 18559
4 chr21 450000 600000 18560

res <- chromosome_matrix(exp)
Error in `[.data.frame`(idx, , list(N = .N), by = "V1") : 
  unused argument (by = "V1")
  2. Occurs when visualising the result of chromosome_matrix() on my own data: Error in seq.default(.limits[1], .limits[2], length.out = guide$nbin) : 'from' must be a finite number
WT <- load_contacts(
  signal_path = 'WT_1000000_iced.matrix', 
  indices_path = 'WT_1000000_abs.bed',  
  sample_name = "WT", 
  colour = "black"
)
Reading data...

MUT <- load_contacts(
  signal_path = 'MUT_1000000_iced.matrix', 
  indices_path = 'MUT_1000000_abs.bed', 
  sample_name = "MUT", 
  colour = "red",
  # centromeres = FALSE
)
Reading data...

> chr_mat <- chromosome_matrix(list(WT,MUT))
> visualise(chr_mat)
Error in seq.default(.limits[1], .limits[2], length.out = guide$nbin) : 
  'from' must be a finite number
  3. Occurs when running load_contacts() on my own data with centromeres = FALSE: Error in names(object) <- nm : 'names' attribute [3] must be the same length as the vector [1]
MUT <- load_contacts(
  signal_path = 'MUT_1000000_iced.matrix', 
  indices_path = 'MUT_1000000_abs.bed', 
  sample_name = "MUT", 
  colour = "red",
  centromeres = FALSE
)
  4. Occurs when running load_contacts() on my own data. Here "noscf" means that records aligned to scaffolds were removed from the sparse matrix generated by HiC-Pro:
WT <- load_contacts(
  signal_path = 'WT_iced.noscf.matrix', 
  indices_path = 'WT_abs.noscf.bed', 
  sample_name = "WT", 
  # colour = "red",
  # centromeres = FALSE
)
Reading data...
Error in setkeyv(x, cols, verbose = verbose, physical = physical) : 
  Some columns not in data.table  chrom

WT <- load_contacts(
  signal_path = 'WT_iced.noscf.matrix', 
  indices_path = 'WT_abs.noscf.bed', 
  sample_name = "WT", 
  # colour = "red",
  centromeres = FALSE
)
Reading data...
Error in names(object) <- nm : 'names' attribute [3] must be the same length as the vector [1]

I'm looking forward to your guidance. Thanks.

Best,
Yao
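
A possible reading of the first error above, offered as a sketch rather than a confirmed diagnosis: the message names `[.data.frame`, and the printed row labels (`1`, `2`, ...) lack the trailing colon that data.table uses, which suggests that `exp$IDX` was handled as a plain data.frame. The toy `idx` table below is a hypothetical stand-in for `exp$IDX`; it only shows that the grouped call from the error message fails on a data.frame and works on a data.table. Checking `class(exp$IDX)` on the real object would show whether this applies.

```r
library(data.table)

# Hypothetical stand-in for exp$IDX, stored as a plain data.frame (assumption).
idx <- data.frame(
  V1 = c("chr21", "chr21", "chr22"),
  V2 = c(0, 150000, 0),
  V3 = c(150000, 300000, 150000),
  V4 = c(18557, 18558, 18559)
)

# `[.data.frame` has no `by` argument, so the grouped call fails on a data.frame.
res <- tryCatch(
  idx[, list(N = .N), by = "V1"],
  error = function(e) conditionMessage(e)
)
print(res)

# The same call on a data.table counts the bins per chromosome.
idx_dt <- as.data.table(idx)
bins_per_chrom <- idx_dt[, list(N = .N), by = "V1"]
print(bins_per_chrom)
```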

@teunbrand
Collaborator

Hi Yao,

I'm afraid I'm unable to reproduce the issue. What do I need to change in the example below to run into the error you get?

library(GENOVA)
exp <- get_test_data("150k")
cm <- chromosome_matrix(exp)
visualise(cm)

Created on 2024-04-11 with reprex v2.1.0

@Datacond
Author

Hi, @teunbrand

Same code, but I'm still getting the same error. I have carefully read the chromosome_matrix.R script to understand how the interchromosomal interaction enrichment is calculated. I also have a question that is not directly related to this software, and I would love to have your insights on it, because I was trying to compare that calculation with the results obtained by GENOVA.

Proximity of Chromosome Territories.
The expected number of interchromosomal interactions for each chromosome pair i,j was computed by multiplying the fraction of interchromosomal reads containing i with the fraction of interchromosomal reads containing j and multiplying by the total number of interchromosomal reads. The enrichment was computed by taking the actual number of interactions observed between i and j and dividing it by the expected value.

This is the method used to calculate interchromosomal interaction enrichment in the 2009 dilution Hi-C article (Lieberman-Aiden et al., 2009). How should "fraction of interchromosomal reads containing i" (or j) be understood? An example:

        chr1    chr2    chr3
chr1     200      20      10
chr2      20     300       5
chr3      10       5     100

Which of these is the correct interpretation of "fraction of interchromosomal reads containing chr1"?
(20+10)/(20+10+5): chr1 interchromosomal contacts as a fraction of all interchromosomal contacts, counting one triangle only
(20+10): the raw interchromosomal contact count for chr1
(20+10)/(20+10+5+20+10+5): the same fraction, with the denominator counting both the upper and lower triangles

And for the total number of interchromosomal reads: should it be the sum over both the upper and lower triangles, or over one triangle only?

Best.
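
To make the question concrete, here is a minimal R sketch of one reading of the quoted method, applied to the 3×3 example above. It assumes that each interchromosomal read is counted once (one triangle) for the total, and that the "fraction containing i" is the off-diagonal row sum divided by that total. This is an illustration under those assumptions, not necessarily what chromosome_matrix() or the original paper does; all variable names are made up for the example.

```r
# Toy symmetric chromosome-by-chromosome contact matrix from the example.
chroms <- c("chr1", "chr2", "chr3")
counts <- matrix(
  c(200,  20,  10,
     20, 300,   5,
     10,   5, 100),
  nrow = 3, byrow = TRUE,
  dimnames = list(chroms, chroms)
)

# Keep only interchromosomal contacts by zeroing the diagonal.
inter <- counts
diag(inter) <- 0

# Assumption: each interchromosomal read is counted once (upper triangle).
total_inter <- sum(inter[upper.tri(inter)])  # 20 + 10 + 5

# Reads containing chromosome i: the off-diagonal row sum.
reads_with <- rowSums(inter)
frac_with <- reads_with / total_inter

# Expected count for each pair: f_i * f_j * total.
expected <- outer(frac_with, frac_with) * total_inter

# Enrichment: observed / expected, only meaningful for i != j.
enrichment <- inter / expected
diag(enrichment) <- NA

print(round(enrichment, 3))
```

Under these assumptions the fractions sum to 2, because every interchromosomal read contains two chromosomes. Using the both-triangle denominator instead rescales the enrichment of every pair by the same constant factor, so the relative ordering of chromosome pairs does not depend on that choice.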
