fanc load for customized .hic file #167

Open
caragraduate opened this issue Sep 11, 2023 · 3 comments

Comments

@caragraduate

Hi there,

I recently ran into a problem and am not sure how to handle it in FAN-C. I successfully used pre to generate a .hic file from a text file that I received from another normalization package, following the discussion here: https://groups.google.com/g/3d-genomics/c/COQrPJtZXk8/m/Ujn2SOxoCQAJ. I would like to use FAN-C to calculate insulation scores on this new, already normalized .hic matrix. I know that I would normally use test.hic@5kb@KR to specify the resolution and normalization vector in a .hic file. However, my new .hic file does not have KR or any other built-in normalization vector, since it was converted from the five-column text file described in the link above. I am not sure how to refer to the new normalized values when loading the Hi-C object, because the format conversion in pre does not give them a name. In brief: what should I put in place of the "?" in this line of code: hic_test = fanc.load("test_chr1_5kb_norm_30.hic@5kb@?")

I hope my question is clear. Thank you for your time and insights!

@kaukrise
Collaborator

Hi, can you open a Python console and try this? You only need to change the file path (/path/to/file.hic); everything else can be copied and pasted as-is. The code should print the set of normalisations available in the file.

import struct
import fanc
from fanc.compatibility.juicer import JuicerHic, _read_cstr

hic = fanc.load('/path/to/file.hic')

with open(hic._hic_file, 'rb') as req:
    version = JuicerHic._version(req)
    # jump to the index of normalisation vectors
    JuicerHic._skip_to_normalisation_vectors(req)
    # number of index entries; treated as 0 if the file ends here
    n_entries_packed = req.read(4)
    n_entries = struct.unpack('<i', n_entries_packed)[0] if len(n_entries_packed) != 0 else 0
    existing_normalisations = set()
    for _ in range(n_entries):
        # each entry: name, chromosome index, unit, resolution, file position, size
        entry_normalisation = _read_cstr(req)
        entry_chromosome_index = struct.unpack('<i', req.read(4))[0]
        entry_unit = _read_cstr(req)
        entry_resolution = struct.unpack('<i', req.read(4))[0]
        file_position = struct.unpack('<q', req.read(8))[0]
        req.read(8 if version > 8 else 4)  # skip size in bytes
        # the same name recurs per chromosome and resolution, hence the set
        existing_normalisations.add(entry_normalisation)
print(existing_normalisations)
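
To see what the loop above is reading without needing a real .hic file, here is a minimal, self-contained sketch that runs the same field-by-field parsing on a synthetic in-memory buffer. The helpers read_cstr, pack_entry and list_normalisations are hypothetical names made up for this illustration (read_cstr stands in for FAN-C's private _read_cstr), and the entry layout is assumed only from the reads in the snippet above, not taken from the .hic format specification.

```python
import io
import struct


def read_cstr(stream):
    """Read bytes up to (not including) a NUL terminator and decode them."""
    chars = bytearray()
    while True:
        byte = stream.read(1)
        if byte in (b"", b"\0"):  # end of stream or end of string
            break
        chars += byte
    return chars.decode("utf-8")


def pack_entry(name, chrom_index, unit, resolution, position, size, version):
    """Build one synthetic index entry in the field order read by the loop above."""
    size_format = "<q" if version > 8 else "<i"  # assumed: 8-byte size after version 8
    return (
        name.encode() + b"\0"
        + struct.pack("<i", chrom_index)
        + unit.encode() + b"\0"
        + struct.pack("<i", resolution)
        + struct.pack("<q", position)
        + struct.pack(size_format, size)
    )


def list_normalisations(stream, version):
    """Collect normalisation names from a count-prefixed list of entries."""
    packed_count = stream.read(4)
    n_entries = struct.unpack("<i", packed_count)[0] if packed_count else 0
    names = set()
    for _ in range(n_entries):
        names.add(read_cstr(stream))          # normalisation name, e.g. "KR"
        stream.read(4)                        # chromosome index
        read_cstr(stream)                     # unit, e.g. "BP"
        stream.read(4)                        # resolution
        stream.read(8)                        # file position of the vector
        stream.read(8 if version > 8 else 4)  # size in bytes
    return names


# Synthetic data: two normalisations, one of them stored for two chromosomes.
entries = [
    pack_entry("KR", 1, "BP", 5000, 1024, 400, version=8),
    pack_entry("VC", 1, "BP", 5000, 2048, 400, version=8),
    pack_entry("KR", 2, "BP", 5000, 4096, 400, version=8),
]
buffer = io.BytesIO(struct.pack("<i", len(entries)) + b"".join(entries))
found = list_normalisations(buffer, version=8)
print(sorted(found))

# A stream that ends before the entry count yields an empty set.
empty = list_normalisations(io.BytesIO(b""), version=8)
```

The set collapses repeated names, which is why the original snippet prints each normalisation once even though the index holds one entry per chromosome and resolution.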

@caragraduate
Author

Thank you! This worked with a .hic file that has additional normalizations (KR, VC, etc.), but I wonder how to calculate insulation scores on the raw .hic map (normalization NONE). I used Pre to create the customized .hic file, and the IF values in the input text file were already normalized, so the converted "raw" .hic map should already be the normalized one. I am not interested in using KR, VC, ICE, or any other default Juicer normalization; I would like to calculate insulation scores and boundaries on my own normalized .hic file.

@kaukrise
Collaborator

I am not sure I understand correctly, but have you tried with @NONE? That should do all calculations without applying the bias vectors. Since in your case these seem to be already part of the stored contact values, I think that is what you want.
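
As a rough sketch of that suggestion, the snippet below builds the file@resolution@normalisation string used earlier in this thread with NONE in the last slot, and wraps the FAN-C calls in a function. The helper names hic_spec and insulation_without_bias are hypothetical, the window sizes are placeholder values in base pairs, and the sketch assumes that fanc.InsulationScores.from_hic accepts a loaded Hi-C object and a list of window sizes. It has not been run against the file from this issue.

```python
def hic_spec(path, resolution, normalisation="NONE"):
    """Build a 'file@resolution@normalisation' string as used in this thread."""
    return f"{path}@{resolution}@{normalisation}"


def insulation_without_bias(path, resolution="5kb", window_sizes=(25000, 50000)):
    """Load the stored contact values as-is and compute insulation scores.

    Requires FAN-C; the import is done lazily so that hic_spec can be used
    (and tested) without FAN-C installed.
    """
    import fanc

    # NONE asks for the contact values without any bias vector applied
    hic = fanc.load(hic_spec(path, resolution, "NONE"))
    # window sizes are placeholders; choose values suited to the resolution
    return fanc.InsulationScores.from_hic(hic, list(window_sizes))


spec = hic_spec("test_chr1_5kb_norm_30.hic", "5kb")
print(spec)
```

Calling insulation_without_bias("test_chr1_5kb_norm_30.hic") would then compute the scores on the pre-normalized values stored in the file, provided FAN-C is installed and the file exists.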
