Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

snippers: order of "select" and "snip" matters, but should it #502

Open
sergpolly opened this issue Feb 18, 2024 · 0 comments
Open

snippers: order of "select" and "snip" matters, but should it #502

sergpolly opened this issue Feb 18, 2024 · 0 comments

Comments

@sergpolly
Copy link
Member

Here is a simple example, consider simple CoolerSnipper:

snipper = CoolerSnipper(
    clr,
    view_df=view_df,
    cooler_opts={"balance": "weight"},
    min_diag=2,
)

where view_df is

chrom	start	end	name
chr1	100000000	150000000	foo
chr2	100000000	150000000	bar

inputs are taken from cooltools/tests.
Now, if we select/snip in the "right" order:

# select and snip foo
foo_mat = snipper.select("foo","foo")
foo_snip = snipper.snip(foo_mat, "foo", "foo", (100000000, 107000000, 120000000, 127000000) )
# select and snip bar
bar_mat = snipper.select("bar","bar")
bar_snip = snipper.snip(bar_mat, "bar", "bar", (100000000, 107000000, 120000000, 127000000) )

results look like this:
download-1

when we change the order of snip/select, like so:

# select foo and bar:
foo_mat = snipper.select("foo","foo")
bar_mat = snipper.select("bar","bar")
# snip foo and bar:
foo_snip = snipper.snip(foo_mat, "foo", "foo", (100000000, 107000000, 120000000, 127000000) )
bar_snip = snipper.snip(bar_mat, "bar", "bar", (100000000, 107000000, 120000000, 127000000) )

result would look like so:
download-2

Note that white bar on the foo_snip - it is because snipper.select("bar","bar") modified some instance attributes - _isnan1, _isnan2 etc, so now those attributes from "bar", modified them for "foo" as well ... We're lucky in this case it didn't crash because dimensions of "foo" and "bar" are identical.

Anyhow, this is not a big deal right now for the way snippers are used (they're used in the right order, even in multiprocessing scenario, i hope), but this is potentially confusing and I wanted to document this is a fact of life. potentially - would be nice to decouple selecting and snipping if others agree - also this could be added to such a refactoring wishlist #227

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant