Support for reading alignments in specific region without CRAI index file #311

Merged
niyarin merged 3 commits into master from feature/cram-region-read-without-index
May 17, 2024

Conversation

athos
Member

@athos athos commented May 15, 2024

This PR allows the CRAM reader to read alignments from a specified region within a CRAM file without requiring an index file.

CRAM readers can instantly check if each container or slice overlaps with the specified region by examining their headers. This makes it much more efficient to read alignments by skipping non-overlapping containers or slices, rather than performing a linear scan of the entire CRAM file.
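For readers unfamiliar with the idea, the header-based skip described above can be sketched roughly as follows. This is only an illustration, not the code in this PR: the map keys (`:ref-seq-id`, `:start`, `:span`, `:ref-id`, `:end`) and the function names are hypothetical, and coordinates are assumed to be 1-based and inclusive. The `-2` check reflects the CRAM convention that a reference sequence id of -2 marks multi-reference data, which cannot be ruled out from the header alone.

```clojure
(defn overlaps-region?
  "Returns true if a container/slice header may contain alignments
  overlapping the region. All key names here are hypothetical."
  [header region]
  (let [{:keys [ref-seq-id start span]} header
        {region-ref :ref-id, region-start :start, region-end :end} region]
    (or
     ;; Multi-reference data: the header cannot exclude it, so keep it.
     (= ref-seq-id -2)
     ;; Same reference, and the intervals [start, start+span-1] and
     ;; [region-start, region-end] intersect.
     (and (= ref-seq-id region-ref)
          (<= start region-end)
          (<= region-start (+ start span -1))))))

(defn select-candidates
  "Keeps only the headers whose data needs to be decoded for the region."
  [headers region]
  (filter #(overlaps-region? % region) headers))

;; Usage: only the second header covers positions 1500-1600 on reference 0.
(select-candidates
 [{:ref-seq-id 0 :start 1 :span 1000}
  {:ref-seq-id 0 :start 1001 :span 1000}
  {:ref-seq-id 1 :start 1 :span 1000}]
 {:ref-id 0 :start 1500 :end 1600})
```

Everything that fails the check can be skipped without decoding its blocks, which is where the saving over a full linear scan comes from.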

This PR also adds a new test file medium_without_index.cram, which is identical to medium.cram except that it does not come with its corresponding index file.

@athos athos self-assigned this May 15, 2024
@athos athos requested review from alumi and a team as code owners May 15, 2024 04:29
@athos athos requested review from niyarin and removed request for a team May 15, 2024 04:29

codecov bot commented May 15, 2024

Codecov Report

Attention: Patch coverage is 82.66667%, with 13 lines in your changes missing coverage. Please review.

Project coverage is 88.50%. Comparing base (90227f3) to head (3623b37).

| Files | Patch % | Lines |
|---|---|---|
| src/cljam/io/cram/reader.clj | 76.78% | 11 Missing and 2 partials ⚠️ |
Additional details and impacted files
@@           Coverage Diff           @@
##           master     #311   +/-   ##
=======================================
  Coverage   88.49%   88.50%           
=======================================
  Files          95       95           
  Lines        8173     8214   +41     
  Branches      506      506           
=======================================
+ Hits         7233     7270   +37     
- Misses        434      438    +4     
  Partials      506      506           


(is (not (cram/indexed? cram-rdr')))
(are [?region ?count] (= ?count
                         (count (cram/read-alignments cram-rdr ?region))
                         (count (cram/read-alignments cram-rdr' ?region)))
Member

This test fails in my environment.
Maybe we need to reset the position of the mapped buffer before the next call to read-alignments.
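To illustrate the kind of failure suspected here (a sketch only, not the actual fix in this PR): a `java.nio` buffer keeps a read position, so a second read that assumes it starts at the beginning will see different data unless the position is reset first. The function name `read-bytes-from-start` is hypothetical, and a heap `ByteBuffer` stands in for the mapped buffer.

```clojure
(import '(java.nio Buffer ByteBuffer))

(defn read-bytes-from-start
  "Rewinds buf to position 0, then reads n bytes into a vector.
  Hypothetical helper; the real reader logic is not shown here."
  [^ByteBuffer buf n]
  ;; Without this reset, the read would continue from wherever the
  ;; previous call left the buffer's position.
  (.position ^Buffer buf (int 0))
  (let [arr (byte-array n)]
    (.get buf arr)
    (vec arr)))

;; Usage: both calls see the same leading bytes because of the reset.
(let [buf (ByteBuffer/wrap (byte-array [1 2 3 4]))]
  [(read-bytes-from-start buf 2)
   (read-bytes-from-start buf 2)])
```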

Member Author

Thank you for pointing that out! Yeah, I'm just looking into it right now. I hadn't realized that I never ran these tests in my local environment at all 😂

Member Author

b1b0ea4 should fix the position issue.
I also added 3623b37 to improve test coverage for the CRAM region read.

Member

@alumi alumi left a comment

Thank you for fixing so quickly! LGTM 👍

Contributor

@niyarin niyarin left a comment

LGTM!

@niyarin niyarin merged commit 11d39b0 into master May 17, 2024
17 checks passed
@niyarin niyarin deleted the feature/cram-region-read-without-index branch May 17, 2024 00:03