Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make FASTA Index methods and structs public #421

Open
mbhall88 opened this issue Apr 21, 2021 · 3 comments
Open

Make FASTA Index methods and structs public #421

mbhall88 opened this issue Apr 21, 2021 · 3 comments

Comments

@mbhall88
Copy link
Member

I'm not sure if it is intentional or not, but fasta::IndexRecord is currently private. Access to this struct would make working with FASTA indices much nicer. Additionally, there are some methods, such as idx which are also private.

Are there any methods/structs where a conscious decision was made to keep them private?

@dlaehnemann
Copy link
Member

I did a quick git blame on the linked struct and funtion above, and wow is the respective file a collaborative work. 20 contributors...

For the respective section, I saw contributions by @johanneskoester, @HenningTimm and @MikkelSchubert. Maybe one of them has any indicators about them being private---whether that was intentional or not?

@HenningTimm
Copy link
Contributor

I vaguely remember working on that ;) IIRC I wanted to preserve the existing visibility structure since I was relatively new to Rust and didn't want to make stupid mistakes. I also remember that making it public would have my changes back than easier. I would not be opposed making it public.

@mbhall88
Copy link
Member Author

I guess what I was trying to achieve when I raised this was a way of getting the length of a contig in an index - if it is in the index. My current solution is a bit more convoluted than is necessary given this information is accessible by some private functions/structs.

Something which makes me slightly hesitant to just make the private stuff public is the method I mentioned in the beginning, idx
is defined on IndexedReader. Is this the correct interface to have this method accessible on? I suspect this is the data structure users will generally be working with, but I don't want to make assumptions. But also, just because it might be the more commonly used DS doesn't mean it is the most appropriate. As I think about it, have a "reader" object implement core index operations may seems a little confusing.

Ok, here is an example. We have an index

CHROMOSOME_I	1009800	14	50	51
CHROMOSOME_II	5000	1030025	50	51
CHROMOSOME_III	5000	1035141	50	51

With the current, public-facing API, if I want to check for the presence of a contig in this index, and get its length if it is present, I have to do something like this

let faidx = fasta::IndexedReader::from_file(file);
let contig = "CHROMOSOME_I";
let mut contig_len: Option<usize> = None;

for seq in faidx.index.sequences() {
    if seq.name == contig {
        contig_len = Some(seq.len);
        break;
    }
}
match contig_len {
// do something with the Some value...
}

This seems like a lot of work for a fairly basic requirement of an index I think? So, if we made idx public, and `IndexRecord``, we could do something like

let contig_len = match faidx.idx(contig) {
    Ok(idx_record) => Some(idx_record.len),
    Err(_) => None

Another alternative would be to leave the visibility as is and implement a method, either on Index or IndexedReader, which allows something like

// implemented on IndexedReader
let contig_len: Option<usize> = faidx.length_of(contig);
// implemented on Index
let contig_len: Option<usize> = faidx.index.length_of(contig);

I guess one benefit to make the IndexRecord public is that others in the future might find it useful to access internal members such as offset, line bytes etc. But it would probably be best to put these behind getter methods to prevent anyone from mutating the index record.

Apologies for the waffling, but it seems this is an important design decision

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants