Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[bug] Memory leak when using faidx::Reader::fetch_seq() ? #401

Open
MaelLefeuvre opened this issue Jul 20, 2023 · 10 comments
Open

[bug] Memory leak when using faidx::Reader::fetch_seq() ? #401

MaelLefeuvre opened this issue Jul 20, 2023 · 10 comments

Comments

@MaelLefeuvre
Copy link

MaelLefeuvre commented Jul 20, 2023

Bug description

Hello,

I'm currently working on a simple rust utility that requires me to read and compare bam records with its matching reference sequence.

My approach ended up using the rust_htslib::faidx::Reader::fetch_seq(). However, i'm now noticing my binary is experiencing quite the serious memory leak (see screen captures below)

Profiling the binary through heaptrack, clearly points out to the ffi call to htslib::faidx_fetch_seq64() within the faidx::Reader::fetch_seq() method (See src/faidx/mod.rs:73-79)

Timeline Summary Bottom-up stacktrace
image image image

Minimally reproducible example

A simple loop such as the one below is enough to create a steady increase of the binary's resident set size, while I would expect the memory usage to pretty much constant in this example.

use rust_htslib::faidx;
use rand::{thread_rng, Rng};

fn main() {
    let fasta     = std::env::args().nth(1).expect("No reference provided");
    let reference = faidx::Reader::from_path(&fasta).unwrap();

    let mut rng = thread_rng();
    loop {
        // ---- Fetch a random chromosome, position and read length within a reference genome 
        let chr   = rng.gen_range(1..=22u32).to_string();
        let len   = rng.gen_range(10..50usize);
        let start = rng.gen_range(0..100usize - len);
        let refseq = reference.fetch_seq(chr, start, start + len).unwrap();
        println!("{}", unsafe{ std::str::from_utf8_unchecked(refseq)});
    }
}

A small public github repository containing the example above, along with a dummy reference file can be found here: rust-htslib-memleak-mre

Some thoughts:

Of course, I might be completely misusing the fetch_seq() method - hence the '?' in this issue's title. Regardless, be it a safety or documentation issue, I'm thinking this might need some resolving.

Anyway, Any help, insight, or suggestion regarding this matter would be greatly appreciated. Also, I'm always happy to contribute on a pull request if requested.

Cheers !

@jts
Copy link
Contributor

jts commented Jul 21, 2023

This looks like a bug to me (and running your example through valgrind). I'm not sure how to fix this without changing the interface to the fetch_seq function since it returns a slice. As a possible fix I removed the fetch_seq function and fixed the leak in fetch_seq_string in this commit: jts@6334e9a. There might be a better way though.

@dlaehnemann
Copy link
Member

dlaehnemann commented Jul 24, 2023

I guess the libc::free(ptr as *mut c_void); is the decisive change, right here:
jts@6334e9a#diff-dd23e4893230213542657967d1811dcd664d135f1bf0f41e1264875b193d086fR88

Wouldn't it be possible to keep the original setup and just do the free like this:

        let cseq = unsafe {
            let ptr = htslib::faidx_fetch_seq64(
                self.inner,                          //*const faidx_t,
                cname.as_ptr(),                      // c_name
                begin as htslib::hts_pos_t,          // p_beg_i
                end as htslib::hts_pos_t,            // p_end_i
                &mut (len_out as htslib::hts_pos_t), //len
            );
            s = ffi::CStr::from_ptr(ptr)
            libc::free(ptr as *mut c_void);
            s
        };

        Ok(cseq.to_bytes())

Also, maybe it makes sense to include the above MRE as a test case, to prevent regressions? But I'm not sure how big the test case is, and how easy it is to test for the memory leak... So it might be more important to get a fix in quickly...

@jts
Copy link
Contributor

jts commented Jul 24, 2023

I don't think that version works because ffi::CStr::from_ptr doesn't make a copy of ptr's data, it only wraps it. So it is no longer valid after the libc::free call. I played around with a few other ideas for keeping the interface the same, but they all required returning a slice referencing a local copy of the data, which the borrow checker didn't allow. That's not to say its impossible, I just wasn't able to make it work.

@dlaehnemann
Copy link
Member

Then what about:

        let cseq = unsafe {
            let ptr = htslib::faidx_fetch_seq64(
                self.inner,                          //*const faidx_t,
                cname.as_ptr(),                      // c_name
                begin as htslib::hts_pos_t,          // p_beg_i
                end as htslib::hts_pos_t,            // p_end_i
                &mut (len_out as htslib::hts_pos_t), //len
            );
            s = ffi::CStr::from_ptr(ptr).to_bytes()
            libc::free(ptr as *mut c_void);
            s
        };

        Ok(cseq)

Does the to_bytes() clone the actual data?

@dlaehnemann
Copy link
Member

And if this doesn't do the trick, maybe a .clone() can? It should be implemented on CStr, see:
https://doc.rust-lang.org/std/ffi/struct.CStr.html#impl-Clone-for-Box%3CCStr,+Global%3E

@jts
Copy link
Contributor

jts commented Jul 24, 2023

I don't think to_bytes() clones. I just tested it (either calling to_bytes() in the return, or the second version that does it in the unsafe block) and valgrind complains about reading the memory block after free was called.

This doesn't seem to work either:

[...]
            let s = ffi::CStr::from_ptr(ptr).to_bytes().clone();
            libc::free(ptr as *mut c_void);
            s
        };

        Ok(cseq)
    }

Maybe it is cloning the ptr's address, not the actual data

@MaelLefeuvre
Copy link
Author

Hi everyone and sorry for my late reply

I've tried my go at the source code, and concur with the observation that making fetch_seq() memory safe while keeping its signature intact might require a hint of rust sorcery (I'll have to dive back into the rustonomicon...)

My initial attemps involved trying to tie the lifetime of &[u8] to one of the input parameters (i.e.: N). But this strategy would introduce breaking changes anyway.

In any case, while I'm not a fan of the idea of removing fetch_seq() alltogether, I would argue that this function should at the very least be marked as unsafe, and its documentation should better mirror the one found within hstlib, since :

In other words, a documentation entry in the likes of this (see below) would -I hope- be a bit more transparent to users:

    /// Fetch the sequence as a byte array.
    /// /* Snip */
    ///
    /// # Safety
    /// The returned static slice is internally allocated by `malloc()` and should be manually destroyed
    /// by end users by calling `libc::free()` on it, e.g.: 
    /// ```no_run
    /// let r = Reader::from_path("reference.fa")?;
    /// unsafe {
    ///     let refseq = r.fetch_seq("chr1", 10, 20)?;
    ///     /* Do stuff... */
    ///     libc::free(refseq.as_ptr() as *mut std::ffi::c_void);
    /// }
    /// ```
    pub unsafe fn fetch_seq<N: AsRef<str>>(&self, name: N, begin: usize, end: usize) -> Result<&'static [u8]> {
        /* Snip */
    }

However, I would also argue that commiting to this signature, and thus ask users to manually handle memory kinda defeats the purpose of using Rust in the first place... I think a better long-term approach would be to provide with a safer(-ish) implementation, which introduces a smart pointer as a return value:

pub struct RefSeq(&'static [u8]);

impl Drop for RefSeq {
    fn drop(&mut self) {
        unsafe { libc::free(self.0.as_ptr() as *mut ffi::c_void)};
    }
}

impl std::ops::Deref for RefSeq {
    type Target = &'static [u8];

    fn deref(&self) -> &Self::Target {
        &self.0
    }
}

Using RefSeq, a safer implementation of fetch_seq() and fetch_seq_string() can be made available to user, while keeping the original implementation untouched

    /// Fetch the sequence as a byte array, wrapped within a smart pointer.
    ///
    /// # Arguments
    /// [...]
    /// 
    /// # Usage
    /// ```
    /// # use rust_htslib::faidx;
    /// let r = faidx::Reader::from_path("test/test_cram.fa").unwrap();
    /// let bytes  = r.fetch_seq_safe("chr3", 20, 26).unwrap();
    /// let refstr = std::str::from_utf8(*bytes).unwrap();
    /// assert_eq!(refstr, "ATTTCAG");
    /// ```
    pub fn fetch_seq_safe<N: AsRef<str>>(&self, name: N, begin: usize, end: usize) -> Result<RefSeq> {
        let bytes = unsafe { self.fetch_seq(name, begin, end)? };
        Ok(RefSeq(bytes))
    }

fetch_seq_string() could also benefit from using this newer implementation, which would resolve the underlying memory leak:

    pub fn fetch_seq_string<N: AsRef<str>>(&self, name: N, begin: usize, end: usize) -> Result<String> {
        let bytes = self.safe_fetch_seq(name, begin, end)?;
        Ok(std::str::from_utf8(*bytes).unwrap().to_owned())
    }

Testing out fetch_seq_safe()

I took the time to try out fetch_seq_safe() within my codebase, and can safely say that this strategy entirely resolves the issue I had. i.e.: see screen captures right below, still using heaptrack :

Timeline Bottom-up stacktrace
image image

(can confirm this behavior is equally displayed when patching my previously linked MRE)

I can also confirm, through a quick'n'dirty criterion benchmark, that this solution does not carry any negative impact on runtime performance:

fetch_seq() bench output fetch_seq_safe() bench output
fetch_seq criterion bench fetch_seq_safe criterion bench

Unit-testing the memory consumption of fetch_seq()

As a follow up to @dlaehnemann 's suggestion:

Also, maybe it makes sense to include the above MRE as a test case, to prevent regressions? But I'm not sure how big the test case is, and how easy it is to test for the memory leak... So it might be more important to get a fix in quickly...

I'm hoping the unit-test described below could work. However, the solution I'm providing is far from being perfect, given that it :

  • requires the addition of memory-stats as a dev-dependency.
  • requires the use of arbitrary parameters (test_duration and memory_leeway - hoping these variable names are self-explanatory), since memory-stats can sometimes be fairly inaccurate when estimating the resident set size of a software.
  • In terms of overall design, this is pretty much a dumb spin-loop which ensures the memory size remains generally constant for test_duration seconds... "Keep it stupid simple", as they say, but I'm sure a smarter and more accurate solution can be found.
    fn get_rss() -> Result<usize, &'static str>{
        memory_stats::memory_stats()
            .map(|usage| usage.physical_mem)
            .ok_or("Cannot access physical memory")
    }

    #[test]
    #[ignore] // cargo test memory_leak -- --ignored --test-threads=1
    fn fetch_seq_memory_leak() -> Result<(), &'static str> {
        std::thread::sleep(std::time::Duration::from_secs(5)); // warmup
        let test_duration = 15;      // seconds
        let memory_leeway = 500_000; // bytes
        let r             = open_reader();
        let base_rss      = get_rss()?;
        let start         = std::time::Instant::now();
        while (std::time::Instant::now() - start).as_secs() < test_duration {
            let bytes = r.fetch_seq_safe("chr1", 1, 120).unwrap();
            let seq   = std::str::from_utf8(&bytes).expect("Invalid UTF-8 sequence");
            assert!(get_rss()? - base_rss < memory_leeway);
        }
        Ok(())
    }

Notes :

  • All Changes and suggestions above can be found within MaelLefeuvre@643ecd
  • RefSeq and fetch_seq_safe() are probably not the best suited names.

Cheers!

@wdecoster
Copy link

Is there something here that I can do to get this fix implemented? I am seeing the same memory problem in my application.

@MaelLefeuvre
Copy link
Author

MaelLefeuvre commented Sep 3, 2023

Is there something here that I can do to get this fix implemented? I am seeing the same memory problem in my application.

I'd also be happy to contribute and/or submit a PR proposal if the current contributors of this project deem it acceptable.

In the meantime, if you don't wish to use a forked project, a simple temporary workaround for your application might be to manually call libc::free on your fetched fasta sequence, once not needed.

A pseudo code example:

loop {
    /* Setup code*/
    let refseq = reference.fetch_seq(chr, start, end)?;
    /* Do stuff with refseq */

    unsafe {libc::free(refseq.as_ptr() as *mut std::ffi::c_void)}; // Free up memory
}

You'll probably need to import libc within your Cargo.toml file, i.e.:

[dependencies]
  libc = "0.2"

Hope this helps, until a more stable solution to the issue is found.

Cheers

@wdecoster
Copy link

Excellent, thank you for the detailed solution!

wdecoster added a commit to wdecoster/STRdust that referenced this issue Nov 21, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants