Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

aya: refactor handling of /proc/$pid/maps #719

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

dave-tucker
Copy link
Member

@dave-tucker dave-tucker commented Aug 2, 2023

Since there are eyes on the uprobe code right now, I pulled this one out from #329.

This commit refactors the handling of /proc/$pid/maps since the collection previously assumed that all entries here could be represented in a HashMap. Those with a path component are stored in a HashMap for fast lookup via library name. All other entries are cached in a Vec to allow for filtering based on offsets, required for supporting USDT probes.


This change is Reviewable

@netlify
Copy link

netlify bot commented Aug 2, 2023

Deploy Preview for aya-rs-docs ready!

Name Link
🔨 Latest commit 127860d
🔍 Latest deploy log https://app.netlify.com/sites/aya-rs-docs/deploys/64d3bb9e337a0e0008963e8a
😎 Deploy Preview https://deploy-preview-719--aya-rs-docs.netlify.app
📱 Preview on mobile
Toggle QR Code...

QR Code

Use your smartphone camera to open QR code link.

To edit notification comments on pull requests, go to your Netlify site configuration.

@mergify mergify bot added the aya This is about aya (userspace) label Aug 2, 2023
@mergify
Copy link

mergify bot commented Aug 2, 2023

Hey @alessandrod, this pull request changes the Aya Public API and requires your review.

@mergify mergify bot requested a review from alessandrod August 2, 2023 11:21
@mergify mergify bot added api/needs-review Makes an API change that needs review test A PR that improves test cases or CI labels Aug 2, 2023
Copy link
Member

@tamird tamird left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed all commit messages.
Reviewable status: 0 of 1 files reviewed, 8 unresolved discussions (waiting on @dave-tucker)


aya/src/programs/uprobe.rs line 91 at r1 (raw file):

        let mut path = if let Some(pid) = pid {
            let proc_map_libs =
                ProcMap::new(pid).map_err(|e| UProbeError::ProcMapError { pid, source: e })?;

nit: |e| -> |source|


aya/src/programs/uprobe.rs line 95 at r1 (raw file):

                .find_by_name(target_str)
                .map_err(|io_error| UProbeError::FileError {
                    filename: format!("/proc/{}/maps", pid),

why change this? what's wrong with {pid}?


aya/src/programs/uprobe.rs line 236 at r1 (raw file):

    },

    /// There was en error resolving a path

nit: not pub, so doc comments are dead code. Can we remove them?


aya/src/programs/uprobe.rs line 415 at r1 (raw file):

/// Error reading from /proc/pid/maps
#[derive(Debug, Error)]
pub enum ProcMapError {

should we make this non-pub and have the relevant functions return dyn box errors? we're long talked about hiding errors from the public API.


aya/src/programs/uprobe.rs line 425 at r1 (raw file):

}

pub(crate) struct ProcMap {

why pub(crate)? only used in this file AFAICT.


aya/src/programs/uprobe.rs line 479 at r1 (raw file):

impl ProcMapEntry {
    fn parse(line: &str) -> Result<Self, ProcMapError> {

can we implement FromStr instead of this bespoke method?


aya/src/programs/uprobe.rs line 480 at r1 (raw file):

impl ProcMapEntry {
    fn parse(line: &str) -> Result<Self, ProcMapError> {
        let parts: Vec<&str> = line.split_whitespace().collect();

can we avoid the vector?

let mut parts = line.split_whitespace();
let next = || parts.next().ok_or(ProcMapError::ParseError);
let addr_parts = next()?.split(...);
let perms = next()?;
...

aya/src/programs/uprobe.rs line 484 at r1 (raw file):

            return Err(ProcMapError::ParseError);
        }
        let addr_parts: Vec<&str> = parts[0].split('-').collect();

ditto, avoid the vector. split_once?

@mergify
Copy link

mergify bot commented Aug 2, 2023

@dave-tucker, this pull request is now in conflict and requires a rebase.

@mergify mergify bot added the needs-rebase label Aug 2, 2023
@addisoncrump
Copy link
Contributor

addisoncrump commented Aug 2, 2023

Don't suppose we could get an iterator over the entries + getters for fields in this PR? + making these types pub, not pub(crate), as this machinery may be useful downstream.

@dave-tucker
Copy link
Member Author

Don't suppose we could get an iterator over the entries + getters for fields in this PR?

Sure. I'll address the comments here tomorrow and will see about adding an iterator and some getters here too.

@dave-tucker
Copy link
Member Author

@addisoncrump Can you check the API here and let me know if this is what you had in mind?

Copy link
Member Author

@dave-tucker dave-tucker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 0 of 2 files reviewed, 6 unresolved discussions (waiting on @alessandrod and @tamird)


aya/src/programs/uprobe.rs line 95 at r1 (raw file):

Previously, tamird (Tamir Duberstein) wrote…

why change this? what's wrong with {pid}?

Done.


aya/src/programs/uprobe.rs line 236 at r1 (raw file):

Previously, tamird (Tamir Duberstein) wrote…

nit: not pub, so doc comments are dead code. Can we remove them?

Is pub now


aya/src/programs/uprobe.rs line 415 at r1 (raw file):

Previously, tamird (Tamir Duberstein) wrote…

should we make this non-pub and have the relevant functions return dyn box errors? we're long talked about hiding errors from the public API.

Yes eventually, although I don't have a good picture in my mind what our Box future will look like exactly...
I'm hesitant to try it here since I'll need to redo all of UprobeError handling, which in turn links back into ProgramError and so on - it could very quickly snowball.


aya/src/programs/uprobe.rs line 425 at r1 (raw file):

Previously, tamird (Tamir Duberstein) wrote…

why pub(crate)? only used in this file AFAICT.

Done.


aya/src/programs/uprobe.rs line 479 at r1 (raw file):

Previously, tamird (Tamir Duberstein) wrote…

can we implement FromStr instead of this bespoke method?

Done.


aya/src/programs/uprobe.rs line 480 at r1 (raw file):

Previously, tamird (Tamir Duberstein) wrote…

can we avoid the vector?

let mut parts = line.split_whitespace();
let next = || parts.next().ok_or(ProcMapError::ParseError);
let addr_parts = next()?.split(...);
let perms = next()?;
...

Done.


aya/src/programs/uprobe.rs line 484 at r1 (raw file):

Previously, tamird (Tamir Duberstein) wrote…

ditto, avoid the vector. split_once?

Done.

Copy link
Member

@tamird tamird left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 1 of 1 files at r1, 1 of 1 files at r2, 2 of 2 files at r3, all commit messages.
Reviewable status: all files reviewed, 11 unresolved discussions (waiting on @alessandrod and @dave-tucker)


aya/src/programs/uprobe.rs line 236 at r1 (raw file):

Previously, dave-tucker (Dave Tucker) wrote…

Is pub now

Ack. Can you please add periods to all the comments? They should be proper sentences.


aya/src/programs/uprobe.rs line 267 at r3 (raw file):

        source: ProcMapError,
    },
}

you could you put the ProcMap code here? that would reduce the total diff by a good bit.


aya/src/programs/uprobe.rs line 438 at r3 (raw file):

#[derive(Debug, Error)]
pub enum ProcMapError {
    /// An [`io::Error`]

repeating the type - maybe this variant should be called "ProcfsMapsError"?


aya/src/programs/uprobe.rs line 450 at r3 (raw file):

///
/// This is read from /proc/`pid`/maps.
/// The information here may be used to resolve addresses to paths

missing period - also, if you intend to separate these sentences into two paragraphs, there should be an empty comment line between them.


aya/src/programs/uprobe.rs line 476 at r3 (raw file):

    }

    // Find the full path of a library by its name.

ditto empty comment line


aya/src/programs/uprobe.rs line 495 at r3 (raw file):

    }

    /// Provides an iterator over all parsed memory map entries

nit: that this returns an iterator is stated by the signature. it'd be good to avoid repeating that


aya/src/programs/uprobe.rs line 518 at r3 (raw file):

impl ProcMapEntry {
    /// The start address of the mapped memory

all comments missing periods


aya/src/programs/uprobe.rs line 564 at r3 (raw file):

            .and_then(|(a, b)| {
                Some((
                    u64::from_str_radix(a, 16).ok()?,

do we want to retain the original error?


aya/src/programs/uprobe.rs line 569 at r3 (raw file):

            })
            .ok_or(ProcMapError::ParseError)?;
        let perms = next()?.to_string();

consider doing the string allocations at the end, when all fallible work has completed


aya/src/programs/uprobe.rs line 570 at r3 (raw file):

            .ok_or(ProcMapError::ParseError)?;
        let perms = next()?.to_string();
        let offset = u64::from_str_radix(next()?, 16).map_err(|_| ProcMapError::ParseError)?;

do we want to retain the original error? there's information in there


aya/src/programs/uprobe.rs line 573 at r3 (raw file):

        let dev = next()?.to_string();
        let inode = next()?.parse().map_err(|_| ProcMapError::ParseError)?;
        let path = next().ok().and_then(|s| {

instead of next().ok() here, using parts.next() to avoid the appearance of discarding an error?


aya/src/programs/uprobe.rs line 593 at r3 (raw file):

#[cfg(test)]
mod test {

error paths seem to not be covered

@addisoncrump
Copy link
Contributor

@addisoncrump Can you check the API here and let me know if this is what you had in mind?

Looks good. If you've got some time, an async version of new would also be useful for e.g. AsyncPerfEventArray, where this will likely often be used.

@tamird
Copy link
Member

tamird commented Aug 3, 2023

@addisoncrump Can you check the API here and let me know if this is what you had in mind?

Looks good. If you've got some time, an async version of new would also be useful for e.g. AsyncPerfEventArray, where this will likely often be used.

Async reading of /proc/{pid}/maps? Why would such code ever yield?

@addisoncrump
Copy link
Contributor

Why would such code ever yield?

Fair point, the async is probably not worth it. Disregard.

@addisoncrump
Copy link
Contributor

Is this ready to merge? 🙂

Copy link
Member

@tamird tamird left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like comments still need to be addressed.

Reviewable status: all files reviewed, 11 unresolved discussions (waiting on @alessandrod and @dave-tucker)

Copy link
Member

@tamird tamird left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 1 of 1 files at r4, all commit messages.
Reviewable status: all files reviewed, 10 unresolved discussions (waiting on @alessandrod and @dave-tucker)


aya/src/programs/uprobe.rs line 259 at r4 (raw file):

    },

    /// There was en error resolving a path.

what does this mean?


aya/src/programs/uprobe.rs line 261 at r4 (raw file):

    /// There was en error resolving a path.
    #[error("error fetching libs for {pid}")]
    ProcMapError {

maybe drop "Error" here too? see comment below


aya/src/programs/uprobe.rs line 441 at r4 (raw file):

    /// Unable to read /proc/pid/maps.
    #[error(transparent)]
    ReadError(io::Error),

can we remove the Error suffix from all the variants? https://rust-lang.github.io/rust-clippy/master/index.html#/enum_variant_names

Clippy isn't flagging this because this enum is pub, but since we're introducing it here, let's follow the advice.


aya/src/programs/uprobe.rs line 473 at r4 (raw file):

            if let Some(path) = &entry.path {
                let p = PathBuf::from(path);
                let filename = p.file_name().unwrap().to_string_lossy().into_owned();

why lossy here and below? but also should this just be a map<PathBuf, PathBuf> to avoid the fallible conversions?


aya/src/programs/uprobe.rs line 475 at r4 (raw file):

                let filename = p.file_name().unwrap().to_string_lossy().into_owned();
                let library_path = p.to_string_lossy().to_string();
                libraries.entry(filename).or_insert(library_path);

why .entry.or_insert? do we expect repetitions? are we testing that case?


aya/src/programs/uprobe.rs line 486 at r4 (raw file):

    // This isn't part of the public API since it's really only useful for
    // attaching uprobes.
    fn find_library_path_by_name(&self, lib: &str) -> Result<Option<String>, io::Error> {

why not return Option<&str> and let the caller clone if necessary?


aya/src/programs/uprobe.rs line 576 at r4 (raw file):

                let end = u64::from_str_radix(b, 16).map_err(ProcMapError::IntError);
                (start, end)
            })?;

should we check the errors right here rather than parse further? even more succinct would be:

let (address, adddress_end) = next()?.split_once('-').ok_or(ProcMapError::ParseError).and_then(|(start, end)| {
let start = u64::from_str_radix(start, 16)?;
let end = u64::from_str_radix(end, 16)?;
Ok((start, end))
})?;


aya/src/programs/uprobe.rs line 578 at r4 (raw file):

            })?;
        let perms = next()?;
        let offset = u64::from_str_radix(next()?, 16).map_err(ProcMapError::IntError)?;

map_err here can go away since you have a #[from]; the try operator will do the right thing.


aya/src/programs/uprobe.rs line 582 at r4 (raw file):

        let inode = next()?.parse().map_err(ProcMapError::IntError)?;
        let path = parts.next().and_then(|s| {
            if s.starts_with('/') {

s.starts_with('/').then(|| s.to_string())

Copy link
Member Author

@dave-tucker dave-tucker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: all files reviewed, 10 unresolved discussions (waiting on @alessandrod and @tamird)


aya/src/programs/uprobe.rs line 438 at r3 (raw file):

Previously, tamird (Tamir Duberstein) wrote…

repeating the type - maybe this variant should be called "ProcfsMapsError"?

Done.


aya/src/programs/uprobe.rs line 450 at r3 (raw file):

Previously, tamird (Tamir Duberstein) wrote…

missing period - also, if you intend to separate these sentences into two paragraphs, there should be an empty comment line between them.

Done.


aya/src/programs/uprobe.rs line 476 at r3 (raw file):

Previously, tamird (Tamir Duberstein) wrote…

ditto empty comment line

Done.


aya/src/programs/uprobe.rs line 518 at r3 (raw file):

Previously, tamird (Tamir Duberstein) wrote…

all comments missing periods

Done.


aya/src/programs/uprobe.rs line 259 at r4 (raw file):

Previously, tamird (Tamir Duberstein) wrote…

what does this mean?

Done.


aya/src/programs/uprobe.rs line 261 at r4 (raw file):

Previously, tamird (Tamir Duberstein) wrote…

maybe drop "Error" here too? see comment below

Done.


aya/src/programs/uprobe.rs line 441 at r4 (raw file):

Previously, tamird (Tamir Duberstein) wrote…

can we remove the Error suffix from all the variants? https://rust-lang.github.io/rust-clippy/master/index.html#/enum_variant_names

Clippy isn't flagging this because this enum is pub, but since we're introducing it here, let's follow the advice.

Done.


aya/src/programs/uprobe.rs line 473 at r4 (raw file):

Previously, tamird (Tamir Duberstein) wrote…

why lossy here and below? but also should this just be a map<PathBuf, PathBuf> to avoid the fallible conversions?

I don't think we can be Map<PathBuf, PathBuf>
I need to map a filename (which is a library name) to it's full path...
Assuming I keep the PathBuf, the filename could then be.

let filename = p
                    .file_name()
                    .and_then(|s| s.to_str())
                    .map(|s| s.to_string())
                    .expect("library path should have a filename");

aya/src/programs/uprobe.rs line 475 at r4 (raw file):

Previously, tamird (Tamir Duberstein) wrote…

why .entry.or_insert? do we expect repetitions? are we testing that case?

Done.


aya/src/programs/uprobe.rs line 486 at r4 (raw file):

Previously, tamird (Tamir Duberstein) wrote…

why not return Option<&str> and let the caller clone if necessary?

Done.


aya/src/programs/uprobe.rs line 576 at r4 (raw file):

Previously, tamird (Tamir Duberstein) wrote…

should we check the errors right here rather than parse further? even more succinct would be:

let (address, adddress_end) = next()?.split_once('-').ok_or(ProcMapError::ParseError).and_then(|(start, end)| {
let start = u64::from_str_radix(start, 16)?;
let end = u64::from_str_radix(end, 16)?;
Ok((start, end))
})?;

Done.


aya/src/programs/uprobe.rs line 578 at r4 (raw file):

Previously, tamird (Tamir Duberstein) wrote…

map_err here can go away since you have a #[from]; the try operator will do the right thing.

Done.


aya/src/programs/uprobe.rs line 582 at r4 (raw file):

Previously, tamird (Tamir Duberstein) wrote…

s.starts_with('/').then(|| s.to_string())

Done.

Copy link
Member

@tamird tamird left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 2 of 2 files at r5, all commit messages.
Reviewable status: all files reviewed, 3 unresolved discussions (waiting on @alessandrod and @dave-tucker)


aya/src/programs/uprobe.rs line 473 at r4 (raw file):

Previously, dave-tucker (Dave Tucker) wrote…

I don't think we can be Map<PathBuf, PathBuf>
I need to map a filename (which is a library name) to it's full path...
Assuming I keep the PathBuf, the filename could then be.

let filename = p
                    .file_name()
                    .and_then(|s| s.to_str())
                    .map(|s| s.to_string())
                    .expect("library path should have a filename");

Actually, why even have a map at all? find_library_path_by_name is never able to use anything as a key for a precise lookup. If you want de-duplication, this may as well be a HashSet.


aya/src/programs/uprobe.rs line 588 at r5 (raw file):

        let offset = u64::from_str_radix(next()?, 16)?;
        let dev = next()?;
        let inode = next()?.parse().map_err(ProcMapError::ParseInt)?;

map_err can go away


aya/src/programs/uprobe.rs line 735 at r5 (raw file):

                .find_library_path_by_name("ld-linux-x86-64.so.2")
                .unwrap(),
            Some("/usr/lib64/ld-linux-x86-64.so.2")

what's being tested here? does it pick the first one? the last one? they're all the same.

/// This is read from /proc/`pid`/maps.
///
/// The information here may be used to resolve addresses to paths.
pub struct ProcMap {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Couldn't this be private for now? I understand that USDT might need it public at
some point, but can we flip it public when we merge that?

I don't think we have anything to lose from having it private for now. Instead
if we make it public now and by the time we merge USDT we need to change API for
whatever reason, then we're going to introduce yet another breaking change.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's pub because @addisoncrump asked for it 😉
I'm happy to flip it to priv and let them convince you why it should be pub

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mostly to avoid duplicating code. Since the parsing code won't change (proc maps is standardised) it's good to expose it to users who need to access the proc maps data, esp. since it will be necessary in some uprobes (re: anything involving stacktrace).

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Then I'm going to have to review the API 😋

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's priv now. Once this is in someone can open a PR and we can all bikeshed on the API 😆
My priority is to get the USDT work rebased and reviewable.

This commit refactors the handling of /proc/$pid/maps since the
collection previously assumed that all entries here could be
representeted in a HashMap. Those with a path component are stored
in a HashSet for fast lookup via library name. All other entries
are cached in a Vec to allow for filtering based on offsets, required
for supporting USDT probes.

Signed-off-by: Dave Tucker <dave@dtucker.co.uk>
Copy link
Member

@tamird tamird left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 2 of 2 files at r6, all commit messages.
Reviewable status: all files reviewed, 4 unresolved discussions (waiting on @alessandrod and @dave-tucker)


aya/src/programs/uprobe.rs line 481 at r6 (raw file):

                p.file_name()
                    .and_then(|s| s.to_str())
                    .map(|s| s.starts_with(lib))

can this and the below be map_or(false, ...) to avoid unwrap_or_default? nit, but is more explicit that way


aya/src/programs/uprobe.rs line 488 at r6 (raw file):

                p.file_name()
                    .and_then(|s| s.to_str())
                    .map(|s| {

make this and_then, will remove one of the unwrap_or_default calls

@dave-tucker dave-tucker removed the api/needs-review Makes an API change that needs review label Aug 9, 2023
@mergify mergify bot added the api/needs-review Makes an API change that needs review label Aug 9, 2023
_offset: u64,
_dev: String,
_inode: u32,
path: Option<String>,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The path of the library is not guaranteed to be UTF-8.

}
}

impl FromStr for ProcMap {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is not a valid assumption that the /proc/$pid/map is UTF-8 (and therefore representable by &str), since the library name can be non-UTF-8 (though this is very unlikely, it still is a potential use case).

Copy link
Member

@tamird tamird left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: all files reviewed, 6 unresolved discussions (waiting on @addisoncrump, @alessandrod, and @dave-tucker)


aya/src/programs/uprobe.rs line 457 at r5 (raw file):

Previously, dave-tucker (Dave Tucker) wrote…

It's priv now. Once this is in someone can open a PR and we can all bikeshed on the API 😆
My priority is to get the USDT work rebased and reviewable.

ProcMapError is still pub. @alessandrod TAL?


aya/src/programs/uprobe.rs line 505 at r6 (raw file):

Previously, addisoncrump (Addison Crump) wrote…

It is not a valid assumption that the /proc/$pid/map is UTF-8 (and therefore representable by &str), since the library name can be non-UTF-8 (though this is very unlikely, it still is a potential use case).

Sniped me into looking into this, I took a shot at boiling away all the UTF-8 assumptions in #742.

@mergify
Copy link

mergify bot commented Sep 14, 2023

@dave-tucker, this pull request is now in conflict and requires a rebase.

@mergify mergify bot added the needs-rebase label Sep 14, 2023
Copy link

mergify bot commented Feb 6, 2024

Hey @alessandrod, this pull request changes the Aya Public API and requires your review.

Copy link

mergify bot commented Feb 6, 2024

@dave-tucker, this pull request is now in conflict and requires a rebase.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
api/needs-review Makes an API change that needs review aya This is about aya (userspace) needs-rebase test A PR that improves test cases or CI
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants