Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Does rust-imap parse things twice? #78

Open
jonhoo opened this issue Nov 23, 2018 · 2 comments
Open

Does rust-imap parse things twice? #78

jonhoo opened this issue Nov 23, 2018 · 2 comments
Labels
help wanted Extra attention is needed performance For when we want to make things go fast! tombstone A reference to an old issue on mattnenterprise/rust-imap
Milestone

Comments

@jonhoo
Copy link
Owner

jonhoo commented Nov 23, 2018

It seems like a common pattern in the codebase is to use run_command_and_read_response, and then parse its output. However, read_response already fully parses the response, its just that the results of the parsing seems to be discarded https://github.com/mattnenterprise/rust-imap/blob/master/src/client.rs#L521.

Assuming my reading of the code is correct, is there a reason it's laid out like this, as opposed to read_response returning a Vec of the parsed responses or similar?

See the original issue here: mattnenterprise/rust-imap#78

@jonhoo jonhoo added tombstone A reference to an old issue on mattnenterprise/rust-imap enhancement New feature or request help wanted Extra attention is needed labels Nov 23, 2018
@jonhoo jonhoo added this to the imap 1.0 milestone Nov 23, 2018
@jonhoo jonhoo added performance For when we want to make things go fast! and removed enhancement New feature or request labels Nov 24, 2018
@Freyert
Copy link

Freyert commented Oct 14, 2023

This seems like an interesting thing to look into, but first what are we looking into?

We can see every time run_command_and_read_response is called we are generally executing another parsing function on the returned response.

rg -i -A3 run_command_and_read_response

rust-imap/src/client.rs

Lines 353 to 358 in 5d0d2f9

pub fn capabilities(&mut self) -> Result<Capabilities> {
// Create a temporary channel as we do not care about out of band responses before login
let (mut tx, _rx) = mpsc::channel();
self.run_command_and_read_response("CAPABILITY")
.and_then(|lines| Capabilities::parse(lines, &mut tx))
}

read_response parses the response once to assess if there were any errors (all the way to the end), and then Capabilities::parse parses the entire response again to identify the capabilities therein. (I believe)

The ideal situation would be able to compose different parsers together and iterate over the stream just once. That's just my current understanding, but I haven't written too much Rust yet so still getting my bearings.

read_response_onto being the one responsible for reading through the whole response once. After which subsequent parsers will iterate through the whole response again.

rust-imap/src/client.rs

Lines 1521 to 1525 in 5d0d2f9

pub(crate) fn read_response_onto(&mut self, data: &mut Vec<u8>) -> Result<usize> {
let mut continue_from = None;
let mut try_first = !data.is_empty();
let match_tag = format!("{}{}", TAG_PREFIX, self.tag);
loop {

@Freyert
Copy link

Freyert commented Oct 18, 2023

imap-proto makes heavy use of nom and streaming parsing to handle parsing the streams.

It's likely that we can get rid of readline by building on nom as well:

rust-imap/src/client.rs

Lines 1606 to 1621 in 5d0d2f9

pub(crate) fn readline(&mut self, into: &mut Vec<u8>) -> Result<usize> {
use std::io::BufRead;
let read = self.stream.read_until(LF, into)?;
if read == 0 {
return Err(Error::ConnectionLost);
}
if self.debug {
// Remove CRLF
let len = into.len();
let line = &into[(len - read)..(len - 2)];
eprintln!("S: {}", String::from_utf8_lossy(line));
}
Ok(read)
}

Theoretically there's a part of the stream that is "general control", some that's "specific control for the request", and then finally the data we want to parse. We would want to be able to compose these parsers as so: general_control(specific_control(data)) and that should do parsing in one pass.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
help wanted Extra attention is needed performance For when we want to make things go fast! tombstone A reference to an old issue on mattnenterprise/rust-imap
Projects
None yet
Development

No branches or pull requests

2 participants