Replacement for String::from_utf8 #73

Open
Nugine opened this issue Jan 15, 2023 · 4 comments

Comments

@Nugine

Nugine commented Jan 15, 2023

Currently there is no safe replacement for String::from_utf8 in simdutf8. I think it would be easy to add a function for this.

@dralley

dralley commented Jul 27, 2023

That would be effectively the same as simdutf8::compat::from_utf8(value).map(|s| s.to_owned()), yes?

Note that there was some discussion in the past about putting it in the standard library directly: https://www.reddit.com/r/rust/comments/mvc6o5/incredibly_fast_utf8_validation/

@Nugine
Author

Nugine commented Jul 27, 2023

Thanks for the answer!

@Nugine Nugine closed this as completed Jul 27, 2023
@Nugine
Author

Nugine commented Jul 27, 2023

Ah I forgot the original problem.
String::from_utf8 converts a Vec<u8> to a String with validation, reusing the allocation. simdutf8, however, can validate a slice but cannot consume a Vec. To avoid an extra copy, you have to validate the slice and then call the unsafe String::from_utf8_unchecked yourself. So there's still no safe replacement for that.
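
To make the difference concrete, here is a minimal sketch of the two options. The function names are made up for illustration, and std::str::from_utf8 is only a stand-in for a slice validator such as simdutf8::compat::from_utf8 so that the snippet builds without extra crates (simdutf8 has its own error types, so real signatures would differ).

```rust
use std::str::Utf8Error;

// Stand-in for a borrowed validator such as `simdutf8::compat::from_utf8`.
// Assumption: `std::str::from_utf8` is used only to keep the sketch std-only.
fn validate(input: &[u8]) -> Result<&str, Utf8Error> {
    std::str::from_utf8(input)
}

/// Safe, but allocates a second buffer and copies the validated bytes into it.
pub fn to_string_copying(input: &[u8]) -> Result<String, Utf8Error> {
    validate(input).map(|s| s.to_owned())
}

/// Reuses the `Vec`'s allocation, but needs `unsafe` in user code.
pub fn into_string_no_copy(input: Vec<u8>) -> Result<String, Utf8Error> {
    validate(&input)?;
    // SAFETY: `validate` just confirmed that `input` is valid UTF-8,
    // and `input` has not been modified since.
    Ok(unsafe { String::from_utf8_unchecked(input) })
}
```

Comparing the buffer pointer before and after conversion shows which variant keeps the original allocation: the copying variant ends up with a new buffer, the second one with the same one.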

@Nugine Nugine reopened this Jul 27, 2023
@Vrtgs

Vrtgs commented Sep 16, 2023

Looking into the implementation of from_utf8, this should be quite easy to add:

#[inline]
pub fn from_utf8(input: &[u8]) -> Result<&str, Utf8Error> {
    unsafe {
        validate_utf8_basic(input)?;
        Ok(from_utf8_unchecked(input))
    }
}

and we just add

pub mod string {
    pub use super::*;

    #[inline]
    pub fn from_utf8(input: Vec<u8>) -> Result<String, Utf8Error> {
        unsafe {
            validate_utf8_basic(&input)?;
            Ok(String::from_utf8_unchecked(input))
        }
    }
}
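
One thing a drop-in replacement might also want: std's String::from_utf8 returns a FromUtf8Error that hands the Vec back via into_bytes(), whereas the version above drops the buffer on failure. Below is a rough sketch of that variant. The error type and the function name from_utf8_owned are hypothetical, and std::str::from_utf8 again stands in for the SIMD validator so the snippet compiles on its own.

```rust
use std::str::Utf8Error;

/// Hypothetical error type, loosely modeled on `std::string::FromUtf8Error`.
#[derive(Debug)]
pub struct FromUtf8Error {
    bytes: Vec<u8>,
    error: Utf8Error,
}

impl FromUtf8Error {
    /// Gives the rejected buffer back to the caller.
    pub fn into_bytes(self) -> Vec<u8> {
        self.bytes
    }

    /// Returns the underlying validation error.
    pub fn utf8_error(&self) -> Utf8Error {
        self.error
    }
}

/// Validates `input` in place and converts it without copying.
pub fn from_utf8_owned(input: Vec<u8>) -> Result<String, FromUtf8Error> {
    // Assumption: stand-in validator; a SIMD validator would be called here.
    // The `&str` is discarded so no borrow of `input` outlives this line.
    let check = std::str::from_utf8(&input).map(|_| ());
    match check {
        // SAFETY: the bytes were just validated and have not changed since.
        Ok(()) => Ok(unsafe { String::from_utf8_unchecked(input) }),
        Err(error) => Err(FromUtf8Error { bytes: input, error }),
    }
}
```

Whether returning the bytes is worth the larger error type is a design choice; it mostly matters to callers that want to fall back to lossy decoding or reuse the buffer.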
