Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Parse default Apple deployment target from SDK properties #1009

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

BlackHoleFox
Copy link
Contributor

@BlackHoleFox BlackHoleFox commented Mar 13, 2024

This implements a slimmed-down version of clang's default version parsing to obtain default deployment targets in Apple's ecosystem. As opposed to using xcrun, this gets a correct value in all XCode installations and should be more reliable since its the same thing clang does (which we emulate in some places).

It also means compilation for Apple targets on non-macOS hosts (with the right SDK) should get the right version again too (no more xcrun), even if that's not officially supported.

Closes #1001 and #963

Copy link
Collaborator

@NobodyXu NobodyXu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you!

Would it makes sense to parse them before build and generate a .rs file, like #1004 does?

If that's not possible, then I'd like you to use json-deserializer or tinyjson, instead of putting a json implementation into cc-rs

@BlackHoleFox
Copy link
Contributor Author

@NobodyXu Yeah those could work. I was trying to keep the build time for cc down by going for a minimally-compliant JSON approach.

For a cursory review tinyjson is 1008 lines of source and json-deserializer 507. However json-deserializer doesn't handle Unicode at all so it seems like a bad choice if cc ever needs to read something else that isn't XCode SDKs.

For build times, it looks like the vendored JSON implementation is the fastest. For reference main takes 0.7s total to build on my M1 Max laptop according to cargo build --timings. tinyjson adds 0.15s of time to the build. smoljson's minimal vendoring is ~3x faster to build it seems. tinyjson ends up becoming ~36% of the crates build time :rip:

Here's main:
image

and the with the JSON handlers:

JSON Handling Library cc Build Time main delta Graph
tinyjson 0.9s (N=3) ~0.15s image
Vendored smoljson. 0.6s (N=3). ~0.05s image

@NobodyXu
Copy link
Collaborator

Thanks, so the in-house json implementation is actually the fastest.

@BlackHoleFox Would it be possible to parse them before build and generate a .rs file, like #1004 does?

I know that it's probably a bit hard and might be impossible since the DefaultDeploymentTarget seems to be coupled with the sdk version they actually use.

@NobodyXu
Copy link
Collaborator

NobodyXu commented Mar 17, 2024

@BlackHoleFox If that's not possible, then I suggest further simplify the json parser by cutting unused features.

From the code, it seems that only Reader::read_str_from_object is used, with nesting for macOS_iOSMac.

I'd suggest remove all code except for the one handling objects, null and string, since we don't care about arrays or other stuff.

I'd suggest parsing the entire json file into a

enum Value<'json> {
    Object(BTreeMap<&'json str, Value>),
    String(&'json str),
    Null,
}

Using &str is ok, because the key and value we want does not have escape characters.
Any element that have escape characters, can be just ignored and not handled at all.

We also don't need validation of json, we don't care about that, the json provided by macOS sdk should always be valid.
As long as we can get DefaultDeploymentTarget and VersionMap.macOS_iOSMac, we don't care about other fields (for now).

Parsing the entire file into a Value like that, would also prevent the need to parse the json twice.

@BlackHoleFox
Copy link
Contributor Author

@NobodyXu I can trim it down further, yeah. For string escaping though, I disagree with removing support for that. We can have a minimally-useful JSON parser in cc-rs but escapes are part of the specification so I think they should stay. There are even other fields in the SDK JSON that have escapes.

Do you really think its worth making it basically nothing just to assume the XCode JSON is valid?

@NobodyXu
Copy link
Collaborator

For string escaping though, I disagree with removing support for that. We can have a minimally-useful JSON parser in cc-rs but escapes are part of the specification so I think they should stay. There are even other fields in the SDK JSON that have escapes.

I'm ok keeping the string escapes, I just want to keep cc-rs maintainable.

Last time I tried introducing a jobserver impl into cc-rs and I ended up removing it, so I prefer the json implementation to be simple.

So how about parsing the json into:

enum Value<'json> {
    Object(BTreeMap<Cow<'json, str>, Value>),
    String(Cow<'json, str>),
    Boolean(bool),
    Null,
}

If we ever need to get more fields (e.g. array, integer, float) then we can add that in future, though I'm pretty sure we won't ever need to parse float.

@BlackHoleFox
Copy link
Contributor Author

I can certainly try, but I'm bad at writing parsers so we will see how well "parse this blob into an ::Object" goes.

@cavivie
Copy link
Contributor

cavivie commented Mar 27, 2024

@NobodyXu BTW, do we think that SDKSettings.json always exists in the sdk root path? If not, do we fallback to the xcrun method (xcrun --show-sdk-version), or is there a flag (such as a variable) to control which method cc-rs is forced to use? At the same time, having the control flag means that when we make a mistake in judgment, we can allow downstream to fix the current build faster without having to wait for the cc-rs repair patch.

@NobodyXu
Copy link
Collaborator

That's a really good question that I have no answer with.

cc @BlackHoleFox

@BlackHoleFox
Copy link
Contributor Author

Yes they always exist inside XCode and it's CommandLineTools variants even past what stable Rust supports.

If you use a weird/mangled build environment you can just set the deployment target manually and this codepath will be skipped.

@BlackHoleFox
Copy link
Contributor Author

@NobodyXu I wasn't up to being able to figure out the nested object parsing, apologies.

I did trim down more unused stuff from the JSON parser though so its now <500 lines. I don't really think this code will need touched but if you want to have a try at this feel free. I don't think removing types from the parsers capabilities simplifies anything though because you still have to know what integers etc look like to skip/error on them.

let default_deployment_from_sdk = || {
let is_catalyst = matches!(sdk_parts.arch, AppleArchSpec::Catalyst(_));
// Needs to be both to distinguish between two targets inside the same SDK, like catalyst in the mac SDK.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// Needs to be both to distinguish between two targets inside the same SDK, like catalyst in the mac SDK.
// Needs to distinguish between two targets inside the same SDK, like catalyst in the mac SDK.

.read_str_from_object("DefaultDeploymentTarget", None)
.ok()
.or_else(|| {
self.cargo_output.print_warning(&format_args!(
Copy link
Collaborator

@NobodyXu NobodyXu Apr 15, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There're quite a few map_err that makes them hard to read.

Can you extract them into a function that returns Result, then match on the return value of that function, log the error and convert the result to Option there?

@NobodyXu
Copy link
Collaborator

NobodyXu commented Apr 15, 2024

@NobodyXu I wasn't up to being able to figure out the nested object parsing, apologies.

I did trim down more unused stuff from the JSON parser though so its now <500 lines.

Thank you, cc @thomcc do you have any idea on how we can go further on simplying this json parser?

Copy link
Collaborator

@NobodyXu NobodyXu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wish that we have a large stdlib so we don't have to re-invent the wheels

}

#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Error(());
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you could put a str there to make the error more readable:

Suggested change
pub struct Error(());
pub struct Error(Cow<'static, str>);

src/json.rs Show resolved Hide resolved
}

fn bnext_if(&mut self, b: u8) -> bool {
if self.pos < self.bytes.len() && self.bytes[self.pos] == b {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could use self.bytes.get here

}

fn bnext(&mut self) -> Option<u8> {
if self.pos < self.bytes.len() {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could use self.bytes.get

Ok(acc)
}

fn read_hex_escape(&mut self) -> Result<()> {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function is quite complex, I don't think we need hex, could we simplify this somehow, since we don't care about its value?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Most of the complexity comes from the handling of unpaired surrogates, and turning them into (the correct number of) replacement characters. We can just make that an error.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That'd be nice

}
}

fn skip_line_comment(&mut self) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment stuff is an extension and can be removed.

pub(crate) enum Token<'s> {
Null,
Bool(bool),
NumU(u64),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can just use Num(f64), I think. The i64/u64 is because I had a special use for that code.

NumU(u64),
NumI(i64),
NumF(f64),
StrBorrow(&'s str),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can combine this and the next one as Str(Cow<'s, str>)

Copy link
Contributor

@madsmtm madsmtm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that SDKSettings.json is only available since the macOS 10.14 SDK (not that big of a problem, except that you can actually only compile for 32-bit macOS using a 10.13 SDK). The file SDKSettings.plist has been available since the 10.0 SDK.

Additionally, I think it's a really bad idea to pull in own json parser.

So maybe we could resolve both of these concerns by shelling out to plutil? E.g. the following can be used to extract a version from the version map:

plutil -extract VersionMap.macOS_iOSMac."14\.3" raw /Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX.sdk/SDKSettings.plist

@NobodyXu
Copy link
Collaborator

So maybe we could resolve both of these concerns by shelling out to plutil? E.g. the following can be used to extract a version from the version map:

According to the man page of plutil it's only available in macOS 10.2 and later:

The plutil command first appeared in macOS 10.2.

But it indeed has better support than SDKSettings.json

@BlackHoleFox
Copy link
Contributor Author

Huh indeed, I guess I miscounted supported macOS host versions to XCode releases when I started on this PR.

I will start on replacing this with plutil since that's what people want but this is going to add an external binary dependency for anyone cross-compiling from Linux/BSD -> macOS. It looks like Debian, Arch, etc all have plutil somewhere though. I guess they already have enough edge cases though, maybe?

@BlackHoleFox
Copy link
Contributor Author

They could also just set MACOSX_DEPLOYMENT_TARGET in their build env to skip this parsing but no one actually does that in practice and we can't break that (which is why this PR exists).

@BlackHoleFox
Copy link
Contributor Author

It looks like Debian, Arch, etc all have plutil somewhere though.

This actually seems like its a badly written replacement for plutil and not a complete alternative. All the man pages don't have the extract flag anywhere. If we force people cross-compiling to set MACOSX_DEPLOYMENT_TARGET this becomes a non-issue though. What do you think @NobodyXu? @thomcc. I have the plutil changes made locally and can push them up if we agree on a line in the sand for cross-compilers.

@NobodyXu
Copy link
Collaborator

If we force people cross-compiling to set MACOSX_DEPLOYMENT_TARGET this becomes a non-issue though.

The question is if anyone does cross compiling without setting it though.

I don't have any data for it.

@thomcc
Copy link
Member

thomcc commented Apr 21, 2024

I think it's fine to ask them to set it when cross-compiling. I'd care if we broke it without a way forward (e.g. if we had a hard dep on plutil without an env var workaround), but this seems fine to me.

@NobodyXu
Copy link
Collaborator

plutil seems to be only available in macOS 10.2 and later man page of plutil:

The plutil command first appeared in macOS 10.2.

Maybe we can fallback to the xcrun if plutil's not available?

@madsmtm
Copy link
Contributor

madsmtm commented Apr 22, 2024

The minimum supported macOS version for rustc is 10.12, so I'd say that's a non-issue.

@NobodyXu
Copy link
Collaborator

NobodyXu commented Apr 22, 2024

The minimum supported macOS version for rustc is 10.12, so I'd say that's a non-issue.

Thanks, that's good to hear, though IIRC we got issues submit for macOS older than that, so I'm not 100% sure it is OK. my memory seems to be off, so it should be ok.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
5 participants