Use #[repr(C)] HList's to infer type-erased fmt fn pointers in format_args!'s static data. #44343

eddyb · 2017-09-05T16:27:00Z

Right now format_args! uses, e.g. ArgumentV1::new(&runtime_data, Debug::fmt) (for {:?}), at runtime, using up two pointers per argument at runtime instead of just one (&runtime_data).

With allow_internal_unsafe and #44240, we can place the (e.g. Debug::fmt) fn pointers in (rvalue-promoted) 'static data, the remaining hurdle is how to infer the type of the runtime data.
That is, Debug::fmt is really <_ as Debug>::fmt and that _ is right now inferred because of ArgumentV1::new's signature typing them together. If they're separate, we need something new.

I propose using the HList pattern (struct HCons<H, T>(H, T); struct HNil; - so for 3 elements, of types A, B and C you'd have HCons<A, HCons<B, HCons<C, HNil>>>), with #[repr(C)], which would give it a deterministic layout which matches that of an array, that is, these two:

&'static HCons<fn(&A), HCons<fn(&B), HCons<fn(&C), HNil>>>
&'static [unsafe fn(*const Opaque); 3]

have the same representation, and the latter can be unsized into a slice. This transformation from HList to array (and then slice) can be performed on top of a safe, rvalue-promoted HCons, which is a necessary requirement for moving the fn pointers into 'static data at all.

For inference, we can simply insert some function calls to match up the types, e.g. to infer B we could dofmt::unify_fn_with_data((list.1).0, &b), which would makeB into typeof b.

It might actually be simpler to have a completely safe "builder" interface, which combines the HList of formatters with a HList of runtime references, unifying the types, but I'm a bit worried about compile-times due to all the trait dispatch - in any case, the impact should be measured.

The text was updated successfully, but these errors were encountered:

m-ou-se · 2020-11-05T00:49:33Z

@rustbot claim

eddyb · 2020-11-05T01:03:46Z

For inference, we can simply insert some function calls to match up the types, e.g. to infer B we could do fmt::unify_fn_with_data((list.1).0, &b), which would make B into typeof b.

Not sure what I was thinking there, it should be much easier than that!

struct ArgMetadata<T: ?Sized> {
    // Only `unsafe` because of the later cast we do from `T` to `Opaque`.
    fmt: unsafe fn(&T, &mut Formatter<'_>) -> Result,
    // ... flags, constant string fragments, etc.
}

// TODO: maybe name this something else to emphasize repr(C)?
#[repr(C)]
struct HCons<T, Rest>(T, Rest);

// This would have to be in a "sealed module" to make it impossible to implement on more types.
trait MetadataFor<D> {
    const LEN: usize;
}
impl MetadataFor<()> for () {
    const LEN: usize = 0;
}
impl<'a, T: ?Sized, D, M> MetadataFor<HCons<&'a T, D>> for HCons<ArgMetadata<T>, M>
    where M: MetadataFor<D>
{
    const LEN: usize = M::LEN;
}

impl<'a> Arguments<'a> {
    fn new<M, D>(meta: &'a M, data: &'a D) -> Self
        where M: MetadataFor<D>
    {
        Self {
            meta: unsafe { &*(meta as *const _ as *const [ArgMetadata<Opaque>; M::LEN]) },
            data: unsafe { &*(data as *const _ as *const [&Opaque; M::LEN]) },
        }
    }
}

i.e. we build two HLists "in parallel", one with entirely constant metadata, and the other with the references to the runtime data, and then all of the type inference can come from the where clause on fmt::Arguments::new, with zero codegen cruft!

EDIT: @m-ou-se had to remind me why I went with the explicit inference trick in the first place: random access arguments 😞
(maybe with enough const generics abusage we could have D: IndexHList<i, Output = T> but that's a lot of effort)

m-ou-se · 2020-11-05T10:20:02Z

I have a new implementation of fmt::Arguments that is only two pointers in size by using a new form of 'static metadata that contains both the string pieces and any formatting options (if any). (So it now fits in a register pair, which is really nice.) It also only requires one pointer on the stack per argument instead of two, as @eddyb apparently already suggested three years ago in this issue. ^^ (Completely missed this issue until @eddyb pointed it out yesterday. ^^')

What's still left is updating format_args!() to produce this new type instead, which will run into the problem of splitting the object pointer and function pointer (which are currently together in ArgumentV1), as the function pointer should now go in the 'static metadata instead. The suggestion in this issue looks like a good way to do that. Will try to implement that soon. Looking forward to the perf run :)

m-ou-se · 2023-12-05T17:26:26Z

@lcnr shared a fun hack idea: using loop { break a; break b; } to get a (const-promotable) value a but get the type from b.

That'd allow this approach:

&(
    loop { break Display::fmt; break fmt_signature(&arg0); },
    loop { break Display::fmt; break fmt_signature(&arg1); },
    loop { break Display::fmt; break fmt_signature(&arg2); },
    loop { break Display::fmt; break fmt_signature(&arg3); },
)

Where fmt_signature is written as:

fn fmt_signature<T>(&T) -> fn(&T, &mut Formatter) -> fmt::Result {
    loop {}
}

joboet · 2023-12-05T17:55:47Z

I proposed the exact same solution as above because I was to lazy to read the issue before posting...

alexcrichton added C-enhancement Category: An issue proposing an enhancement or a PR with one. T-libs-api Relevant to the library API team, which will review and decide on the PR/issue. labels Sep 7, 2017

jonas-schievink added the I-heavy Issue: Problems and improvements with respect to binary size of generated code. label Aug 19, 2019

rustbot assigned m-ou-se Nov 5, 2020

m-ou-se added the A-fmt Area: std::fmt label Nov 5, 2020

m-ou-se mentioned this issue Jul 7, 2022

Tracking issue for improving std::fmt::Arguments and format_args!() #99012

Open

46 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Use #[repr(C)] HList's to infer type-erased fmt fn pointers in format_args!'s static data. #44343

Use #[repr(C)] HList's to infer type-erased fmt fn pointers in format_args!'s static data. #44343

eddyb commented Sep 5, 2017 •

edited

m-ou-se commented Nov 5, 2020

eddyb commented Nov 5, 2020 •

edited

m-ou-se commented Nov 5, 2020

m-ou-se commented Dec 5, 2023 •

edited

joboet commented Dec 5, 2023 •

edited

Use #[repr(C)] HList's to infer type-erased fmt fn pointers in format_args!'s static data. #44343

Use #[repr(C)] HList's to infer type-erased fmt fn pointers in format_args!'s static data. #44343

Comments

eddyb commented Sep 5, 2017 • edited

m-ou-se commented Nov 5, 2020

eddyb commented Nov 5, 2020 • edited

m-ou-se commented Nov 5, 2020

m-ou-se commented Dec 5, 2023 • edited

joboet commented Dec 5, 2023 • edited

eddyb commented Sep 5, 2017 •

edited

eddyb commented Nov 5, 2020 •

edited

m-ou-se commented Dec 5, 2023 •

edited

joboet commented Dec 5, 2023 •

edited