Add Value::Struct variant and preserve struct names and formatting #328

cswinter · 2021-10-26T15:08:09Z

This change adds a ~~dreadful hack~~creative workaround that allows us to distinguish between structs and maps during parsing and preserve struct names. It does so by abusing the size_hint method of the MapAccessor as a side channel that exposes information about the struct name and whether we are visiting a struct or a map. I am somewhat confident that, within the context of the current implementation, this is sound and won't unexpectedly blow up in our face. The CommaSeparated struct with the unusual size_hint method is passed directly to the ValueVisitor and drained before we return control flow to any serde code, so anything else that might call the size_hint method will just receive None values. I haven't confirmed this, but with some luck, size_hint will also get inlined after monomorphization and optimized into something efficient.

This makes progress towards #122, fixes #189, and fixes #264.

This change has been arrived at by sheer power of will rather than any kind of understanding of serde so there may be a better solution. There are still some smaller issues with the PR, before fixing those I first wanted to check whether this has any hope of getting merged.

cswinter · 2021-10-27T04:49:35Z

Another hitch I haven't figured out yet is that the serde serialize_struct method expects a name: &'static str which makes it unusable for Value::Struct variants with dynamic names: serde-rs/serde#2043

cswinter · 2021-10-27T05:19:47Z

I suppose it's easy enough to adapt the Serialize::serialize implementation to take a concrete ron::ser::Serializer which can actually have a more permissive lifetime and call that in to_string/to_pretty_string. It does mean that we couldn't actually correctly implement the serde::Serialize trait on Value, not sure much we care about that.

cswinter · 2021-10-27T05:37:19Z

Hmm ok actually a little more involved because of the recursion on the Compounds. Seems pretty manageable but not sure yet what the right approach is.

cswinter · 2021-10-27T06:19:39Z

Ok I think I got it. KeepingValue Serialize seems important if you wanted to e.g. convert ron to yaml, but we can still keep the existing implementation and just serialize Struct as map which retains the existing behavior and is simply the best we can do. And then to_string_pretty can just use an enhanced version of serialize with some moderate amount of code deduplication.

cswinter · 2021-10-27T16:19:43Z

Well, I got it working but since to_string is fully generic we can't actually do something custom for Value, so the only option I see is to have a separate version of the methods just for Value. Which could still be useful but somewhat unsatisfying, not sure if there's a way to salvage this.

Clippy

Struct names

kvark · 2021-11-02T03:12:38Z

Sorry, I've been punting on reviewing this.
@torkleyy is this something you'd be comfortable reviewing?

cswinter · 2021-11-02T03:30:26Z

So I did get this to work and am using it in https://github.com/cswinter/pyron to (de)serialize Python dataclasses as ron, but there are still some issues with this that I'm not sure yet if they can be fixed. Both the ValueVisitor used in the Deserialize implementation of Value, and the Deserializer fail to conform to the expected serde interface and are public which might cause some issues if you use them in combination with a different Deserialize/Deserializer.
Particularly, if you try to deserialize something else (e.g. yaml) into a ron::Value, the ValueVisitor won't behave correctly and do weird things if it receives size hints. We could possibly fix this by further changing the ron Deserializer to emit a sentinel value of usize::MAX as the first size_hint (which would never be emitted by a valid Deserializer) and allows the ValueVisitor to work normally if it doesn't see this sentinel. However, this would now break the Deserializer, and actually, my current implementation might already be broken for deserializing into structs. There could still be some way around this but I'm not sure yet if that's the way to go. It might be better to just write a new parser from scratch that is not tied to serde, which I'm considering anyway since it enables some other features that are not currently possible.

The serialization implementation that allows preserving struct names/tuples is in better shape. It requires some amount of code duplication, and unfortunately needs an explicit call to a new Value::to_string/to_pretty_string method so we can call our custom serialize method that doesn't quite work like the serde one, but I'm not aware of any glaring soundness issues. It's a lot less useful with the corresponding deserialization code of course.

cswinter · 2021-11-02T03:47:05Z

For now, my main use case for this is https://github.com/cswinter/pyron, which doesn't use ron in a way that would run into the remaining issues. I would like to fix this properly at some point but still need to decide on the right approach and exactly what features I want so I'm planning to just let this stew for a while. The main approaches I'm considering:

come up with one more clever idea to make this hack work
implement custom deserializer (and maybe also pretty printer) which isn't tied to serde
get serde to add some new methods for us that allow an implementation without hacks (I think this could be done in a way that doesn't break existing methods by adding alternative variants with dynamic lifetime etc. which have a default implementation that returns a specific error. Existing serializer would just never call these, and our serializer could call these to use the extra functionality when available, or revert to the current behavior for Deserializer/Deserialize implementations that don't support this functionality.)

If anyone has any other good ideas or opinions on how this should be approached I'd be glad to hear about them.

torkleyy · 2021-11-02T08:56:58Z

Sorry, I've been punting on reviewing this. @torkleyy is this something you'd be comfortable reviewing?

I might be able to next week, feel free to ping me if I forget.

github-actions · 2022-01-01T15:37:04Z

Issue has had no activity in the last 60 days and is going to be closed in 7 days if no further activity occurs

kvark · 2022-01-06T04:34:16Z

@torkleyy ping

juntyr · 2023-08-20T16:03:15Z

I've been thinking about a more AST-like Value type again and this is a sketch of what a NewValue might look like:

pub enum NewValue<S: std::borrow::Borrow<str>> {
    Unit,
    Bool(bool),
    Number(Number),
    Char(char),
    String(String),
    Bytes(Vec<u8>),
    List(Vec<NewValue<S>>),
    Map(NewMap<S>),
    NamedUnit {
        name: S,
    },
    Newtype(Box<NewValue<S>>),
    NamedNewtype {
        name: S,
        inner: Box<NewValue<S>>,
    },
    Tuple(Vec<NewValue<S>>),
    NamedTuple {
        name: S,
        elems: Vec<NewValue<S>>,
    },
    Struct {
        fields: FieldMap<S>,
    },
    NamedStruct {
        name: S,
        fields: FieldMap<S>,
    }
}

#[cfg(not(feature = "indexmap"))]
type NewMap<S> = std::collections::BTreeMap<NewValue<S>, NewValue<S>>;
#[cfg(feature = "indexmap")]
type NewMap<S> = indexmap::IndexMap<NewValue<S>, NewValue<S>>;

#[cfg(not(feature = "indexmap"))]
type FieldMap<S> = std::collections::BTreeMap<S, NewValue<S>>;
#[cfg(feature = "indexmap")]
type FieldMap<S> = indexmap::IndexMap<S, NewValue<S>>;

I think the biggest questions are:

how to represent Option, since the unwrap newtype variants feature includes Some and thus all of the following are allowed: None, Some(val), Some() = Some(()), Some(a, b) = Some((a, b)), and Some(a: a, b: b) = Some((a: a, b: b)). On a syntactic level, None is just a named ident and Some is just a named newtype / tuple / struct with special handling on serde's side
how to handle extensions - maybe we could support attributes more generally around every value subtree?
how to handle newtypes, since ron sometimes erases them

rlidwka · 2023-12-25T11:52:23Z

I've been thinking about a more AST-like Value type again

Is there any progress made on this?

I'd like to have lossless Value, which I can have multiple attempts at trying to deserialize from for untagged enums. Trying to make something like this work:

#[derive(Debug, serde::Serialize, serde::Deserialize)]
enum A {
    A1(String),
    A2(String),
}

#[derive(Debug, serde::Serialize, serde::Deserialize)]
enum B {
    B1(String),
    B2(String),
}

#[derive(Debug, serde::Serialize, serde::Deserialize)]
#[serde(untagged)]
enum MyEnum {
    A(A),
    B(B),
}

let value = MyEnum::A(A::A1("Hello, world!".to_string()));

// valid RON serializations are:
// A1("")
// A2("")
// B1("")
// B2("")

// works fine with serde_json
let json = serde_json::to_string_pretty(&value).unwrap();
dbg!(serde_json::from_str::<MyEnum>(&json).unwrap());

// but doesn't work with ron
let ron = ron::ser::to_string_pretty(&value, ron::ser::PrettyConfig::default()).unwrap();
dbg!(ron::de::from_str::<MyEnum>(&ron).unwrap());

juntyr · 2023-12-25T15:59:18Z

Is there any progress made on this?

Not yet, unfortunately. Once I am back from parental leave, I will first tackle finally publishing v0.9, which includes a list of all the weird serde attribute cases that we currently cannot support because serde treats every format like JSON. If serde were to ever change that (and e.g. impl Trait in type aliases in traits makes that technically possible), an improved version of Ron’s Value type (as outlined above) would be used to fix most of these cases.

While it will take me a few more months to get back to this PR, feel free to implement something similar to my idea sketched out above (essentially an AST-like Value type that allows round trips of RON [without serde attributes] but is more an AST than a Value enum since e.g. named structs and enum variants are ambiguous in RON when we are not given type information by the user).

cswinter and others added 7 commits October 27, 2021 19:31

Move struct_names from Serializer to PrettyConfig

c48d805

Update changelog

7780d22

Fix some clippy warnings

631c0b2

Fix some more clippy warnings

e611b27

Merge pull request #1 from cswinter/clippy

a9b23e1

Clippy

Merge pull request #2 from cswinter/struct-names

0ce1426

Struct names

Add Value::Struct variant and preserve struct names and formatting

3669ecf

cswinter force-pushed the struct-hack branch 2 times, most recently from 80e938a to 91fdaf1 Compare October 28, 2021 03:23

Add Value::to_string{,pretty} methods that preserve structs

b85bf64

cswinter force-pushed the struct-hack branch from 91fdaf1 to b85bf64 Compare October 28, 2021 04:11

github-actions bot added the stale label Jan 1, 2022

github-actions bot removed the stale label Jan 6, 2022

juntyr marked this pull request as draft January 28, 2022 17:21

juntyr mentioned this pull request Aug 15, 2022

Value meta-issue #397

Open

17 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add Value::Struct variant and preserve struct names and formatting #328

Add Value::Struct variant and preserve struct names and formatting #328

cswinter commented Oct 26, 2021 •

edited

cswinter commented Oct 27, 2021

cswinter commented Oct 27, 2021 •

edited

cswinter commented Oct 27, 2021 •

edited

cswinter commented Oct 27, 2021

cswinter commented Oct 27, 2021 •

edited

kvark commented Nov 2, 2021

cswinter commented Nov 2, 2021 •

edited

cswinter commented Nov 2, 2021

torkleyy commented Nov 2, 2021

github-actions bot commented Jan 1, 2022

kvark commented Jan 6, 2022

juntyr commented Aug 20, 2023

rlidwka commented Dec 25, 2023

juntyr commented Dec 25, 2023

Add Value::Struct variant and preserve struct names and formatting #328

Are you sure you want to change the base?

Add Value::Struct variant and preserve struct names and formatting #328

Conversation

cswinter commented Oct 26, 2021 • edited

cswinter commented Oct 27, 2021

cswinter commented Oct 27, 2021 • edited

cswinter commented Oct 27, 2021 • edited

cswinter commented Oct 27, 2021

cswinter commented Oct 27, 2021 • edited

kvark commented Nov 2, 2021

cswinter commented Nov 2, 2021 • edited

cswinter commented Nov 2, 2021

torkleyy commented Nov 2, 2021

github-actions bot commented Jan 1, 2022

kvark commented Jan 6, 2022

juntyr commented Aug 20, 2023

rlidwka commented Dec 25, 2023

juntyr commented Dec 25, 2023

cswinter commented Oct 26, 2021 •

edited

cswinter commented Oct 27, 2021 •

edited

cswinter commented Oct 27, 2021 •

edited

cswinter commented Oct 27, 2021 •

edited

cswinter commented Nov 2, 2021 •

edited