Should we add support to ignore cycles on serialization? #40099

Closed
Jozkee opened this issue Jul 29, 2020 · 51 comments · Fixed by #46101
Assignees
Labels
api-approved API was approved in API review, it can be implemented area-System.Text.Json Cost:S Work that requires one engineer up to 1 week feature-request Priority:1 Work that is critical for the release, but we could probably ship without
Milestone

Comments

@Jozkee
Member

Jozkee commented Jul 29, 2020

Background and Motivation

Even though we have covered the blocking issue of not being able to (de)serialize reference loops (#30820, #29900) by adding ReferenceHandler.Preserve to S.T.Json, there are many requests for an option equivalent to Json.NET's ReferenceLoopHandling.Ignore.

Motivations for doing this are:

  • ReferenceHandler.Preserve may be too cumbersome and increases payload size.
  • ReferenceHandler.Preserve creates JSON incomprehensible by serializers other than S.T.Json and Json.NET.

Proposed API

namespace System.Text.Json.Serialization
{
    public abstract partial class ReferenceHandler
    {   
        public static ReferenceHandler Preserve { get; }
+        public static ReferenceHandler IgnoreCycle { get; }
    }
}

Usage Examples

class Node
{
    public string Description { get; set; }
    public Node Next { get; set; }
}

void Test()
{
    var node = new Node { Description = "Node 1" };
    node.Next = node;
    
    var opts = new JsonSerializerOptions { ReferenceHandler = ReferenceHandler.IgnoreCycle };
    
    string json = JsonSerializer.Serialize(node, opts);
    Console.WriteLine(json); // Prints: {"Description":"Node 1"}. 
    // Note property "Next" being ignored due to cycle detection.

}

Alternative Designs

This new API is being added to the ReferenceHandler class because it can be considered an alternative way of dealing with references, one that is scoped more narrowly to the circularity problem during serialization.

Comparison with Newtonsoft.Json

void Test()
{
    var node = new Node { Description = "Node 1" };
    node.Next = node;
    
    var settings = new JsonSerializerSettings { ReferenceLoopHandling = ReferenceLoopHandling.Ignore };
    
    string json = JsonConvert.SerializeObject(node, settings);
    Console.WriteLine(json); // Prints: {"Description":"Node 1"}.
}

Comparison with existing ReferenceHandler.Preserve setting in S.T.Json

void Test()
{
    var node = new Node { Description = "Node 1" };
    node.Next = node;
    
    var opts = new JsonSerializerOptions { ReferenceHandler = ReferenceHandler.Preserve };
    
    string json = JsonSerializer.Serialize(node, opts);
    Console.WriteLine(json); // Prints: {"$id":"1","Description":"Node 1","Next":{"$ref":"1"}}. 
}

Risks

One downside is that users are unable to implement their own ReferenceHandler and cannot make their own "Ignore Cycle" handler. The discrimination between preserve and ignore would occur with an internal flag.

Concerns of adding this feature (users must be aware of these problems when opting-in for it):

  • Silent loss of data, given that the object where the loop is detected would be omitted from the JSON (see example: #40099 (comment)).
  • Unable to round-trip data. The JSON may differ depending on the order of serialization (e.g. object properties and dictionary elements enumeration is non-deterministic).
@Jozkee Jozkee added this to the 5.0.0 milestone Jul 29, 2020
@Jozkee Jozkee self-assigned this Jul 29, 2020
@Dotnet-GitSync-Bot Dotnet-GitSync-Bot added the untriaged New issue has not been triaged by the area owner label Jul 29, 2020
@Jozkee Jozkee added this to Backlog in System.Text.Json - 6.0 via automation Jul 29, 2020
@Jozkee
Member Author

Jozkee commented Jul 29, 2020

From #30820 (comment):

I have to agree with many others concerning Ignore - and I'd like to try to add some arguments. Unfortunately video is not optimal for me, so I might have missed good arguments in the videos (but I've read through this issue/comments at least).

My arguments are based on that Preserve is the only good alternative. My reasoning for that: [JsonIgnore] is not feasible whenever you have a two-way relationship and you sometimes need to fetch either type of object (and you want the related objects with it, but not the self-references). Except for that there doesn't seem to be a good workaround?

  • Like @diegogarber pointed out, it breaks backwards compatibility and can induce significant additional work (e.g. frontend wise)
  • It causes a significantly larger payload. The original issue's first two examples (Ignore vs Preserve) are pretty clear (~twice the amount of lines). For Web API scenarios (which might be the most common scenario for serializing JSON), that's pretty horrendous (even if the added stuff is quite short).
  • The JSON structure is not obvious from a frontend perspective - which the Ignore structure is (imo). For scenarios with Ignore where you really need the references, you could easily identify that specific call (again, assuming a Web API) and do your ViewModel-layer instead. The Preserve structure on the other hand needs parsing - specifically turning lists (arrays) into collections (objects) will be quite confusing.

To sum up: To break backwards compatibility, cause a significantly larger payload, as well as breaking the structure by introducing a middle layer (as in Subordinates not being a list/array of values), should be reason enough to consider having Ignore.

I believe the StackOverflow issue mentioned reflects the community's stand on this - the only reasonable way forward (unless for specific scenarios where you really need the C#/serialization performance) is to continue with Newtonsoft. To me, that should be seen as a sign that this is not a reasonable way forward. It's been a long wait between 3.0 -> 5.0 (we're still waiting after all), and still not having this resolved will definitely continue to stir up frustration.

@Jozkee
Member Author

Jozkee commented Jul 29, 2020

From #30820 (comment)

The problem with Ignore is that you end-up losing data, this will probably cause an issue even more severe than having a large payload or having to do extra parsing in the front-end.

e.g:

static void Main()
{
    var angela = new Employee { Name = "Angela" };
    var bob = new Employee { Name = "Bob", Manager = angela };

    angela.Subordinates = new List<Employee> { bob };

    var settings = new JsonSerializerSettings { ReferenceLoopHandling = ReferenceLoopHandling.Ignore };
    string json = JsonConvert.SerializeObject(bob, settings);
    Console.WriteLine(json);
    // {"Name":"Bob","Manager":{"Name":"Angela","Manager":null,"Subordinates":[]},"Subordinates":null}

    Employee bobCopy = JsonConvert.DeserializeObject<Employee>(json);
    Employee angelaCopy = bobCopy.Manager;

    Console.WriteLine("Number of subordinates for angela: " + angela.Subordinates.Count);
    // Number of subordinates for angela: 1
    Console.WriteLine("Number of subordinates for angelaCopy: " + angelaCopy.Subordinates.Count);
    // Number of subordinates for angelaCopy: 0
    Console.WriteLine(angelaCopy.Subordinates.Count == angela.Subordinates.Count);
    // False
}

@cjblomqvist How is it that similar scenarios are not a concern for you or everyone else asking for Ignore?

@Jozkee
Member Author

Jozkee commented Jul 29, 2020

From #30820 (comment)

@Jozkee thanks for your reply!

I believe a scenario which is probably quite common is that you have let's say 100 types, with various relationships between them. Some contain many, some not so many. Basically all are related to at least one other type one way or another, and you have a REST-like Web API to get them from a Web/Mobile application. To map out all relationships in each scenario and make sure you handle all self-referencing scenarios is quite cumbersome. So what happens is, the user is asking for a main type A, and then you ensure you also get all the related types that you believe the "user" might need, such as B, C, D, E, F, G (E, F, G might be sub-relationships). You do not filter this properly (i.e. you do not project out properties not needed) because of not wanting to "waste" time on it (as well as laziness). To map out this tree:

A
A -> B
A -> C
A -> D
A -> B -> E
A -> B -> E -> F
A -> D -> G

So far, all good. The problem is when you also have the following relationships:

C -> A
E -> A
D -> A
G -> A

It's cumbersome to properly keep track of them and filter them out (e.g. through JsonIgnore - which also has the problem of filtering out the relationships in all scenarios, not only for this particular call for A). It would also be cumbersome to handle it using Preserve due to A) the wasted space/size to transmit over the network (might be less of a concern in the above scenario since obviously we're not so picky with what we're including and not), and B) the data structure not mapping in a fully obvious way to JSON, but in a way quite specific to the new JSON library in .NET.
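To make the JsonIgnore drawback concrete, here is a minimal sketch. It reuses the Employee shape from the example earlier in the thread; the `JsonIgnoreDemo` class and its `Run` method are hypothetical names introduced only for illustration. The attribute removes `Manager` from every serialization call, including the ones where the caller actually wants it.

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json;
using System.Text.Json.Serialization;

// Entity with a two-way relationship, modeled on the Employee example above.
public class Employee
{
    public string Name { get; set; }

    // Breaks the Manager <-> Subordinates cycle, but for every call,
    // not only for the calls where the cycle would actually occur.
    [JsonIgnore]
    public Employee Manager { get; set; }

    public List<Employee> Subordinates { get; set; }
}

// Hypothetical demo class, not part of any library.
public static class JsonIgnoreDemo
{
    public static (string ManagerJson, string SubordinateJson) Run()
    {
        var angela = new Employee { Name = "Angela" };
        var bob = new Employee { Name = "Bob", Manager = angela };
        angela.Subordinates = new List<Employee> { bob };

        // Both calls succeed because the back-reference is never written.
        return (JsonSerializer.Serialize(angela), JsonSerializer.Serialize(bob));
    }

    public static void Main()
    {
        var (managerJson, subordinateJson) = Run();
        Console.WriteLine(managerJson);     // Angela with Bob nested under Subordinates
        Console.WriteLine(subordinateJson); // Bob alone, with no Manager property at all
    }
}
```

Serializing Angela works as intended, but a call that fetches Bob and wants his manager included gets nothing, which is the "filters out the relationship in all scenarios" problem described above.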

Then, to finally answer your question: The "user" (frontend consumer) simply does not care about the additional relationships [C, E, D, G] -> A - it only wants to know the original tree. The loss of data is irrelevant - because the relationship properties are not relevant.

I'm not saying that Ignore is perfect by any means, but it is darn convenient to simply add it to a project and then not having to think about it anymore. Something like Preserve could be very useful, but the current implementation is too cumbersome for most cases (with all the downsides listed above) - it doesn't fulfil that simple-and-convenient-albeit-not-100%-correct scenario. I do understand that there are reasons behind Preserve being the way it is, so I'm not saying that Preserve is wrong. What I'm saying is that Preserve is good for one thing (100% correctness while still handling self-referencing), while Ignore is good for another (simple/convenient, but not 100% correct in some cases).

I do believe a lot of people in the community feel the same.

Something in between Ignore and Preserve might be very useful, especially one that doesn't break the assumed data structure (to avoid the frontend parsing/unexpected data structure for lists/arrays) while still not losing data, but I understand that there are difficulties/complexities with this, so Ignore might be good enough to solve the relevant actual real world scenarios.

@Jozkee Jozkee removed the untriaged New issue has not been triaged by the area owner label Jul 29, 2020
@danmoseley
Member

@Jozkee is this required to ship 5.0 (per milestone set here)?

@Jozkee
Member Author

Jozkee commented Jul 30, 2020

@danmosemsft I think it could disappoint many users if we don't ship this on 5.0.

@danmoseley
Member

@Jozkee maybe, but master is a 6.0 branch in less than 3 weeks: if we add features, we're implicitly deciding to leave bugs un-fixed.

That's a general statement - I don't have context on this issue or JSON in general so it would be a good discussion to have with @ericstj .

@Jozkee Jozkee removed this from Backlog in System.Text.Json - 6.0 Jul 30, 2020
@Jozkee Jozkee modified the milestones: 5.0.0, 6.0.0 Jul 30, 2020
@ericsampson

@Jozkee @ericstj FWIW when I heard that STJ would have loop handling in 5.0, I just assumed that would include something along the lines of Ignore. It would be disappointing to not have something along those lines in 5.0
Also FWIW I don't love the Newtonsoft options. Like another commenter mentioned above, it would be great to have an in-between option that didn't suffer the data loss of Ignore; one option that seems great space-wise would be to only emit reference tags where they are needed, although that might require backtracking/rewriting?

@snow-jallen

What's the verdict? Is this going to ship in .net 5?

@ericstj
Member

ericstj commented Sep 30, 2020

As per the milestone this was set to 6.0.0. The feedback around the importance of this came in after feature complete for 5.0 and didn't meet the bar. We will aim to add support here in 6.0.

@cjblomqvist

cjblomqvist commented Oct 1, 2020

@ericstj - even though I can understand it's difficult to prioritize - everyone of course wants to have their piece in every release (which is not possible if wanting to meet the deadlines). I do want to set the record straight though, because unless I'm missing something

the feedback around the importance of this came in after feature complete for 5.0

doesn't ring very true for me.

First off, I'll assume everybody here knows about the significant outcry that came after releasing 3.0 a year ago about not having feature parity and easy compatibility on this critical feature (JSON handling). Note that it's not only about feature parity, but also about compatibility. Like expressed later, it's simply not fully complete to do a different solution with significant drawbacks, causing a lot of rework for anyone with a web app sending JSON relying on EF with any self-referencing loops anywhere (should be one of the most common applications out there with .NET Core?). Ok, so it was quite clear from the beginning - but of course, feedback about the specific solution didn't come from the start (it didn't exist, so not possible).

According to your releases, you reached "feature complete" with Preview 8, released on Aug 25th (https://devblogs.microsoft.com/dotnet/announcing-net-5-0-preview-8/). You were not "feature complete" (almost, but not fully), with Preview 7 (see first section of https://devblogs.microsoft.com/dotnet/announcing-net-5-0-preview-7/) on Jul 21st.

Ok. Jozkee seems to have announced his plan/proposal on May 5th (#30820 (comment)). The first comment about ignore came 6 days later on May 11th (#30820 (comment)). No reply. Then after that another person commented on Jul 14th, quite strongly and imo clearly stating the issue (#30820 (comment)). No reply. Then I had a try at it in perhaps a more structured approach on Jul 27th (#30820 (comment)), which finally ended up with this issue right here after a quick back-n-forth with Jozkee.

So, unless I'm very much missing something, it's simply not true that

the feedback around the importance of this came in after feature complete for 5.0

To quote diegogarber (again, from Jul 14th)

@Jozkee can we PLEASE get the Ignore???
A simple scenario:
Let's say that you have your application that's using newtonsoft.json (as most people do) and you swap to system.text.json. What happens? well, you get loop errors. You swap it to preserve and now your front end does NOT WORK!!!!!

To me, that's pretty clear.

If you want to push it to 6.0.0 at this point in time I understand (since we're so close to release). I understand it's challenging to do such great work in an open environment with the community involved. But please, do not blame the lateness of the feedback from the community.

@ericstj
Member

ericstj commented Oct 1, 2020

Sorry @cjblomqvist I didn't mean to blame anyone here. I was stating the facts about the decision made in this issue with its specific dates. On July 15 we branched dotnet/runtime and started stabilizing for Preview8.

Note that it's not like this aspect of the feature wasn't discussed in the design review. My understanding was that there was push back in the design review against the option of letting callers of the serialization API silently drop members from types which they don't own through the cycle-ignore option. @Jozkee calls this out at the top of this issue and it's something that will likely still need to be discussed when bringing this change in to 6.0. We are hearing the community that it's important to reconsider this decision and will get to it. We don't want to rush that decision. The new API needs to be designed and approved. Even once this is approved it will need testing to ensure it meets folks' needs, provides adequate performance, and meets security guarantees. We don't like to rush this sort of thing due to the high-compatibility bar we hold for components in dotnet/runtime, thus our early lock-downs.

I truly apologize that this new 5.0 feature is not meeting your needs, and I hope we can improve it in 6.0.

@ericsampson

@ericstj thanks for keeping the dialog open.
I'm not sure of a good way to have the following considered during the design discussions @Jozkee, but I'd appreciate considering offering two new options:

  1. option equivalent to Newtonsoft 'ignore', for compat/upgrade scenarios.
  2. a new option (PreserveSlim? PreserveMinimal?) that only emits reference tags where they are needed, vs Preserve that emits them everywhere. I understand that this might be slower, but would result in a lot less going over the wire and also a friendlier format for human consumption. It seems like a reasonable option between Ignore and Preserve to allow people to choose.

Thanks!!

@cjblomqvist

Thanks for your reply @ericstj - communication is definitely important and something that easily goes wrong in written form over the Internet.

I do understand consideration needs to be taken, and it's good you added insight about the branching point for Preview 8 on July 15th. I guess the overall negativity on this stems from the overall disappointment of not being able to migrate to this for 2 years (3.0 -> 6.0), for what looks like (probably wrongly) a quite small feature.

Isn't it ironic that one of the major hurdles for getting this implemented is the implications of needing to keep backwards compatibility on this, when basically the whole issue stems from exactly backwards compatibility with Newtonsoft.... :)

@hairlesshobo

hairlesshobo commented Oct 2, 2020

Quote from #30820 (comment)

Regarding ReferenceHandling.Ignore....

I don't see why this wouldn't be useful. Sure, on deserialisation it could throw up some odd results, but I think for serialisation it's actually quite helpful. The example use case in the very first comment is a perfect example of why it would be useful. I think it would be ideal to have something like this, because attributes like [JsonIgnore] don't really help, as that would just stop that property from serialising ever.

Currently people are suggesting actively dumping System.Text.Json, as can be seen in this question on Stack, where 3 of the 4 answers suggest using Newtonsoft.Json, which has the ignore functionality built in. So people are actively gimping the performance of their application to get a fairly basic feature working. Newtonsoft is slow, but at least it works.

EDIT: For reference, I've been using preview 3, I'm yet to try preview 4 but if preserve achieves essentially the same thing without throwing the cycling exception then that would probably be good.

I just want to throw in my two cents on this one. I decided to go down the route of porting my application from Newtonsoft.Json for the sake of performance. I mean, after all, that is one of the huge selling points of System.Text.Json. I invested hours converting everything over only to find out at the end, when I was thoroughly testing everything, that I couldn't serialize objects that were pulled by EF core. Talk about a surprise to me!! I even tried the ReferenceHandler.Preserve functionality of 5.0 rc1 and serialization still bombs out due to the circular references.

Anyways, the solution I had to come up with was to use System.Text.Json for the input formatter for MVC because it can handle data coming in just fine, but stick with the slower Newtonsoft.Json as the output formatter, just so I can serialize data from EF core. NOT the ideal solution, but hopefully I will at least get performance benefits on POST/PUT methods.

In short, come on, pushing ReferenceHandler.Ignore to 6.0? I feel this is a pretty basic need and something that, from what I have read, many others just assumed would be available, for the sake of feature parity. Especially since EF core and System.Text.Json are both Microsoft projects, it would seem reasonable that the two could easily be used together.

Please consider adding this to a minor update to 5.0 instead of making us all wait another year to be able to use this new JSON serializer!! Please...

@butler1233

As per the milestone this was set to 6.0.0. The feedback around the importance of this came in after feature complete for 5.0 and didn't meet the bar. We will aim to add support here in 6.0.

I'm sorry but I just straight up disagree with this. @Jozkee highlighted in his initial post on #30820 a perfectly valid use case for .Ignore (which for some reason went unnoticed apparently) back in September 2019, over a year ago!

Discussion ensued in that issue for a while, and for some reason at some point it was decided that ignore wasn't required, despite many community contributions specifically requesting it. Eventually I too commented (#30820 (comment)) requesting it, and that was in May. Many of the comments above are responses which came after then.

There was a clear issue with the lack of Ignore for months, and multiple people in the community have been unable to experience the improvements that come with STJ because of this catastrophically broken part of the "mostly drop in replacement" for Newtonsoft, which requires introducing a load of additional parsing (if using Preserve and all the extra stuff that comes with it). I've been asked numerous times what I think about STJ and I've had to say that although overall it is a significant improvement over Newtonsoft, if there's any reliance on Newtonsoft's Ignore then it's totally unusable, so the only real option is to nerf the applications which would otherwise use it and use Newtonsoft instead.

@cjblomqvist's comment from July (which was already copied above) is an excellent rationale for why we needed this.

Realistically, if .NET 5 is "the future", shooting everyone who uses ignore in the foot is a terrible starting position. It should have been worked in months ago (well, it should have been there from the start), and putting it off until 6 is absolutely ridiculous. @hairlesshobo is just one of many who have been frustrated by this.

Ranting aside, I understand the rationale for not having it, as maybe it shouldn't have been there in the first place. Unfortunately that error was made many years ago, and in the interests of backwards compatibility the code which enables some "bad" practices should be in place.

That said, in what is probably most instances where this is a problem (where people are serialising EF objects), the "lost" data isn't relevant, as it's usually a child referring to its parent anyway, which referred to its child in the first place.

@ericstj
Member

ericstj commented Oct 5, 2020

A bit more on schedule so folks understand where we are at now. We switched gears from feature development to bug-fixing around mid-July when we branched for Preview8. At that point it was all-hands on deck to get bugs fixed and bring 5.0 up to ship quality. We haven't been doing feature work since then, and new APIs and design changes are feature work (more on that below). We're now pretty much done with bug-fixing for 5.0 and have a chance to start looking at features again.

one of the major hurdles for getting this implemented is the implications of needing to keep backwards compatibility on this, when basically the whole issue stems from exactly backwards compatibility with Newtonsoft

System.Text.Json does not aim to be compatible with Newtonsoft.Json. We're specifically trying to be different to provide a different value proposition. Those differences are actually what drives folks to consider using System.Text.Json instead. We are trying to make System.Text.Json work for as many folks as possible, but ease of adoption is different than compatibility. I don't see a backwards compatibility problem with adding this feature, it should be behind an opt-in flag. I think the main hurdle for this feature was/is design principles.

(which for some reason went unnoticed apparently) back in September 2019, over a year ago!

I respectfully disagree. I see plenty of discussion from @Jozkee @ahsonkhan @steveharter and @JamesNK that acknowledges the scenario. It was also discussed in the API review. There was a good understanding of ReferenceLoopHandling.Ignore and what its purpose was. There's no conspiracy to suppress this feature. It was an intentional design decision. Sometimes those are even harder to undo, since you need to revisit a decision that was already made, but that is exactly what we're doing here. Please help us make the right call this time around.

If it were me trying to present the case for this feature I'd share things like:

  1. Cost or non-viability of the JsonIgnore workaround. (I already see some of this, thank you @cjblomqvist)
  2. Usage numbers of the Newtonsoft Ignore feature.
  3. Precedent in other serializers (other than Newtonsoft.Json) that support this or some similar behavior. Can be .NET, browsers, etc. Anything to establish precedent for this functionality.
  4. Precedent in the existing System.Text.Json API that permits a caller to modify the contract of a type which they don't own (EG: you can use a custom converter to do this today)
  5. Precedent in the existing System.Text.Json where an option will prevent round-tripping or silent lossy serialization.

I bet this type of data would help the API reviewers better understand the tradeoffs when making this decision. I am not asking the community to provide all this, but just sharing some suggestions for how this issue should proceed.

Next steps are an API proposal and a review. Assuming that goes through, the feature should be implemented in the main branch codebase (6.0). Once that's done, we can talk about porting it to the servicing branch consumable via the package, once we see what the final design looks like and understand the risk. I can make no promises as we typically do not permit API additions in servicing, but since this also ships as a NuGet package we can technically make it work. I suspect the viability of a servicing release here depends on risk.

Thanks everyone.

@butler1233

Thanks Eric for your input.

I just want to clarify that I wasn't just railing against the project/maintainers/contributors, but was trying to establish the view for me (and those who I've spoken to in favour of Ignore) about the situation.

I personally am not even a fan of the current implementation that exists in Json.NET. While I do make use of it, and agree with @JamesNK that it definitely shouldn't be the default because it could in theory cause data to be lost, I also think that there is some way of implementing behaviour similar to what the other implementation did, while maybe doing it more "safely".

I wouldn't want to be the one putting forward the case for the API myself because although I have a few years under my belt, I think things like this might be a bit above me.

I think the core issue with the [JsonIgnore] workaround, as @cjblomqvist has already shown, is that it's not quite the desired result in (what I believe to be) the main "selling point" of the feature, which is the ability to serialise EntityFramework objects (with navigation properties), as it ruins the ability to serialise that property in any scenario, like he said.

There are elements of the Preserve that I like, although I could see incompatibilities between producers and consumers if they aren't aware of the same metadata properties and what they mean. However in instances where parsing is the same on either end, I do like how it's fully reference aware.

There are a couple of rough ideas off the top of my head for approaches to this.

  1. Add the ignore in ReferenceHandling - Restores compatibility with people porting from Newtonsoft, so it's entirely transparent to consumers. Does allow data to be "lost" though.
  2. Add some way of enabling reference loops to be ignored in a "one time" type way - This option would restore compatibility with consumers on calls which could have reference loops, acknowledged by the need to explicitly allow on each call to serialise that references may be dropped from the payload. I personally think this would be the best for users (although porting might be a pain on the producer side), but would most likely be a breaking and/or somewhat more major change than the other options for the current APIs.
  3. Do nothing - While this works as Preserve already seems to do the job for users who currently do this with a metadata-aware Json deserialiser on each end, it may disappoint some users who use the Json data for cross compatibility, as consumers on other platforms may be spooked by the metadata.

I think 2 is the best option because while I think that there is usually no reason to have something like this, for the example @cjblomqvist had, where it's for display purposes only and the data isn't expected to be saved back anywhere after deserialisation, it's not the worst plan. I think the user can understand that in the ignore example on the original #30820 (comment), Angela would obviously be a subordinate to her manager.

I suppose as always, it depends, as in that scenario it's clear from context. I think Ignore should definitely exist, but maybe more as an "exception to the rule" type deal, and maybe not something that can be set globally like it can in Json.NET.

Hope that all makes sense. :)

@steveharter
Member

The main scenario as I understand:

  • EF is being used to obtain entities that need to be serialized to JSON.
  • The entities are serialized to JSON which is used by a client or UI that doesn't want to repeat the duplicated information.
  • The JSON that is missing is not important, and it's OK to have silent-data-loss.

The main question I have is whether the POCO types are general-purpose meaning used for both the scenario above as well as for other data-interchange scenarios.

I assume the POCOs are general-purpose based on the feedback that [JsonIgnore] can't be used on navigation properties since sometimes the POCOs should be serialized with references (probably using "Preserve" feature). For this to work, I assume there would be a JsonSerializerOptions instance specific for the client\UI scenario above that would use the new "Ignore" feature and another JsonSerializerOptions instance used for data-interchange scenarios that would probably use the "Preserve" option.

So as a potential work-around, would using DTOs specific for this scenario (that would not have the properties that should be ignored) work? One could argue that using DTOs in this manner is a best practice. The counter-argument is that this is cumbersome and\or the existing Newtonsoft semantics are perfectly fine.
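As a rough illustration of the DTO workaround, the sketch below projects the Employee entity from earlier in the thread into a flat shape with no back-references. `EmployeeDto`, `DtoDemo`, and `ToDto` are hypothetical names made up for this example, not anything from EF or System.Text.Json.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.Json;

// Entity with a two-way relationship (Manager <-> Subordinates).
public class Employee
{
    public string Name { get; set; }
    public Employee Manager { get; set; }
    public List<Employee> Subordinates { get; set; }
}

// Hypothetical DTO for the client/UI scenario: related entities are
// flattened to names, so no object cycle can exist in the graph.
public class EmployeeDto
{
    public string Name { get; set; }
    public string ManagerName { get; set; }
    public List<string> SubordinateNames { get; set; }
}

public static class DtoDemo
{
    public static EmployeeDto ToDto(Employee e) => new EmployeeDto
    {
        Name = e.Name,
        ManagerName = e.Manager?.Name,
        SubordinateNames = (e.Subordinates ?? new List<Employee>())
            .Select(s => s.Name)
            .ToList()
    };

    public static string Run()
    {
        var angela = new Employee { Name = "Angela" };
        var bob = new Employee { Name = "Bob", Manager = angela };
        angela.Subordinates = new List<Employee> { bob };

        // Default options suffice: the DTO graph is acyclic.
        return JsonSerializer.Serialize(ToDto(bob));
    }

    public static void Main()
    {
        Console.WriteLine(Run());
        // {"Name":"Bob","ManagerName":"Angela","SubordinateNames":[]}
    }
}
```

The cost is exactly the one raised in this thread: one hand-written projection per entity and per scenario, which is what people asking for Ignore are trying to avoid.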

Assuming DTOs are not the answer, for 6.0 I believe we can tweak the design of ReferenceResolver. The reference handling feature was designed to be extensible by creating a class deriving from ReferenceResolver and specifying that instance on JsonSerializerOptions. An updated design that understands "ignoring" would make it possible to implement a resolver that does exactly what Newtonsoft does, or some other implementation such as a deterministic design that looks for a new custom attribute like [IgnoreForUI] on properties plus a way to set that UI mode on the options.
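For context, this is roughly what the existing extension point looks like in 5.0: a minimal sketch of a custom resolver, assuming .NET 5 or later. `PrefixedReferenceResolver`, `PrefixedReferenceHandler`, and the "node-N" id format are hypothetical; only ReferenceHandler.CreateResolver and the three ReferenceResolver overrides are real API. Note that it can only change how ids are assigned; nothing in the current public surface lets a resolver tell the serializer to skip a value.

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json;
using System.Text.Json.Serialization;

public class Node
{
    public string Description { get; set; }
    public Node Next { get; set; }
}

// Hypothetical resolver that hands out ids such as "node-1".
public sealed class PrefixedReferenceResolver : ReferenceResolver
{
    private int _count;
    private readonly Dictionary<string, object> _idToObject =
        new Dictionary<string, object>();
    // Reference equality, so two equal-but-distinct objects get distinct ids.
    private readonly Dictionary<object, string> _objectToId =
        new Dictionary<object, string>(ReferenceEqualityComparer.Instance);

    // Called during deserialization when an "$id" is read.
    public override void AddReference(string referenceId, object value)
    {
        if (!_idToObject.TryAdd(referenceId, value))
            throw new JsonException($"Duplicate reference id '{referenceId}'.");
    }

    // Called during serialization for each object; reports whether it was seen before.
    public override string GetReference(object value, out bool alreadyExists)
    {
        alreadyExists = _objectToId.TryGetValue(value, out string id);
        if (!alreadyExists)
        {
            id = "node-" + (++_count);
            _objectToId.Add(value, id);
        }
        return id;
    }

    // Called during deserialization when a "$ref" is read.
    public override object ResolveReference(string referenceId)
    {
        if (!_idToObject.TryGetValue(referenceId, out object value))
            throw new JsonException($"Unknown reference id '{referenceId}'.");
        return value;
    }
}

public sealed class PrefixedReferenceHandler : ReferenceHandler
{
    public override ReferenceResolver CreateResolver() => new PrefixedReferenceResolver();
}

public static class ResolverDemo
{
    public static string Run()
    {
        var node = new Node { Description = "Node 1" };
        node.Next = node;

        var opts = new JsonSerializerOptions { ReferenceHandler = new PrefixedReferenceHandler() };
        return JsonSerializer.Serialize(node, opts);
    }

    public static void Main() => Console.WriteLine(Run());
}
```

As far as I can tell, a custom handler always gets Preserve-style "$id"/"$ref" output, which is why an "ignore" mode would need either an internal flag or the redesign suggested above.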

Also, for 6.0, STJ should consider addressing the deterministic issues that have a workaround in Newtonsoft. See #1085.

@Jozkee Jozkee added api-ready-for-review API is ready for review, it is NOT ready for implementation api-suggestion Early API idea and discussion, it is NOT ready for implementation labels Dec 17, 2020
@terrajobst
Member

terrajobst commented Jan 5, 2021

Video

  • The concern would be to ensure that the output is deterministic regardless of how we're traversing the object graph. Is this guaranteed? If that's the case, this feature seems reasonable.
  • When a list contains a cycle, we should write out a null value instead of just omitting the value (as that's consistent with the ignore-null behavior and doesn't change indices).
namespace System.Text.Json.Serialization
{
    public partial class ReferenceHandler
    {   
        // Existing:
        // public static ReferenceHandler Preserve { get; }
        public static ReferenceHandler IgnoreCycle { get; }
    }
}

@terrajobst terrajobst added api-approved API was approved in API review, it can be implemented and removed api-ready-for-review API is ready for review, it is NOT ready for implementation labels Jan 5, 2021
@ericsampson

@terrajobst FWIW here's our use case for this functionality, just because there were a number of questions about this on the video:

"So we have a central structured logging service where users can send arbitrary C# objects, and then our service writes them to several targets (Elasticsearch, APM, etc) and these targets often want the payload to be JSON. So we need to be able to transform arbitrary user-created objects whose structure is out of our control into JSON. It is not uncommon at all for these objects to contain circular references (e.g. due to Entity Framework). In order to prevent exceptions when serializing these arbitrary user-created objects to JSON, we use Ignore with Newtonsoft and hence have not been able to switch these services to STJ.

In this use case, the inability of Ignore to round-trip (which has been mentioned upthread as a reason to not offer this behavior) is a complete non-issue. We just need to be able to serialize the object into something human-readable in all cases without the serialization blowing up, even if there is some information loss - because the end consumer of the JSON is human eyeballs in Elastic/APM/etc. It's better for us to get some information to the user in the cases where they passed circular objects, than having to drop the user's records entirely."

@jeffhandley jeffhandley added the Priority:2 Work that is important, but not critical for the release label Jan 14, 2021
@steveharter
Member

This feature seems specific to a handful of related but somewhat different scenarios. It may be safer and more flexible to improve the existing extensibility model to support ignore and provide various samples around that.

The concern would be to ensure that the output is deterministic regardless of how we're traversing the object graph. Is this guaranteed? If that's the case, this feature seems reasonable.

I was under the assumption that the behavior could be non-deterministic in some cases, but that doesn't appear to be the case at least with Newtonsoft. Ignore only applies to the exact moment a cycle is detected. Thus:

  • A given object can be serialized more than once.
  • Cycles are broken by detecting when a child object is a reference to a parent (which is already being serialized).

The result is that a cyclic graph is converted into an acyclic one, assuming a root of course.

Here's a Newtonsoft test with nodes A, B and C that all reference each other, with A being the root.

    using System;
    using Newtonsoft.Json;

    class Program
    {
        static void Main(string[] args)
        {
            // Uncommenting the line below changes reflection order.
            // typeof(Node).GetProperty("Ref2");

            var a = new Node { Name = "a" };
            var b = new Node { Name = "b" };
            var c = new Node { Name = "c" };

            a.Ref1 = b;
            a.Ref2 = c;

            b.Ref1 = a;
            b.Ref2 = c;

            c.Ref1 = a;
            c.Ref2 = b;

            string json = JsonConvert.SerializeObject(a, Formatting.Indented, new JsonSerializerSettings
            {
                ReferenceLoopHandling = ReferenceLoopHandling.Ignore,
            });

            Console.WriteLine(json);
        }
    }

    public class Node
    {
        public string Name { get; set; }
        public Node Ref1 { get; set; }
        public Node Ref2 { get; set; }
    }

Output:

{
  "Name": "a",
  "Ref1": {
    "Name": "b",
    "Ref2": {
      "Name": "c"
    }
  },
  "Ref2": {
    "Name": "c",
    "Ref2": {
      "Name": "b"
    }
  }
}

B and C are each serialized twice, but the root A is serialized only once. B->A and C->A are ignored because in those cases the "child" node A is still being serialized as an ancestor.

If the reflection order changes (uncomment the line that mentions this), the data is still the same, just in a different order:

{
  "Ref2": {
    "Ref2": {
      "Name": "b"
    },
    "Name": "c"
  },
  "Name": "a",
  "Ref1": {
    "Ref2": {
      "Name": "c"
    },
    "Name": "b"
  }
}
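The rule behind both outputs (skip a reference only while its target is still on the serialization stack) can be sketched without any serializer. This is only an illustration of the idea under that assumption, not Newtonsoft's actual implementation; CycleSketch and Render are hypothetical names, the Node type repeats the one from the test above, and ReferenceEqualityComparer requires .NET 5 or later.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public class Node
{
    public string Name { get; set; }
    public Node Ref1 { get; set; }
    public Node Ref2 { get; set; }
}

public static class CycleSketch
{
    // Renders the graph as nested text such as "a(b(c),c(b))", skipping any
    // reference whose target is an ancestor that is still being written.
    public static string Render(Node root)
    {
        var sb = new StringBuilder();
        var onStack = new HashSet<object>(ReferenceEqualityComparer.Instance);
        Write(root, onStack, sb);
        return sb.ToString();
    }

    private static void Write(Node node, HashSet<object> onStack, StringBuilder sb)
    {
        onStack.Add(node); // node is now an ancestor of everything below it
        sb.Append(node.Name);

        var children = new List<string>();
        foreach (Node child in new[] { node.Ref1, node.Ref2 })
        {
            // A child that is null or already on the stack is ignored.
            if (child is null || onStack.Contains(child)) continue;
            var childText = new StringBuilder();
            Write(child, onStack, childText);
            children.Add(childText.ToString());
        }

        if (children.Count > 0)
        {
            sb.Append('(').Append(string.Join(",", children)).Append(')');
        }

        // Leaving the node: it may be written again on another branch.
        onStack.Remove(node);
    }

    public static void Main()
    {
        var a = new Node { Name = "a" };
        var b = new Node { Name = "b" };
        var c = new Node { Name = "c" };
        a.Ref1 = b; a.Ref2 = c;
        b.Ref1 = a; b.Ref2 = c;
        c.Ref1 = a; c.Ref2 = b;
        Console.WriteLine(Render(a)); // a(b(c),c(b))
    }
}
```

Because a node is removed from the set when it is finished, B and C can each appear on two branches, while A never reappears since it stays on the stack for the whole traversal.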

@steveharter
Member

I assume the design will cover:

  • Value type semantics (do we use Equals()?)
  • Reference type semantics: whether Equals() or ReferenceEquals() is used.
    • I assume we want ReferenceEquals() since that is what we use for preserve reference handling.
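A small sketch of why the choice matters for cycle detection, assuming .NET 5 or later for ReferenceEqualityComparer. Tag and EqualitySketch are hypothetical names; Tag is a record, so it has value-based Equals(). With value equality, two distinct instances that compare equal would be mistaken for the same object and reported as a cycle.

```csharp
using System;
using System.Collections.Generic;

// A record gets value-based Equals() and GetHashCode() from the compiler.
public record Tag(string Name);

public static class EqualitySketch
{
    // "Already seen" check using Equals(): value-equal objects collide.
    public static bool SeenByValue(object first, object second)
    {
        var seen = new HashSet<object> { first };
        return seen.Contains(second);
    }

    // "Already seen" check using reference identity only.
    public static bool SeenByReference(object first, object second)
    {
        var seen = new HashSet<object>(ReferenceEqualityComparer.Instance) { first };
        return seen.Contains(second);
    }

    public static void Main()
    {
        var x = new Tag("x");
        var y = new Tag("x"); // distinct instance, equal by value

        Console.WriteLine(SeenByValue(x, y));     // True: a false "cycle"
        Console.WriteLine(SeenByReference(x, y)); // False
        Console.WriteLine(SeenByReference(x, x)); // True: the same object
    }
}
```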

@Jozkee Jozkee added the Cost:S Work that requires one engineer up to 1 week label Jan 14, 2021
@ericstj ericstj added Priority:1 Work that is critical for the release, but we could probably ship without and removed Priority:2 Work that is important, but not critical for the release labels Jan 20, 2021
@layomia layomia added this to Targeting preview 2 - 03/11 in System.Text.Json - 6.0 Jan 26, 2021
@layomia layomia removed this from Targeting preview 2 - 03/11 in System.Text.Json - 6.0 Jan 27, 2021
@Jozkee
Member Author

Jozkee commented Feb 1, 2021

Sending back to api-ready-for-review in order to discuss a more suitable name for the API:

1. IgnoreCycle: Original proposal.
2. IgnoreCycles: From #46101 (comment) one could read Preserve as PreserveDuplicates, which would be plural too.
3. BreakCycles: Given that we are breaking cycles by emitting the JSON null token and to be clear about the behavior diverging from Newtonsoft.
4. EmitNullForCycles: From #40099 (comment).

@Jozkee Jozkee added api-ready-for-review API is ready for review, it is NOT ready for implementation and removed api-approved API was approved in API review, it can be implemented labels Feb 1, 2021
@ericsampson

What about something like

EmitNullForCycles

@Jozkee
Member Author

Jozkee commented Feb 1, 2021

@ericsampson sounds like a reasonable name that exactly describes what the API does.

@bartonjs
Member

bartonjs commented Feb 4, 2021

Changed to plural via email:

namespace System.Text.Json.Serialization
{
    public partial class ReferenceHandler
    {   
        // Existing:
        // public static ReferenceHandler Preserve { get; }
        public static ReferenceHandler IgnoreCycles { get; }
    }
}
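For reference, a minimal usage sketch of the approved shape, assuming a runtime that ships ReferenceHandler.IgnoreCycles (.NET 6 or later). TreeNode and IgnoreCyclesSketch are hypothetical names. It covers both cases discussed in the review: a property that points back to an ancestor, and a list entry that does.

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json;
using System.Text.Json.Serialization;

public class TreeNode
{
    public string Name { get; set; }
    public TreeNode Parent { get; set; }
    public List<TreeNode> Children { get; set; } = new();
}

public static class IgnoreCyclesSketch
{
    public static string Serialize(TreeNode root)
    {
        var options = new JsonSerializerOptions
        {
            // Write null wherever a reference would close a cycle.
            ReferenceHandler = ReferenceHandler.IgnoreCycles
        };
        return JsonSerializer.Serialize(root, options);
    }

    public static void Main()
    {
        var root = new TreeNode { Name = "root" };
        var child = new TreeNode { Name = "child", Parent = root };
        root.Children.Add(child); // child.Parent points back to root
        root.Children.Add(root);  // a list entry that is itself a cycle

        // Serialization completes without throwing; the back-reference and
        // the cyclic list entry are both written as JSON null.
        Console.WriteLine(Serialize(root));
    }
}
```

Writing null for the list entry rather than dropping it keeps the indices of the remaining elements unchanged, which is the behavior the API review asked for.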

@bartonjs bartonjs added api-approved API was approved in API review, it can be implemented and removed api-ready-for-review API is ready for review, it is NOT ready for implementation api-suggestion Early API idea and discussion, it is NOT ready for implementation labels Feb 4, 2021
@ghost ghost removed the in-pr There is an active PR which will close this issue when it is merged label Feb 18, 2021
@dotnet dotnet locked as resolved and limited conversation to collaborators Mar 20, 2021