-
Notifications
You must be signed in to change notification settings - Fork 700
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Implement codegen configuration for field merging #2560
Comments
Thanks for the report @pm-dev!
It's likely that your operations are hitting some sort of edge case that is causing some serious issues here.
This definitely should not be the case. The new codegen is expected to be 10x smaller than the old codegen actually. It's likely that you are doing some very complex fragment merging. We should be able to handle this, and all of our tests thus far have shown positive results. As far as the size of generated code exploding this is likely the cause, and I already intend to fix that with #2191 (coming as soon as possible!).
But it should not be hanging during codegen... that's for sure. If you can send us some examples to debug this with, that would be incredible. Please reach out to me at anthony@apollographql.com and we can talk about how to do that. |
Thanks for the reply. I've sent over a project which will allow you to repro. |
Received! I'll look into this today and update you ASAP! I'll also ping you if I need any other info. |
@pm-dev Just wanted to give you an update. Profiling with your schema and queries has been really helpful. None of the tests we've run have had so many queries with so many nested fragments. The nested fragments are causing a lot of interesting problems. We've had teams at large companies testing the code gen, but none of them had the amount of nested fragments that you do right now. As far as memory goes, we've got a few memory leaks, but its more so that the memoization of processing nested fragments is going crazy with your operations. For the processing time... that's an even harder one to solve. Because of the nested fragments, we are doing a lot of calculations for the merging of fields. We've done a lot to optimize that (see memoization above for one example, haha). But it's just a lot to process. I was concerned that this might be an issue for some users, but we've had some pretty big projects run codegen just fine. Yours is the first that is causing this to happen. I'm going to continue to dig into this and see if I can find some clever ways to make this better. Because of the merging of fragment fields, the generated code is getting pretty huge. The concept of "hoisted types" that I hope to implement in #2191 will recognize when we can re-use an already generated type because it will have the same merged fields. I'm not sure if that will cause running the code gen to go slower or faster. I'm going to explore that soon. We have a couple of high-priority bugs that are blocking a lot of other users who don't have the complex nested fragments that you have. (This situation I expect to be relatively rare). So we are going to work on getting those out and a 1.0.1 patch version released to un-block people. I will come back to this and resolve all these issues as soon as we are done there. I'll let you know when I have more to report. |
@AnthonyMDev Thanks for the update! The prioritization of items for the 1.0.1 patch makes total sense. When you get back to this let me know how I can help. I'm a little curious the difference in "algorithm" vs the previous js codegen so in the meantime I'll get caught up on the difference in approach for v1 |
I don't really know that this is an anti-pattern. It makes sense why you are using them for your use case, and for many others honestly. The biggest difference here is the way we are merging fragment fields into the generated operation objects. We wanted to make it easier for you to access fragment fields from the generated models without having to convert back and forth from operation model <-> fragment when you wanted to access fields from other fragments. This improves ease of use, but does generate more field accessors on each model. Because we made the generated code for each field more concise (1 line per field instead of 5), this reduces generated code size in most cases. But in your case, the number of merged fields is exploding the generated code, causing massive processing times and massive generated files. We're working on improvements to this, but in 1.1, it's likely we also give you the option to just turn off fragment field merging. You'll only be able to access the fields from a given fragment by converting to that fragment, but your generated code will be more reasonably sized. |
Ah this makes sense, we're currently not merging fragment fields. Although merging in fields makes working with the data types more fluid, for our use-case where we are immediately translating GQL types into a different data model (and thus only accessing each field in one place), it's not that big of a deal. |
Yeah, one of the primary goals here was that you wouldn't need to translate the generated models into your own data models if they are easier to use! But if you are translating to REST models, this makes perfect sense. Once I add in that functionality, you'll have considerably smaller models. ;) Will be getting to it ASAP. I'm am on vacation at the end of this month, so I'm going to try to get there before I leave, otherwise it will be done sometime in November... |
+1 for this from @jimisaacs in #2625 |
+1 from me, this is currently stopping us from migrating to 1.0 |
Curious what the plan is to add this option back or any ETA |
This is planned to be one of the next large stories I take on! We have been fixing a bunch of smaller blockers, and then the holidays hit. There are a few large-ish features I want to add in versions 1.1-1.3, this is #3 on that list. 1.1: Generated operation model creation
1.2: Reduce generated schema types
1.3: Improve fragment merging and code generation performance
|
Hey there all, in response to the comment @pm-dev made here on our roadmap updates, I wanted to give you all an update on this issue. First off, thank you all for the feedback! This feature does keep moving down the list, and I'm disappointed in that as well. I can promise you that all of our decisions about prioritization of work are very heavily impacted by community feedback, so it's really valuable to get this signal from all of you that the feature is important to you. I know that a number of users have been waiting on this for a long time. The reason we've reprioritized was that the volume of requests for selection set initializers was noticeably higher. The work for selection set initializers ended up requiring a significant architectural change to our underlying data models, and it's exposed some bugs and limitations we've been working through. We're just trying to do what we can to positively impact and unblock the most people with the time and engineering resources we have available. I am genuinely invested in getting this feature off the ground as soon as we can. |
Thanks for the update and context @AnthonyMDev! |
Now that 1.3 is out. It now looks like the roadmap is out of date again, and this is going to slip. At this point I have no expectations of seeing this fixed this year and it’s really disheartening. |
Would also like to understand the new date for this, as the last time I communicated this to my team, it was July, 1.3. |
Wanted to get another status update on this. |
@AnthonyMDev this is becoming a little unmanageable, and I would like to hear about progress or a plan on this issue. I'm looking at a local changeset where I've added 16 lines of GraphQL to a fragment. Then, I see around 50k lines of generated Swift when I run the codegen process as the result of those 16 lines. |
Sorry I missed your comment last week! I totally understand your pain. It's shared by a lot of other people. This has been a lot more complicated than I ever expected it to be, and other things have come up and taken priority. I plan to dive back into this on Monday actually. We've already gotten most of the work done to enable this, so I'm anticipating finishing it quickly. I am truly sorry to everyone who has been waiting on this feature. It IS coming soon. |
Update: I think I've got this working! There are probably a number of edge cases that I haven’t solved yet, but they are difficult to identify. I’m going to spend the next few days writing a ton of unit tests for this and hammering away at any bugs. But the branch for PR #318 should be mostly working now. If anyone wants to try testing against that branch and give me any feedback on bugs, that would be really appreciated! |
I made a very silly oversight yesterday. While the field merging strategies work properly, they are not actually affecting the generated templates yet. I'm going to get that sorted out today and will have another PR up with this working soon! |
I tried it out. It works as expected. Thanks @AnthonyMDev. Excited to have these changes be merged |
Glad to hear it's working for you @tahirmt! I've already found a few edge cases in which it's generating code that doesn't compile (because it expects types to exist that are no longer being generated). So we won't be releasing this publicly until I can get all the bugs worked out. But's it's good progress! |
To be fair I only checked that the code was generated. I haven’t had the time to compile and test fully yet. Edit: Yeah I can see that it doesn't compile. I'd be happy to test changes as you progress. We have a very large schema and hundreds of fragments so a pretty large data set. |
Hey all, I wanted to give an update. We've hit a problem here and would like to tell you how we plan to proceed and request your feedback. The ProblemAfter building this feature out, we've realized that it causes errors with generating selection set initializers that are really not fixable. Generated initializers for the selection set models requires you to compute and generate all of the selection sets that merged fields generated. Because in order to initialize valid objects that you can write into the cache, you need to have all of the merged fields for the underlying This isn't really a limitation of the system as built. It's a fundamental incompatibility. It doesn't make sense to generate objects that give only a partial view of the full object for the operation and then use that object as if it's a fully complete entity. There isn't really a work around for this or some way to make this work in the future. Planned Approach - Experimental Feature w/ validationWe want to get something out for those who are blocked by this issue to use ASAP, so while it's not perfect, we're going to start with shipping the field merging configuration as an This is not ideal, because it's not clear to users why this limitation exists and it feels bad to be able to build a configuration that cannot function properly. We want to improve the structure of the Feedback WantedAre you using |
From our team while we do have |
We do not use |
We do use selection set initializers in tests but could switch to passing JSONObject. fwiw I just retried codegen again on my project with the latest version, wondering if this perf improvement fixed the bottleneck. It did not. So, still waiting on this fix in order to update to v1. |
@pm-dev Yeah, I actually ran a test after that PR against the old example you had sent me. It did help, but it's not enough. When I finish this, you should be able to get it working though. That perf improvement helped me to diagnose where the bottleneck is though in your large schema. I have a few ideas for some other algorithmic improvements that could unblock this. But even if we got it processing, the way your operations/fragments are set up, you would end up with massively oversized generated models. You're going to want to turn a lot of field merging off anyways. |
We use This is currently a big problem for us because we've identified that the actual binary size is bloated due to us moving from the prior version of Apollo to 1.x back at the start of the year. We used to have fields un-merged (forcing access through Fragments) and now that option is gone. We saw a near 20mb increase in our app size that we didn't notice until recently. |
@Mordil Yes, the increase in binary size is the primary reason people want the turn off merging. Unfortunately, there isn't a way to have both Once this feature is shipped, if you want to disable field merging, you're going to need to either initialize using raw JSON directly (unsafely) or use the Test Mocks. I have been considering a future iteration of this where we allow field merging to be enabled for fragments, but disabled for operations, which could allow you to use initializers on your fragments, but it's still unclear to me if this would really allow for the flexibility needed for the "cache manipulation shenanigans" you referred to. |
@AnthonyMDev Allowing it on the fragments but not on the operations would work for us |
Bug report
My team was excited to upgrade to the 1.0 release (thanks for all the hard work), unfortunately we've been unable to get codegen to work. Pre-v1.0 we used
ApolloCLI
fromApolloCodegenLib
to shell out to the js codegen tool and it would take ~5 seconds to complete. After updating to 1.0 and swift-native codegen, we're able to compile and run, but codegen never finishes. It's able to find our schema and .graphql queries, and about half our queries (20 or so) are generated immediately, but then codegen starts to take minutes per query and the memory the process is using grows to multiple GBs.Versions
Please fill in the versions you're currently using:
apollo-ios
SDK version: 1.0Steps to reproduce
I would gladly provide our schema and queries if there was a confidential way to share those files so we could debug this together.
I've tried playing around with the configuration a bit to see if any of these options make a difference, but no luck. Here's an example of what it looks like:
I'll do a little more profiling and can update this thread with anything I find. I've already noticed the script spends a lot of time in the
mergeIn
function. Also the generated files are on average have 10x more lines than previously (query strings have always been multi-line so that's likely not the difference) is that expected?The text was updated successfully, but these errors were encountered: