
feat: Lazy load routers #4129

Open
1 task done
zomars opened this issue Apr 3, 2023 · 12 comments · May be fixed by #5489

@zomars
Contributor

zomars commented Apr 3, 2023

Describe the feature you'd like to request

Basically next/dynamic but for the backend.

Recently our tRPC router has been growing significantly, mostly due to heavy third-party SDKs. The problem is that even if we never call a specific procedure, we still load ALL of the router's dependencies when calling any procedure.

This impacts performance and cold boots significantly.

Describe the solution you'd like to see

Splitting big tRPC routers into lazy loadable chunks could help loading only what is needed when a procedure is called. I would imagine some pseudo code like this:

[image: pseudo code for lazy loading routers]
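The screenshot isn't reproduced here, but a minimal sketch of the idea might look like this. Note that `lazy` is an assumed helper, not an actual tRPC API: it defers loading a sub-router until first use and caches the in-flight promise so the module is evaluated at most once.

```typescript
// Hypothetical sketch only: `lazy` is an assumed helper, not a tRPC API.

type Loader<T> = () => Promise<T>;

function lazy<T>(load: Loader<T>): Loader<T> {
  let cached: Promise<T> | undefined;
  // Reuse the same promise on every call, so the module loads once.
  return () => (cached ??= load());
}

// Stand-in for `() => import('./routers/viewer')` in a real codebase.
let loads = 0;
const viewerRouter = lazy(async () => {
  loads++;
  return { me: () => ({ id: 1 }) };
});

console.log(loads); // 0: nothing is loaded at startup
const router = await viewerRouter(); // first call triggers the load
console.log(router.me().id, loads);
await viewerRouter(); // subsequent calls reuse the cached promise
console.log(loads);
```

In a real implementation the loader would be `() => import('./routers/viewer')` and the router would resolve it transparently when a matching procedure is called.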

Describe alternate solutions

Other alternatives involve moving all the procedure code into a separate file and using imports for it, as done here.

Also, lazy loading only third-party libraries (like Stripe, Google, etc.) seems to help. But it would be a much nicer DX to simply lazy load full routers, supported natively.
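The third-party-library workaround can be sketched like this. A fake module stands in for a heavy SDK such as `import('stripe')`; nothing here is a real Stripe or tRPC API, it only illustrates deferring the import into the procedure body instead of the module's top level.

```typescript
// Workaround sketch: defer a heavy SDK until a procedure actually runs.

let sdkLoads = 0;
const loadFakeStripe = async () => {
  sdkLoads++; // in real code this is the expensive module evaluation
  return { charges: { create: (amount: number) => ({ amount, status: 'ok' }) } };
};

let sdk: ReturnType<typeof loadFakeStripe> | undefined;
const getStripe = () => (sdk ??= loadFakeStripe());

// The import happens inside the handler, so cold starts skip it entirely.
async function createCharge(amount: number) {
  const stripe = await getStripe();
  return stripe.charges.create(amount);
}

console.log(sdkLoads); // 0 at startup
console.log((await createCharge(500)).status, sdkLoads);
```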

Additional information

No response

👨‍👧‍👦 Contributing

  • 🙋‍♂️ Yes, I'd be down to file a PR implementing this feature!

TRP-19

Funding

  • You can sponsor this specific effort via a Polar.sh pledge below
  • We receive the pledge once the issue is completed & verified
@KATT
Member

KATT commented Apr 4, 2023

Cool. I'm optimistic we could support this. As a workaround, anyone who approaches this problem should be able to achieve the same perf gains using the SOA-approach: https://github.com/trpc/trpc/tree/main/examples/soa (or the PR that @juliusmarminge put up for you on calcom/cal.com#8041)

An outline of how this could work internally:

Roughly how it currently works:

  • We store a dictionary with all procedures in a flat record, e.g. { 'post.byId': () => { /* ... */ } }
  • When a query is made to post.byId, we do a simple lookup in this record at (if I remember correctly) router._def.procedures['post.byId']

For this to work, our routers need to be a bit smarter: able to resolve imports on demand and cache already-resolved ones. We also need to split the incoming procedure's name by . to recursively resolve the path.
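The lookup described above could be sketched like this. The `LAZY` marker and `lazy()` helper are illustrative, not tRPC internals: the path is split on `.` and a nested router tree is walked, resolving (and caching) lazy branches along the way.

```typescript
// Sketch: recursive path resolution over a tree with lazy sub-routers.

const LAZY = Symbol('lazy');

function lazy(load: () => Promise<any>) {
  const fn: any = () => (fn.cached ??= load()); // cache the resolved import
  fn[LAZY] = true;
  return fn;
}

async function resolveProcedure(root: any, path: string) {
  let node: any = root;
  for (const segment of path.split('.')) {
    // Resolve a lazy branch before descending into it.
    if (typeof node === 'function' && node[LAZY]) node = await node();
    node = node[segment];
    if (node === undefined) throw new Error(`No procedure at ${path}`);
  }
  return node; // the procedure function itself
}

// Demo tree where the 'post' sub-router is loaded lazily.
let loads = 0;
const appRouter = {
  post: lazy(async () => {
    loads++;
    return { byId: (id: number) => ({ id }) };
  }),
};

const byId = await resolveProcedure(appRouter, 'post.byId');
console.log(byId(42), loads);
await resolveProcedure(appRouter, 'post.byId');
console.log(loads); // still 1: the resolved import was cached
```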

Potential "problematic" code paths

createCaller(), which currently isn't async, and all other server-side paths. We'd need async equivalents that use a proxy object?
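The proxy idea above could be sketched as follows. A Proxy collects the path on property access and only resolves the procedure when it is actually called; this is an illustration of the comment, not tRPC's implementation.

```typescript
// Sketch of an "async createCaller" built on a Proxy.

type AnyFn = (...args: any[]) => any;

function createAsyncCaller(
  resolve: (path: string) => Promise<AnyFn>,
  prefix = '',
): any {
  return new Proxy(function () {}, {
    // Each property access extends the dotted path.
    get(_target, key) {
      const next = prefix ? `${prefix}.${String(key)}` : String(key);
      return createAsyncCaller(resolve, next);
    },
    // Resolving only here is what lets the underlying router load lazily.
    apply(_target, _thisArg, args) {
      return resolve(prefix).then((proc) => proc(...args));
    },
  });
}

// Demo resolver with one known procedure.
const caller = createAsyncCaller(async (path) => {
  if (path === 'post.byId') return (id: number) => ({ id });
  throw new Error(`unknown procedure: ${path}`);
});

console.log(await caller.post.byId(7)); // { id: 7 }
```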

@Nick-Lucas
Contributor

Nick-Lucas commented Apr 7, 2023

We had some discussion (no decisions yet) internally on this, so just dumping some of my thoughts now here for wider visibility. tl;dr is I'm sceptical this is the right solution, but that doesn't undermine the problem existing and needing a solution.

So firstly a few assertions from other bits of information I've read on this issue, please correct me if I'm wrong:

  • The problem is a mixture of overall bundle size and startup evaluation of the bundle; combined, they cause slow cold starts on a serverless lambda
  • Currently the API is a single bundle file including dependencies, weighing around 2mb
  • The vast majority of the bundle size is going to be 3rd party dependencies. The actual application code is probably quite small

So I'm sceptical lazy loading is the true solution for a few reasons:

  • If bundle size is a problem, then code-splitting won't fix it, because a lambda would need to download all the "lazy" modules in the image before startup can occur. 2mb isn't that weighty though so I'm guessing not the primary issue here?
  • If startup evaluation is a problem, then lazy loading will probably have minimal effect, because most routers will consume the expensive dependencies such as ORMs and service SDKs, and so you'll end up loading those anyway most of the time.
  • Also if startup evaluation is a problem, then bundling less into the bundle would probably be wise, and doesn't necessarily need to be done through lazy loading. I distribute a 20mb lambda at work, but with node_modules part of the distribution rather than bundled, and have experienced no startup delays which impact UX
  • If the sheer number and amount of code in service classes from a shared library are the problem, then a DI approach could probably be devised via createContext/middlewares which lazy-loads the service classes on demand. I actually have the opinion that tRPC should offer something in this space, but that's another topic. As things stand createContext is doing a lot of cross-cutting work so really could be part of the problem.
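The DI idea in the last bullet could be sketched like this: the context exposes lazy getters, so a service is only constructed (and its module only needed) when a procedure actually asks for it. All names here are illustrative, not a tRPC API.

```typescript
// Sketch: per-request context with lazily constructed services.

function lazyGetter<T>(init: () => T): () => T {
  let box: { value: T } | undefined;
  // Construct once on first access, then always return the same instance.
  return () => (box ??= { value: init() }).value;
}

let built = 0;
function createContext() {
  return {
    // In real code init() might construct a client from a dynamic import.
    billing: lazyGetter(() => {
      built++;
      return { charge: (amount: number) => ({ amount }) };
    }),
  };
}

const ctx = createContext();
console.log(built); // 0: nothing is built when the request context is created
console.log(ctx.billing().charge(5).amount, built); // built on first use
ctx.billing();
console.log(built); // still 1
```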

The solution to these problems in my view would be to break up the tRPC API into multiple skinnier APIs

...then merge the definitions for the client, and use a special links setup (or API Gateway equivalent) to route appropriately. @juliusmarminge has already POC'd this and it doesn't look that hard for us to provide some little abstractions in the form of special links, and a way to merge/prefix AppRouter definitions.

This would scale much better to truly massive APIs which are challenging the size limits of a lambda, or complexity limits of Typescript, while also reducing bundle size, and allowing you to chop up the API along dependency boundaries - so one API has prisma, another API doesn't, etc.
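The client-side routing described above reduces to picking a backend from the procedure path's top-level prefix. In tRPC this predicate would live in a link such as splitLink; the URLs below are made up for illustration.

```typescript
// Sketch: route a procedure path to one of several skinny APIs by prefix.

const endpoints: Record<string, string> = {
  billing: 'https://billing.example.com/trpc',
  viewer: 'https://viewer.example.com/trpc',
};
const DEFAULT_ENDPOINT = 'https://api.example.com/trpc';

function pickEndpoint(procedurePath: string): string {
  const prefix = procedurePath.split('.')[0]; // top-level router name
  return endpoints[prefix] ?? DEFAULT_ENDPOINT;
}

console.log(pickEndpoint('billing.invoices.list')); // billing backend
console.log(pickEndpoint('post.byId'));             // default backend
```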

I don't think this is a perfect idea; inevitably dependency boundaries will exist throughout an API's tree rather than neatly by top-level router, and deep merging routers would probably be off the table.

In summary

...there's really nothing too offensive about adding lazy loading, and maybe my above reasoning is wrong. Also it's probably only a handful of lines of code to implement, so not a huge maintenance headache in the grand scheme. The biggest challenge for me here is the lack of data on the problem (is it cold-start time? is it actually request-start time thanks to a heavy createContext? is the code bundling strategy hurting the project? what gains will this actually offer vs other solutions?) given I'm looking in from the outside, and the lack of a prototype to compare data against.

@zomars
Contributor Author

zomars commented May 12, 2023

Just a heads up: we actually went through and lazy loaded ALL tRPC procedures, which helped reduce cold starts from 16 seconds to 2-3 seconds.

So this feature would've helped a lot here.

@juliusmarminge
Member

Just a heads up: we actually went through and lazy loaded ALL tRPC procedures, which helped reduce cold starts from 16 seconds to 2-3 seconds.

So this feature would've helped a lot here.

You separated into multiple lambdas too - do you have numbers on each change separately?

@zomars
Contributor Author

zomars commented May 12, 2023

We'll be releasing a case study with more in depth numbers soon.

@KATT
Member

KATT commented May 12, 2023

I'm actually unsure if this would've moved the needle the same way the refactor to the SOA approach did. The culprit is likely not tRPC itself but rather each router's dependencies stacking up.

@Nick-Lucas
Contributor

Nick-Lucas commented May 12, 2023

Similar thoughts here, the long term solution to an API getting too large for serverless is to have multiple functions, not to lazy load bits. Lazy loading would just lengthen the runway so to speak. So if we add more 1st class support for solving these problems it will probably be by providing more primitives for merging several tRPC APIs into one client, that way we encourage and support the best practice solution.

Happy to be proven wrong of course, but I've been following this piece via Julius and gained the understanding that there was essentially a subset of routers with a heavy dependency around (e.g.) i18n, and carving that off made a big difference.

@KATT KATT added the linear label May 25, 2023
@KATT KATT changed the title feat: Lazy load routers [TRP-19] feat: Lazy load routers May 25, 2023
@KATT KATT removed linear labels May 25, 2023
@KATT KATT changed the title [TRP-19] feat: Lazy load routers feat: Lazy load routers May 25, 2023
@MJomaa

MJomaa commented May 30, 2023

From a DX perspective the lazy loading syntax in the initial post is fantastically easy to understand and use.

@MathiasGruber

Agreed, lazy loading would be an awesome feature to have

@FelipeJz

+1 Huge projects will benefit a lot!

@amritk

amritk commented Sep 26, 2023

The solution to these problems in my view would be to break up the tRPC API into multiple skinnier APIs

...then merge the definitions for the client, and use a special links setup (or API Gateway equivalent) to route appropriately.

@Nick-Lucas this is exactly what I'm looking for and would be willing to help implement if needed. Is there a separate ticket tracking this feature?
