Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Thoughts: ORM, DDD conflicting? #22

Open
akomm opened this issue Jan 29, 2018 · 3 comments
Open

Thoughts: ORM, DDD conflicting? #22

akomm opened this issue Jan 29, 2018 · 3 comments

Comments

@akomm
Copy link
Member

akomm commented Jan 29, 2018

Prelude:

When I found the dataloader concept by facebook via this package, it was the last piece that missed in the whole concept behind graphql, to make it "composable" without 1+n overkill. Before, I had to use ResolveInfo (overblog/graphql) to look ahead the selection further, than the resolver for the specific type needed to be aware of. This also caused the (doctrine) hydration due to multi-join blow-up of the result set to become extremely time consuming.

Against an graphql API of an existing application with some sample of about ~450 datasets in different related t ables, I made a super-query against everything. Then I started to adopt dataloader everywhere.
I started at about 1700 queries. Applying dataloading on all entity resolvers reduced it to 1500. What I have noticed is, that in most cases it almost does not help in 1:1 situations, because doctrine ORM is smart enough with lazy loading (in this case proxy) to keep the object deduplicated and so once an entity is loaded in a single relation, same entity is loaded everywhere. So the 200 loaded are "different" 1:1 entities of different proxy objects with different identities. The dataloading actually would have saved more, when doctrine not already did most of the job.

What did the most troubles were the associations between entities {1,*}:n.
When using ORM, you get the lazy collection, which has no means to efficiently load Ids and even if so, you would still query all the ids and kind of have still 1+n. Without or, you still would have to query the ids for each association.

So I came with the idea to make a loader for association, which keys by parent id. It then queries for all the requested parents its children, group them by parent/order correctly (which is kind of tricky, but works). Then I "pipe" the ids into the EntityLoader. And using class metadata from doctrine the forwarding can be automated (though the association loader still caches only the id list).

Works, now I have 6 queries for ALL apps data (from previously 1700) with lots of :n relation, no matter how much depth I have in the graphql query. And the funny thing is: no huge joins, only implicit on n:m tables to batch-query association ids (so at most depth 2).

Actual problems (or not?):

Now come things like computed properties, which I usually implemented on domain objects (for example said entities). The computed data might depend on owned & associated data of an entity, which it needs to access to compute the result. The problem is, that it obviously bypasses dataloading, when I access those (lazy-)collections inside the entity to compute some result. It does not matter whether all the associated entities are managed by entity manager at this point, because the lazy collection did not yet fetch the associations. It would still trigger a query for each entity you want to compute a result for.

Before dataloader I looked into ResolveInfo inside the resolver for the computed property and join-fetched data, which was required for the computation. A bad thing, because resolvers had to know what is required inside the entity to compute the result...

So what? Well I could make dataloader for said computed values, but that would require me do ditch much of the model from entities. So far I did not find any other solution for it. I'm not quite sure if that is "normal" or I just not see "it".

I'm at a point of thinking to ditch doctrine in this case and have my dataloading API become the domain for data access, with flexible implementation behind it (could be simple in-memory array source for testing, or backaed by database). Because ORM looses any point as it looks to me, when I can not implement my model.

@mcg-web
Copy link
Member

mcg-web commented Jan 30, 2018

Hi, thanks for this long feedback take me some time to read it but now I'm done 😆. I think using a DBAL or even PDO is much more powerful with GraphQL since the lib knows how to deal with array and object at the same time. That's just a general concern for performance anyway even with dataloader and ORM's you can do a powerfull work too (just need some more efforts). The key is chaining promises or/and nesting dataloaders to do your treatments, you must do the less directly in your resolver.

Promise chaining example:

$userLoader->load(1)
  ->then(function ($user) { return "User $user->name"; });

Nesting dataloaders :

$userLoader->load(2)
  ->then(function ($user) use ($userLoader) { 
    return $userLoader->load($user->invitedByID); 
  });

@akomm
Copy link
Member Author

akomm commented Jan 31, 2018

@mcg-web either I misunderstand you or you did misunderstand me :)

This is all easy.

But taking your example, you access the invitedBy not by $user->getInvitedBy() anymore. And that is a trivial example. What if you had $user->getSomeDataComputedFromAssociatedData() ?

You could do this with dataloading too, by creating a dataloader, which uses other dataloaders to get the required association(s), then once those are all "resolved", computed the result - all that using promise chain (then, or Promise::all, etc).

The point I was talking about is:

The more I adopt dataloader, the less is left of my previous model on the entities.

So my issue is not how to use them. I'm quite common to functional programing as frontend developer (js, rustlang), observables (rxjs), promises, etc. so it was not hard to see the how to combine and chain.

It is rather I'm not sure it is just the way it has to be than, or I'm just not seeing some opportunity to maintain the model (because I do not want to duplicate logic) in entities, while using dataloaders.

@mcg-web
Copy link
Member

mcg-web commented Feb 25, 2018

I don't understand the issue, while using dataloader if you don't need ORM anymore, why keeping it? Dataloader is a way of optimizing your request to DB while using graphql by example, if ORM does already the work so you don't need dataloader! It's only you that can know what is the best to keep?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants