Discussion: decoupling async data loading from async graph resolution #171

Open
jmarshall9120 opened this issue May 25, 2022 · 0 comments

TLDR: I would like graphql-core to tell me which "leaf" nodes from my query need to be returned, let me fetch the data through whatever method I deem efficient, and then take care of returning the data. The current asyncio implementation seems to fundamentally not support this.

I've been digging deep into a graphql-core build recently and have stumbled across this interesting problem. If I've missed a key feature of the library here, then please point it out.

To me, the ideal way to use the core library is to:

  1. Use graphql-core to decide what data to retrieve.
  2. Use a separate data loading engine to read the data.
  3. Use graphql-core to return the data.

It's interesting to consider whether this is a graphql-core-3 issue that needs a feature, or an issue with GraphQL itself. The essential catch is this: it's very hard to determine when graphql-core has finished deciding which leaves of the graph need to be fetched. Here's an example to illustrate the point.

##########################################################
## TEST GRAPHE SCHEMA BASED ON ASYNCIO ######################
##########################################################
from graphql import (
    GraphQLBoolean, graphql, GraphQLSchema, GraphQLObjectType, GraphQLField, GraphQLString)
import logging
import asyncio

_logger = logging.getLogger('GrapheneDeferralTest')
_logger.setLevel('DEBUG')

query = """
{
    ioHMIControls {
        EStopHMI,
        JogHMI,
    }
}
"""

async def resolve_EStopHMI(parent, info):
    _id = '_EStopHMI_id'
    info.context['node_ids'][_id] = None
    await info.context['awaitable']
    return info.context['node_ids'][_id]
EStopHMI = GraphQLField(
    GraphQLBoolean,
    resolve=resolve_EStopHMI
)

async def resolve_JogHMI(parent, info):
    _id = '_JogHMI_id'
    info.context['node_ids'][_id] = None
    await info.context['awaitable']
    return info.context['node_ids'][_id]
JogHMI = GraphQLField(
    GraphQLBoolean,
    resolve=resolve_JogHMI
)


def resolve_ioHMIControls(parent, info):
    return ioHMIControls
ioHMIControls = GraphQLObjectType(
    name='ioHMIControls',
    fields={
        'EStopHMI': EStopHMI,
        'JogHMI':JogHMI,
    }
)

def resolve_GlobalVars(parent, info):
    return GlobalVars
GlobalVars = GraphQLObjectType(
    name='GlobalVars',
    fields={
        'ioHMIControls': GraphQLField(ioHMIControls, resolve=resolve_ioHMIControls)
    }
)

async def simulate_fetch_data(_ids):
    print(_ids)
    await asyncio.sleep(1)
    return {k:True for k in _ids.keys()}
    
async def main():
    # Objective:
    #     1. Have graph determine what data I need by partially resolving
    #     2. Pause graph resolution.
    #     3. Collect data into a `data_loader` object.
    #     4. Retrieve data via `data_loader` object.
    #     5. Resume graph resolution with loaded data.

    # 3. collect ids of data fields into a dict
    _ids = {}

    # 2. Pause graph resolution by awaiting a future
    future = asyncio.Future()
    context = {
        'node_ids': _ids,
        'awaitable': future,
    }
    schema = GraphQLSchema(query=GlobalVars)

    # 1. Determine WHAT data to return
    resolve_graph_task = asyncio.create_task(graphql(schema, query, context_value=context))

    # ?
    # There is no way to detect that resolve_graph_task
    # has finished filling the _ids dict with id values.

    # 4. Fetch the data
    fetch_data_task = asyncio.create_task(simulate_fetch_data(_ids))

    # ?
    # This await doesn't work in this order or any order
    # because of the interdependency of both tasks, coupled with
    # the mechanics of asyncio.
    await fetch_data_task

    # 5. Resume graph resolution with retrieved data.
    future.set_result(0)

    # ?
    # Return the data from the graph, as a graph result.
    # The problem is that the data is not there due to the
    # interdependency between the awaited tasks.
    result = await resolve_graph_task
    print(result)

if __name__ == '__main__':
    asyncio.run(main())

Results

{}
ExecutionResult(data={'ioHMIControls': {'EStopHMI': None, 'JogHMI': None}}, errors=None)
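
The empty `{}` seems to come down to plain asyncio scheduling rather than anything specific to the schema: `asyncio.create_task` only schedules a coroutine, and its body does not start until the current task yields to the event loop. Here is a minimal stdlib-only sketch of that ordering, with no graphql-core involved. The names `register_leaf` and `gate` are made up for illustration, and the single `await asyncio.sleep(0)` is enough only for this toy case. I'm assuming a real query needs more loop iterations before every resolver has registered its id, and knowing how many is exactly what I can't work out.

```python
import asyncio

async def register_leaf(ids, gate, key):
    # Stand-in for a resolver: register the id, then wait to be resumed.
    ids[key] = None
    await gate
    return ids[key]

async def main():
    ids = {}
    gate = asyncio.get_running_loop().create_future()
    task = asyncio.create_task(register_leaf(ids, gate, '_EStopHMI_id'))

    before = dict(ids)       # the task has been scheduled but has not run yet
    await asyncio.sleep(0)   # yield once so the task can run up to `await gate`
    after = dict(ids)        # now the id has been registered

    ids['_EStopHMI_id'] = True   # write the "fetched" value back into the shared dict
    gate.set_result(None)        # resume the paused resolver
    value = await task
    return before, after, value

before, after, value = asyncio.run(main())
print(before, after, value)
```

Note that this sketch writes the fetched value back into the shared dict before resuming the resolver. In my example above the return value of `simulate_fetch_data` is never stored in `_ids`, so the resolvers would return `None` even if the ordering problem were solved.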

The example is a little long, but I wanted it to be sufficiently complex. The gist is that there is no way in the current asyncio implementation to determine that all resolvers have been reached.

Looking at the implementation, we could use some advanced event system to manage this, but it would be a bit of work. Another possible solution could be to allow resolvers to return coroutines and put off type checking until those coroutines are themselves resolved. I think this may be the most elegant method.
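
To make the event-based idea a bit more concrete, here is a rough sketch of DataLoader-style batching in plain asyncio. Each resolver asks a loader for its key and awaits a per-key future; the first request in a batch schedules a dispatch with `loop.call_soon`, so the fetch only starts after the tasks that are already queued have registered their keys. `BatchLoader`, `fake_fetch` and `resolve_leaf` are hypothetical names, not part of graphql-core, and `asyncio.gather` stands in for whatever the executor does internally. I have not verified that a single loop iteration is enough to collect all leaves of a deeper query.

```python
import asyncio

class BatchLoader:
    """Hypothetical helper: collect keys for one loop iteration, then fetch them together."""

    def __init__(self, batch_fetch):
        self._batch_fetch = batch_fetch  # async callable: list of keys -> dict of key -> value
        self._pending = {}               # key -> future that will receive that key's value
        self.batches = []                # record of dispatched batches, for inspection

    def load(self, key):
        loop = asyncio.get_running_loop()
        if not self._pending:
            # First key of a new batch: dispatch after the tasks that are
            # already queued have had a chance to register their keys.
            loop.call_soon(self._dispatch)
        if key not in self._pending:
            self._pending[key] = loop.create_future()
        return self._pending[key]

    def _dispatch(self):
        pending, self._pending = self._pending, {}
        self.batches.append(sorted(pending))
        asyncio.get_running_loop().create_task(self._fill(pending))

    async def _fill(self, pending):
        try:
            values = await self._batch_fetch(list(pending))
        except Exception as exc:
            # Propagate a failed fetch to every resolver waiting on this batch.
            for fut in pending.values():
                if not fut.done():
                    fut.set_exception(exc)
            return
        for key, fut in pending.items():
            if not fut.done():
                fut.set_result(values.get(key))

async def fake_fetch(keys):
    # Stand-in for the real data engine.
    await asyncio.sleep(0.01)
    return {k: True for k in keys}

async def resolve_leaf(loader, key):
    # Stand-in for a leaf resolver: it only awaits its own key.
    return await loader.load(key)

async def demo():
    loader = BatchLoader(fake_fetch)
    values = await asyncio.gather(
        resolve_leaf(loader, '_EStopHMI_id'),
        resolve_leaf(loader, '_JogHMI_id'),
    )
    return values, loader.batches

values, batches = asyncio.run(demo())
print(values, batches)
```

The design choice here is that no resolver ever needs to know that "all resolvers have been reached". Each one waits only on its own future, and the loader decides when a batch is complete.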

Thoughts?
