Better terms for supergraph, subgraph and public API needed. #11

Open
michaelstaib opened this issue Jan 23, 2024 · 12 comments · Fixed by #12

Comments

@michaelstaib
Member

The term "graph" is not really used in the GraphQL core spec. When we talk about composition within the boundaries of the spec, we are using schema documents to compose a single annotated schema. I discussed this a bit with @benjie and we both feel that supergraph, subgraph and public API are not good terms.

"SubSchema" => a part of the "SuperSchema"
"SuperSchema" => annotated schema aka executor config
"PublicSchema" => the schema we expose to the end user

These are also not good terms and we need to reflect a bit on better terms.

@benjie
Member

benjie commented Jan 23, 2024

How about "source schemas" for the schemas that are combined together (i.e. "SubSchema"), "composition schema" or "internal composed schema" for the schema that details how everything composes (the "SuperSchema"), and "final schema" or "resulting schema" or "exposed schema" or "external composed schema" for the schema that consumers of the API would see? We should avoid the term "public" IMO.

@michaelstaib
Member Author

I like source schema as a term for what is currently called subgraph.

@martijnwalraven

martijnwalraven commented Feb 1, 2024

As we discussed in the last meeting, I think we should reopen this issue. In my mind, the term source schema is too generic and not descriptive enough for our purposes.

What we've tried to convey with the term subgraph schema at Apollo is that these schemas are designed as part of a larger graph (which is also why we call out 'composable' as a design principle). In our experience, the notion of subgraphs composing into a supergraph makes intuitive sense to people and helps clarify the architecture.

I see how the term graph doesn't occur explicitly in the GraphQL specification, but it's called GraphQL for a reason :) A schema does describe a data graph, and I don't think we should shy away from using that term. Composition isn't just a syntactic process that refers to combining schema documents, but a way of building a larger graph out of component parts.

@benjie benjie reopened this Feb 1, 2024
@dsandip

dsandip commented Feb 8, 2024

Disclaimer/context: I work extensively with GraphQL practitioners in Hasura’s community/user base (and also speak to a lot of GraphQL users who don’t use Hasura).

A lot of the users in the GraphQL ecosystem will be in agreement with @martijnwalraven on this - the terms subgraph and supergraph are very well and implicitly understood by developers (without any vendor-specific connotations). I believe these terms are very helpful when talking about development, ownership, CI/CD, governance, etc. of the “entities”/schemas being composed into a single unified graph.

I would like to propose that we stick to subgraph and supergraph as the choice of terms.

PS: Still reading up on public API or PublicSchema, and will shortly get back on that as well.

@benjie
Member

benjie commented Feb 9, 2024

Thank you for sharing your thoughts on these topics @martijnwalraven and @dsandip; your wealth of experience with these terms is very valuable.

I think it's worth noting that the colloquial term for something and the strictly specified term do not have to match, but the specified terms must be very well defined and unambiguous.

> being composed into a single unified graph

The GraphQL specification itself only uses the standalone term "graph" twice:

> user represents one of many users in a graph of data, referred to by a unique identifier

and

> The graph of fragment spreads must not form any cycles including spreading itself.

The latter of which is referring to graph in the mathematical sense, and the former is talking about your business entities, not the actual GraphQL schema itself. As Martijn said: "A schema does describe a data graph", or more specifically a schema provides an API to describe, traverse, access data from and mutate a data graph.

In GraphQL terms, I'd argue that a "single unified graph" would refer not to the schema, but to the business entities that the schema may represent - the data graph. A schema you build to access this data graph might be referred to as a "single unified schema".

Expanding on this, the term "subgraph" and "supergraph" would appear to refer again to the business entities - the data graph; whereas terms like "subschema" and "superschema" would refer to the schemas themselves.

Then we have the issue that "subschema" sounds like it's "less than" a schema, and "superschema" sounds like it's "more than" a schema. The terms are also confusing when you think about subclasses and superclasses - a subclass inherits from a superclass, but in GraphQL a superschema inherits from a subschema? Of course the real analogy is to set theory: the subschemas are a subset of the superschema, and the superschema is a union of all the subschemas. Even then the "superschema" doesn't add things that weren't present in the subschemas, so is it truly a superset? 🤔

What we're talking about really are a number of smaller GraphQL schemas that are composed together to form one larger GraphQL schema that represents them all. This composite schema then sources data from these smaller schemas to honour requests.
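
To make the idea concrete, here is a minimal sketch of composing smaller schemas into one larger schema. It is only an illustration under strong assumptions: a schema is modelled as a plain mapping of type name to fields, and the names `compose`, `origins`, `accounts` and `reviews` are hypothetical rather than taken from any specification or library. Real composition operates on GraphQL schema documents and has many more rules.

```python
from collections import defaultdict

# Hypothetical, heavily simplified model: a "schema" is a mapping of
# type name -> {field name: field type}.
Schema = dict[str, dict[str, str]]


def compose(sources: dict[str, Schema]) -> tuple[Schema, dict[tuple[str, str], list[str]]]:
    """Merge source schemas into one composite schema.

    Returns the composite schema plus a lookup from (type, field) to the
    names of the source schemas that can supply that field.
    """
    composite: Schema = defaultdict(dict)
    origins: dict[tuple[str, str], list[str]] = defaultdict(list)
    for source_name, schema in sources.items():
        for type_name, fields in schema.items():
            for field_name, field_type in fields.items():
                existing = composite[type_name].get(field_name)
                # Assumption: a field defined in several sources must agree on its type.
                if existing is not None and existing != field_type:
                    raise ValueError(
                        f"{type_name}.{field_name}: {existing} conflicts with {field_type}"
                    )
                composite[type_name][field_name] = field_type
                origins[(type_name, field_name)].append(source_name)
    return dict(composite), dict(origins)


sources = {
    "accounts": {"User": {"id": "ID!", "name": "String"}},
    "reviews": {
        "User": {"id": "ID!", "reviews": "[Review]"},
        "Review": {"body": "String"},
    },
}
composite, origins = compose(sources)
```

In this sketch the composite schema contains only what the smaller schemas already declared, and `origins` records which smaller schema each field can be sourced from.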

> the term source schema is too generic and not descriptive enough for our purposes.

This is indeed valid criticism. I can definitely imagine that the term "source schema" could be used in a lot of contexts (e.g. proxying, schema stitching, client usage, etc.). We could make it more specific by prefixing it with the use case, "composition source schema", or come up with a better term. I'm not aware of the term being in popular usage in other scenarios though?

TL;DR: My primary concern is that the term "graph" in specification terms refers to the underlying data graph, not the schemas that you might build to enable access to this data graph. When we refer to schemas in the specification, we should call them "schemas".

@martijnwalraven

> TL;DR: My primary concern is that the term "graph" in specification terms refers to the underlying data graph, not the schemas that you might build to enable access to this data graph. When we refer to schemas in the specification, we should call them "schemas".

That's a fair distinction. We use the terms subgraph schemas and supergraph schemas when explicitly referring to the GraphQL schemas instead of the data graphs themselves. That also avoids terms like subschema and superschema, which as you point out are misleading because it doesn't really make sense to think of these as 'less than' or 'more than' a schema. They are both complete schemas that describe particular graphs, but subgraph and supergraph refer to their roles within a federated architecture.

@dsandip

dsandip commented Feb 14, 2024

Great points @benjie @martijnwalraven! I think subgraph schema and supergraph schema should do it.

@smyrick

smyrick commented Feb 14, 2024

I think the terms subgraph and supergraph also work well as prefixes to describe the other parts we will probably need to mention in the spec but not clearly define, like the runtime. We have agreed so far that this spec should focus on composition and not runtime implementations, like query planning.

However, we can still refer to those components as subgraph server and supergraph server (router/gateway) or subgraph resolver, and this clearly indicates which part of the architecture you are referring to.

e.g.

The supergraph server validates requests from clients using the supergraph schema. It then should send the appropriate request to a subgraph server which handles that part of the subgraph schema.
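
A minimal sketch of that flow, assuming a hypothetical lookup table `FIELD_OWNERS` that composition would have produced. The table, the `route` function and the service names are illustrative assumptions, not part of any specification or existing gateway.

```python
# Hypothetical lookup produced by composition:
# (type name, field name) -> name of the subgraph service that resolves it.
FIELD_OWNERS = {
    ("Query", "me"): "accounts",
    ("Query", "topProducts"): "products",
}


def route(type_name: str, field_name: str) -> str:
    """Check a requested field against the supergraph schema, then pick a subgraph service."""
    owner = FIELD_OWNERS.get((type_name, field_name))
    if owner is None:
        # The field is not part of the supergraph schema, so the request is rejected.
        raise ValueError(f"{type_name}.{field_name} is not in the supergraph schema")
    return owner
```

Calling `route("Query", "me")` returns `"accounts"` in this sketch, while a field missing from the table is rejected before any subgraph is contacted.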

@benjie
Member

benjie commented Feb 15, 2024

(I think the term "server" should be avoided, "service" is preferable as in the GraphQL spec. My feeling is we don't need to get into transport mechanics in this spec.)

@smyrick

smyrick commented Feb 15, 2024

That makes sense. Using "service" instead still draws the distinction between the service and the schema.

@smyrick

smyrick commented Feb 15, 2024

We brainstormed in the meeting on Feb 15 and came up with some more terms:

  • Gateway schema
  • Super schema
  • Public schema
  • Executable schema
  • Downstream service schema
  • Source schemas
  • Subgraph schemas
  • Composed and merged schema
  • Stitched schemas
  • Woven schemas

@JohnStarich

JohnStarich commented Feb 15, 2024

Some ideas / takes:

  • Service goes from meaning "one" to "many" compared to the GraphQL spec, so could use "service schema"
  • Technically the executable schema idea is provided as one possible implementation, albeit a decent one, so I don't mind the name being slightly more verbose. "Executable gateway schema", for example.
  • The gateway schema could inherit its name from the discussion in "What term should we use for a federated gateway / router in this specification?" (#17)

6 participants