Schemas importing other schemas #243
Comments
I would personally prefer not to introduce a dependence on URLs and to embrace IPLD-native links instead, e.g. the syntax for bringing in defs from another schema may look like a regular type def / alias
|
is really just But yeah, just CIDs would be good, no need for URLs, especially if we get CIDv2 with its additional funky possibilities for use here. There is a bit of weirdness though - the link should point to the DMT form of a schema, but you're going to be mostly representing it inside a DSL, presumably to be compiled into a DMT. I wonder then what a workflow would look like for a set of linked schemas? Perhaps this is a non-problem where you're building on an existing schema that's already "shipped", but perhaps you have a whole set of complex schemas you want to join together — like the ones Vulcanize did for the ETH chain structure @ https://ipld.io/specs/codecs/dag-eth/. What would be the process to get from the DSL to the DMT for these? Would we need a placeholder prior to some kind of "compile" step, or just accept the fact that you need to jump through some hoops to compile the schemas yourself in dependent order? |
Cool, I agree that a new keyword would be useful for encapsulating this new functionality. Also, agreed about not involving URLs in this. 😁 It does feel like being able to use We could have the DMT of the dependency imported alongside, and the This would result in the DMT/DSL specs diverging a bit more, but I don't think it'd be the end of the world given we already diverge for stuff like comments. |
In that case we'd need a rule like "DMT refers to imports strictly by CID, but DSL can refer by url-ish as long as the toolchain used to create the DMT supports it". Thinking through some use-cases here, one obvious one is Bindnode on the Go side, where we have a higher-level abstraction that allows you to register a type using a schema DSL and it'll do the compiling for you, e.g. https://github.com/filecoin-project/go-fil-markets/blob/727a2b14a263ebfaead1a1bacd56e1149234d549/retrievalmarket/types.go#L543 We could also come up with a "SchemaLoader" interface that you could pass in as one of the options to this thing, which would allow you to load a schema either by CID or by URL/path and have that fed into the compile process. So would it be better for the DSL to use CIDs or actual URLs, including file URLs? Or only CIDs or file paths? It's getting pretty complicated if we go with URLs, but I can imagine that being quite handy. With that example above, we're already pushing to have some of these components spread across repos, specifically in filecoin-project/go-state-types#49, where there are generic Filecoin types which we'd be pulling in to various places where we use Bindnode. So you might end up with schemas that want to pull in pieces from other repos. Of course, having a fixed CID for that would be nice, but where do we record the CID for the version we want? Perhaps it'd be more useful to be able to refer to a github raw url for the version (commit) you want and have the toolchain compile and work out that CID for you. Of course, then there are questions of what you're doing with these multiple things when you do compile them; at least with a DSL->DMT transformation we just hand you a single |
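To make the "SchemaLoader" idea above a bit more concrete, here is a rough sketch in Python. Everything in it is hypothetical: the `SchemaLoader`, `InMemoryLoader`, `ChainLoader` and `compile_with_imports` names, the `load` signature, and the placeholder reference strings are made up for illustration and are not part of Bindnode or any existing IPLD library. The "compile" step is only a stand-in that concatenates sources.

```python
from typing import Dict, List, Optional, Protocol


class SchemaLoader(Protocol):
    """Hypothetical interface: return schema DSL text for a reference, or None."""

    def load(self, ref: str) -> Optional[str]: ...


class InMemoryLoader:
    """Toy loader backed by a dict; a real one might read files or fetch blocks."""

    def __init__(self, schemas: Dict[str, str]) -> None:
        self._schemas = schemas

    def load(self, ref: str) -> Optional[str]:
        return self._schemas.get(ref)


class ChainLoader:
    """Tries each loader in order, so CID and path lookups can coexist."""

    def __init__(self, *loaders: SchemaLoader) -> None:
        self._loaders = loaders

    def load(self, ref: str) -> Optional[str]:
        for loader in self._loaders:
            text = loader.load(ref)
            if text is not None:
                return text
        return None


def compile_with_imports(dsl: str, import_refs: List[str], loader: SchemaLoader) -> str:
    """Stand-in for a compile step: resolve every import, then join the sources."""
    parts = []
    for ref in import_refs:
        text = loader.load(ref)
        if text is None:
            raise LookupError(f"cannot resolve schema import: {ref}")
        parts.append(text)
    parts.append(dsl)
    return "\n".join(parts)
```

A caller could chain one loader keyed by CID-like strings with another keyed by file paths and hand the combined loader to the compile step; whether the DSL should accept both kinds of reference is exactly the open question in this comment.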
2022-10-04 triage conversation: @RangerMauve will turn this into an exploration report. |
@RangerMauve I think it would be a great idea to also evaluate the idea from the Unison language, which uses hash-based, as opposed to named, referencing, which in turn eliminates the need for imports and seems to be a natural fit for hash-linked data. @jaredly has also written a whole new language which, among other things, explores this idea in TypeScript. This is something I have always wished I had time to explore but never got around to, so I thought I'd surface it here. Conceptually the idea is pretty simple: you ignore type names, replace them with the CID of the definition, and swap all references by that name for the CID. Locally you could build up a db of name -> CID mappings, e.g. from local source (which could be included through git submodules, package managers, or whatever; the tool just needs to know what to index). If I'm not mistaken, both Unison and jerd use field order to determine naming, which is probably not great from the schema extensibility perspective (e.g. if I add a new field at the top, that would make the new type incompatible with the old one), but then again maybe that is OK and users just need a heads up, or maybe it makes more sense to retain field names here |
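A small Python sketch of the hash-based referencing idea described above, under loud assumptions: a SHA-256 digest over sorted-key JSON stands in for a real CID, the type definition shape (`{"fields": {...}}`) is invented for the example, and field names are retained rather than field order, which is one of the two options floated here and not how Unison is said to work. The `definition_id` and `index_types` names are hypothetical.

```python
import hashlib
import json


def definition_id(definition: dict) -> str:
    """Hash a type definition; sha256 over sorted-key JSON stands in for a CID."""
    encoded = json.dumps(definition, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(encoded.encode("utf-8")).hexdigest()


def index_types(types: dict) -> dict:
    """Build a name -> id mapping, replacing references by name with ids.

    `types` maps a type name to {"fields": {field_name: type_name}}.
    Names not defined in `types` are treated as builtins and left as-is.
    Assumes the type graph has no cycles.
    """
    ids = {}

    def resolve(name: str) -> str:
        if name not in types:
            return name  # builtin such as "String" or "Int"
        if name not in ids:
            fields = types[name]["fields"]
            # Field names are kept; only the referenced type names are swapped.
            hashed = {"fields": {f: resolve(t) for f, t in fields.items()}}
            ids[name] = definition_id(hashed)
        return ids[name]

    for name in types:
        resolve(name)
    return ids
```

Because the type's own name never enters the hash and keys are sorted before hashing, two structurally identical definitions get the same id regardless of what they are called or how their fields are ordered, while adding a field produces a different id.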
fwiw I'm rewriting jerd to use field names instead of field order (I determined that names have important semantic value that I don't want to discard) |
Hmm. Regarding unison and jerd, these are whole new schema languages and it feels like a separate set of constraints to what IPLD Schemas already do. IPLD Schemas have the DMT format and a way to hash data in that format already so it seems more straightforward to link to that than reevaluate how schemas should work. I'll defs read up on those for inspo though. Gonna work on an exploration report tonight to dream up the UX and talk about tradeoffs / caveats. |
One thing that might be useful is to avoid This would make some of the preprocessor steps simpler, in that one could check whether the types in the imports are being used in the current schema before bothering to check whether they exist in the remote schema, and would have a quick way to verify that they exist in the remote schema. Otherwise we might run into weird cases where types cause conflicts over time. |
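The ordering of checks described above could look roughly like the following Python sketch. It assumes imports list their type names explicitly, and the `check_imports` name, the dict-based data shapes, and the message strings are all invented for illustration; none of this comes from an existing IPLD preprocessor.

```python
def check_imports(local_types: dict, imports: dict, remote_schemas: dict) -> list:
    """Return a list of problems found with explicitly named imports.

    local_types:    type name -> list of type names it references
    imports:        schema ref -> list of type names imported from it
    remote_schemas: schema ref -> set of type names that schema defines
    """
    referenced = {ref for refs in local_types.values() for ref in refs}
    problems = []
    for schema_ref, names in imports.items():
        for name in names:
            if name in local_types:
                problems.append(f"{name}: conflicts with a local type")
            elif name not in referenced:
                # Cheap local check first: no need to load the remote schema.
                problems.append(f"{name}: imported but never used")
            elif name not in remote_schemas.get(schema_ref, set()):
                problems.append(f"{name}: not defined in {schema_ref}")
    return problems
```

The local checks run before the remote lookup, so an unused or conflicting import is reported without ever touching the remote schema.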
Thinking through the mechanics of this, it might be good to namespace types such that they have to be explicitly brought in to scope and not just imported by name and copied. e.g. type Internally, when building a DMT to represent all of this, this could be represented with prepended CIDs, so as soon as you refer to some other Schema, that whole schema (or perhaps just the tree we care about, that probably wouldn't be too hard) is brought in and the types become One might opt to import all of the types from a schema, but you have to do so explicitly; and we could defer any |
Looking at schemas and integrating dynamic loading of schemas, some stuff has stuck out to me.
I think these things could be addressed by adding the ability to import schema types from another schema using a CID.
The syntax could look something like the ESM imports API:
The second line could show how we can rename a type when importing it to avoid conflicts, or import just a subset of types.
I think this will be really useful for future integration with other data ecosystems like schema.org
I'm down to work on speccing this out and adding something to the JS side of things.
This does mean that schema validators would end up depending on IPLD URLs and LinkSystems. 😅
cc @rvagg @warpfork @Gozala What do y'all think?