
[FEATURE]: Empowering Dgraph with Multilingual Capabilities Through AST for Mutations #8887

Open
MichelDiz opened this issue Jun 26, 2023 · 3 comments
Labels
dgraph Issue or PR created by an internal Dgraph contributor. kind/feature Something completely new we should consider.

Comments

@MichelDiz
Contributor

Use case

By introducing Abstract Syntax Trees (AST) for mutations, Dgraph can evolve into a truly multilingual database engine, effortlessly catering to a wide array of data languages and formats such as CSV, JSON-LD, RDF variants, and more. Below are compelling reasons for adopting AST for mutations in Dgraph:

Language Agnosticism: By using AST, Dgraph can become language-agnostic. Since AST is a hierarchical and abstract representation of source code, it can represent various data formats without being tied to any particular language syntax. This ensures that Dgraph can accept and interpret data in diverse formats seamlessly.

Ease of Parser Creation: With an AST-based approach, creating parsers for different languages becomes much simpler. Each parser only has to convert its input format into an AST, which Dgraph can then process uniformly. This enhances flexibility and makes Dgraph adaptable to any data writing language.

Enhanced Interoperability: In a multi-platform environment, systems often use different data formats. By supporting multiple languages through AST, Dgraph can easily integrate and communicate with different systems, regardless of the data format they employ. This interoperability is essential for building cohesive and robust systems.

Future-Proofing: As new data representation formats emerge, the ability to easily create parsers for them ensures that Dgraph remains relevant and adaptable. This future-proofing is invaluable in a fast-paced technological landscape where new standards can quickly become the norm.

Wider Adoption and Collaboration: By accommodating various data formats, Dgraph becomes an appealing choice for a broader audience. This wider adoption can foster a collaborative ecosystem, with contributions from a diverse community, enhancing the overall quality and capabilities of Dgraph.

Reduced Complexity for End Users: End users often work with data in formats they are most comfortable with. By supporting multiple languages, Dgraph allows users to interact with the database in their preferred format, reducing the learning curve and complexity, and enhancing user satisfaction.

But the major advantage of having an AST for mutations is the ability to make Dgraph multilingual with little effort. Both the community and the Dgraph team could create parsers for any type of mutation.
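
To make the idea concrete, here is a minimal sketch of two independent parsers (CSV and JSON) that emit the same AST. The `SetEdge` node and both parser functions are hypothetical names invented for this illustration; Dgraph's real internal representation may look quite different.

```python
import csv
import io
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class SetEdge:
    """Hypothetical AST node: set one predicate on one subject."""
    subject: str
    predicate: str
    value: object


def parse_csv(text, id_column="id"):
    """Turn each CSV row into SetEdge nodes, one per non-id column."""
    edges = []
    for row in csv.DictReader(io.StringIO(text)):
        subject = "_:" + row[id_column]  # blank-node style label
        for column, value in row.items():
            if column != id_column:
                edges.append(SetEdge(subject, column, value))
    return edges


def parse_json(text, id_key="id"):
    """Turn a JSON array of flat objects into the same SetEdge nodes."""
    edges = []
    for obj in json.loads(text):
        subject = "_:" + str(obj[id_key])
        for key, value in obj.items():
            if key != id_key:
                # Values are stringified here only so both parsers agree.
                edges.append(SetEdge(subject, key, str(value)))
    return edges


csv_ast = parse_csv("id,name,age\n1,John,20\n")
json_ast = parse_json('[{"id": 1, "name": "John", "age": 20}]')
```

Both inputs end up as the same list of nodes, so whatever consumes the AST does not need to know which format the data arrived in.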

For example, SQL:

INSERT INTO estudents (id, name, age)
VALUES (1, 'John', 20);
UPDATE estudents
SET age = 21
WHERE name = 'John';

We could have a parser in the Beta group that converts it into a Dgraph native AST.
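
As a rough sketch of what such a converter could look like for the INSERT statement above: the regex below handles only this one statement shape and is not a real SQL parser (it would break on commas inside string literals, for instance). The `SetEdge` node and the `table.column` predicate naming are assumptions made for the example.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class SetEdge:
    """Hypothetical AST node: set one predicate on one subject."""
    subject: str
    predicate: str
    value: object


# Matches only: INSERT INTO table (col, ...) VALUES (val, ...);
INSERT_RE = re.compile(
    r"INSERT\s+INTO\s+(\w+)\s*\(([^)]*)\)\s*VALUES\s*\(([^)]*)\)\s*;?",
    re.IGNORECASE,
)


def parse_literal(token):
    """Convert a quoted string or an integer literal to a Python value."""
    token = token.strip()
    if token.startswith("'") and token.endswith("'"):
        return token[1:-1]
    return int(token)


def parse_insert(sql, id_column="id"):
    """Convert one simple INSERT statement into a list of SetEdge nodes."""
    match = INSERT_RE.fullmatch(sql.strip())
    if match is None:
        raise ValueError("unsupported statement")
    table, columns, values = match.groups()
    names = [c.strip() for c in columns.split(",")]
    literals = [parse_literal(v) for v in values.split(",")]
    if len(names) != len(literals):
        raise ValueError("column/value count mismatch")
    row = dict(zip(names, literals))
    subject = f"_:{table}.{row[id_column]}"
    return [
        SetEdge(subject, f"{table}.{name}", value)
        for name, value in row.items()
        if name != id_column
    ]


ast = parse_insert("INSERT INTO estudents (id, name, age)\nVALUES (1, 'John', 20);")
```

The UPDATE statement is left out on purpose: it has a WHERE clause, so it would presumably need a query part in the AST as well as a mutation part.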

Other examples include

JSON Patch:

[
  { "op": "add", "path": "/estudent/1/name", "value": "Maria" }
]
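
A JSON Patch document could be mapped in a similar way. In this sketch the path is assumed to have the shape `/type/id/field`, which is a convention chosen for the example and not something JSON Patch itself requires. Only `add`, `replace`, and `remove` are handled, JSON Pointer escapes (`~0`, `~1`) are ignored, and the `Edge` node is hypothetical.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class Edge:
    """Hypothetical AST node for a single set or delete operation."""
    op: str  # "set" or "delete"
    subject: str
    predicate: str
    value: object = None


def parse_json_patch(text):
    """Convert a JSON Patch array into a list of Edge nodes."""
    edges = []
    for operation in json.loads(text):
        parts = operation["path"].strip("/").split("/")
        if len(parts) != 3:
            raise ValueError("expected a path like /type/id/field")
        node_type, node_id, field = parts
        subject = f"_:{node_type}.{node_id}"
        predicate = f"{node_type}.{field}"
        if operation["op"] in ("add", "replace"):
            edges.append(Edge("set", subject, predicate, operation["value"]))
        elif operation["op"] == "remove":
            edges.append(Edge("delete", subject, predicate))
        else:
            raise ValueError("unsupported op: " + operation["op"])
    return edges


patch = """[
  {"op": "add", "path": "/estudent/1/name", "value": "Maria"},
  {"op": "remove", "path": "/estudent/1/age"}
]"""
ast = parse_json_patch(patch)
```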

XML:

<estudent>
  <name>Maria</name>
  <age>22</age>
</estudent>
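
For the XML case, a sketch using Python's standard `xml.etree.ElementTree` might look like the following. It assumes a flat element whose children are all scalar fields. The `SetEdge` node, the `_:new` blank-node label, and the `element.child` predicate naming are all made up for the example.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass


@dataclass(frozen=True)
class SetEdge:
    """Hypothetical AST node: set one predicate on one subject."""
    subject: str
    predicate: str
    value: object


def parse_xml(text, blank_node="_:new"):
    """Convert one flat XML element into SetEdge nodes, one per child."""
    root = ET.fromstring(text)
    return [
        SetEdge(blank_node, f"{root.tag}.{child.tag}", (child.text or "").strip())
        for child in root
    ]


ast = parse_xml("<estudent>\n  <name>Maria</name>\n  <age>22</age>\n</estudent>")
```

Note that every value comes out as a string here. A real parser would need type information from somewhere, for example from the Dgraph schema.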

Apache Avro:

{
   "type": "record",
   "name": "Estudent",
   "fields": [
      {"name": "id", "type": "int"},
      {"name": "name", "type": "string"},
      {"name": "age", "type": ["int", "null"]}
   ]
}

In conclusion, incorporating AST for mutations in Dgraph is a strategic move that paves the way for a more versatile, performant, and future-proof database engine. By breaking the barriers of language limitations, Dgraph can become a universal solution catering to diverse data formats, and consequently, a larger, more diverse user base.

Links to Discuss, RFC or previous Issues and PRs

No response

Links to examples and research

No response

Current state

No response

Solution proposal

No response

Additional Information

No response

@MichelDiz MichelDiz added kind/feature Something completely new we should consider. dgraph Issue or PR created by an internal Dgraph contributor. labels Jun 26, 2023
@benwoodward

Yes! Anything that adds flexibility is a good thing IMO. The main issue I can see with this is that if you're not careful you would explode the surface area of code that needs to be maintained—assuming you would be maintaining the various AST adapters, e.g. REST → AST, SQL → AST etc.

Would you be building adapters too, or just providing an AST endpoint?

@MichelDiz
Contributor Author

@benwoodward The initial idea is simply to allow ASTs (for either queries or mutations) to be sent to Dgraph. We can already edit the dql package (in the dgraph repo, under /dql) to create new DQL syntaxes, which is how I explore the DQL parser. But the other day I realized that exposing the AST would be enough to make anything possible. Any parser could be made, with no need to rely on the core code. It would work as an add-on/plugin, and anyone interested in creating an "unofficial parser" would be able to.

Even if no one in the community does, this would greatly simplify creating new parsers for other languages, or even testing new syntaxes for DQL.
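
A very rough sketch of the add-on idea: parsers live outside the core, register themselves under a format name, and all produce the same serialized AST. Everything here is hypothetical, including the registry, the toy `kv` format, and the JSON payload shape. No such endpoint or wire format exists in Dgraph today.

```python
import json
from dataclasses import asdict, dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class SetEdge:
    """Hypothetical AST node: set one predicate on one subject."""
    subject: str
    predicate: str
    value: object


# Registry of third-party parsers, keyed by format name.
PARSERS: Dict[str, Callable[[str], List[SetEdge]]] = {}


def register(format_name):
    """Decorator that adds a parser function to the registry."""
    def decorator(func):
        PARSERS[format_name] = func
        return func
    return decorator


@register("kv")
def parse_kv(text):
    """Toy format: one 'subject predicate=value' entry per line."""
    edges = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        subject, assignment = line.split(" ", 1)
        predicate, value = assignment.split("=", 1)
        edges.append(SetEdge(subject, predicate, value))
    return edges


def to_ast_payload(format_name, text):
    """Parse text with the registered parser and serialize the AST as JSON."""
    parser = PARSERS.get(format_name)
    if parser is None:
        raise KeyError("no parser registered for " + format_name)
    return json.dumps({"mutation": [asdict(edge) for edge in parser(text)]})


payload = to_ast_payload("kv", "_:maria name=Maria\n_:maria age=22")
```

With this shape the core only has to understand the payload, and adding a new language means adding one more registered function in a separate package.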

@MichelDiz
Contributor Author

Actually, decoupling the parsers from the main core code would make this easier to maintain. We could have a repo just for parsers.
