Feature Request: Separate query from data for easier and safer operations #20879

Closed
Virock opened this issue Apr 30, 2024 · 3 comments
Labels
1 Question 2 User Abandoned Resolution 3 AQL Query language related

Comments

@Virock

Virock commented Apr 30, 2024

I'm trying to add this issue as a feature request but I don't see any button I could click to mark it as such. This is not a bug report. It's a feature request.

My Environment

  • ArangoDB Version: 3.11.1
  • Deployment Mode: Single Server
  • Deployment Strategy: Manual Start
  • Configuration: N/A
  • Infrastructure: Own
  • Operating System: Windows 10
  • Total RAM in your machine: 16gb
  • Disks in use: SSD
  • Used Package: Windows Installer

Component, Query & Data

Affected feature:
AQL query with driver

AQL query (if applicable):

// Build one object literal per record and a matching pair of bind parameters.
let documents = "";
const bindVars: any = {};
for (let i = 0; i < data.length; i++) {
    documents += `{word: @${i}word, definition: @${i}definition},`;
    bindVars[`${i}word`] = data[i].word;
    bindVars[`${i}definition`] = data[i].definition;
}
// Drop the trailing comma.
documents = documents.substring(0, documents.length - 1);
// A template literal (backticks) is needed for ${documents} to be interpolated.
const databaseQuery = `
FOR doc IN [
${documents}
] INSERT doc IN SomeCollection
`;
db.query({ query: databaseQuery, bindVars });

AQL explain and/or profile (if applicable):
N/A

Dataset:

{
  "data": [
    {
      "word": "Enchanting",
      "definition": "delighting or fascinating someone"
    },
    {
      "word": "Captivating",
      "definition": "attracting or holding the interest of someone"
    },
    {
      "word": "Irresistible",
      "definition": "impossible to resist or ignore"
    }
  ]
}

Size of your Dataset on disk:
N/A

Replication Factor & Number of Shards (Cluster only):
N/A

Steps to reproduce

  1. Try to insert data into the database from a JSON array

Problem:
I have to convert the JSON array (already formatted data) into a query string and create a binding for every value to avoid injection attacks.

Expected result:
Wouldn't it be much better to handle this the way MongoDB does?
The query and the data should be separated.
Currently the driver receives a query, which mixes the developer's instructions for the database with data that may be untrusted input from a bad actor, plus a bindVars argument. Bindings are only needed because the data and the instructions are in the same place. Wouldn't it be better to pass a query (only instructions for the database) and data (JSON)?
So, I'll be able to insert data like so:

db.query({
query: "INSERT IN SomeCollection",
data: [
    {
      "word": "Enchanting",
      "definition": "delighting or fascinating someone"
    },
    {
      "word": "Captivating",
      "definition": "attracting or holding the interest of someone"
    },
    {
      "word": "Irresistible",
      "definition": "impossible to resist or ignore"
    }
  ]
})
@jsteemann
Contributor

@Virock : the separation of the query string and the so-called "bind parameters" is already possible.
For example, in the ArangoShell (arangosh) you could use something like:

db._query("FOR doc IN @@collection FILTER doc.value == @value RETURN doc", { "@collection": "foo", "value": "bar" });

Bind parameters start with @. Single-@ bind parameters are value bind parameters, and bind parameters with two @@ are collection name bind parameters.

The database's HTTP API has also supported this separation from the very beginning. On the REST API level, it would look like this:

curl -X POST http://127.0.0.1:8529/_api/cursor --data '{"query":"FOR doc IN @@collection FILTER doc.value == @value RETURN doc", "bindVars": {"@collection": "foo", "value": "bar" }}'

The same separation should also be supported by drivers.
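
Applied to the bulk insert from the original post: a bind parameter does not have to be a scalar, it can hold any JSON value, including a whole array of documents. The following is a minimal TypeScript sketch under that assumption. The helper name `buildBulkInsert` is hypothetical and not part of any driver; it only builds the payload, which could then be passed to the driver's `db.query(...)` or sent as the body of `POST /_api/cursor`.

```typescript
// Hypothetical helper (not part of any driver): builds a query payload in which
// the query text is fixed and all documents travel in a single bind parameter.
interface AqlPayload {
  query: string;
  bindVars: Record<string, unknown>;
}

function buildBulkInsert(collection: string, docs: object[]): AqlPayload {
  return {
    // The query text never contains user data.
    query: "FOR doc IN @docs INSERT doc INTO @@collection",
    bindVars: {
      docs,                      // value bind parameter holding the whole array
      "@collection": collection, // collection name bind parameter (@@collection)
    },
  };
}

// The dataset from the original post, passed through unchanged.
const data = [
  { word: "Enchanting", definition: "delighting or fascinating someone" },
  { word: "Captivating", definition: "attracting or holding the interest of someone" },
  { word: "Irresistible", definition: "impossible to resist or ignore" },
];

const payload = buildBulkInsert("SomeCollection", data);
console.log(JSON.stringify(payload, null, 2));
```

With this shape no per-record loop or string concatenation is needed: the JSON array is handed over as is, and only the two bind parameter names appear in the query text.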

@Virock
Author

Virock commented Apr 30, 2024

I think I might not have described the feature properly.
What I'm saying is that bindings are needed because the query and the data are in the same place: the privileged instructions for the database (the query) and the data (potentially dangerous input from users).

If I have data as such:

{
"a": "b"
}

I currently have to add that data to a query and, of course, use bindings to avoid AQL injection, because the data was placed into the query. My point is that the underlying problem is that unsafe data was added to the query in the first place.

The simple solution is to do what MongoDB did. They separated the data from the query entirely.
This does 2 things.

  1. The data (JSON) doesn't need to be changed into anything else anymore. You simply pass the data as is to the system.
  2. There is no longer any need for bindings because the data is no longer in a privileged location (query).

I won't need to start writing for loops every time I get a JSON array that needs to be placed into the database.
I would simply write something along the lines of:

db.query({
query: "INSERT IN SomeCollection",
data: {
"a": "b",
"c": "INSERT IN ABC"
}
});

Notice the key c has something that looks like AQL injection but it won't matter because it isn't in the query. It's just a string that needs to be placed in the database.

Do you understand my point now?

@jsteemann
Contributor

@Virock : I think I don't yet get the point.
Doing something like

db._query("FOR doc IN @@collection FILTER doc.value == @value RETURN doc", { "@collection": "foo", "value": "bar" });

in ArangoDB should do exactly that. Whatever value gets put into @@collection here will be interpreted as a collection name parameter. Whatever value gets put into @value will be interpreted as a JSON value (e.g. the string "bar"). None of the bind parameter values can change the meaning of the query.
For example, trying to perform parameter injection such as

db._query("FOR doc IN test FILTER doc.value == @value RETURN doc", { "value": "1 || (INSERT {} INTO test)" });

is pointless exactly because the bind parameter values won't change the meaning of the query in any way.
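
The same point in driver-side terms, as a rough TypeScript sketch: the request body sent to `/_api/cursor` keeps the query text and the values in separate JSON fields, so a hostile-looking string only ever appears under `bindVars`. `cursorBody` is a hypothetical helper name; the `query` and `bindVars` fields are the ones shown in the curl example above.

```typescript
// Hypothetical helper: serializes the body for POST /_api/cursor.
// The query text and the bind values live in separate JSON fields.
function cursorBody(query: string, bindVars: Record<string, unknown>): string {
  return JSON.stringify({ query, bindVars });
}

const query = "FOR doc IN test FILTER doc.value == @value RETURN doc";

// One harmless value and one that looks like an injection attempt.
const benign = JSON.parse(cursorBody(query, { value: "bar" }));
const hostile = JSON.parse(cursorBody(query, { value: "1 || (INSERT {} INTO test)" }));

// The query text is identical in both requests; only the data field differs.
console.log(benign.query === hostile.query); // true
console.log(hostile.bindVars.value);         // the string, carried as plain data
```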

Simran-B added the 3 AQL Query language related, 1 Question, and 2 User Abandoned Resolution labels on May 18, 2024