PDQdb

A read-optimized, in-memory, columnar store database (column-oriented DBMS). It's pretty Damn Quick.

Instructions

Install goLang on your computer
$ git clone https://github.com/adam-hanna/PDQdb.git
$ make

Only builds for Linux and MacOS, for now

$ PDQdb -f "path/to/your/file.csv" -c "path/to/your/config.json"
http api (returns json)

Grab data by key: curl -v -XGET http://127.0.0.1:38216/key/{your key}
Count keys: curl -v -XGET http://127.0.0.1:38216/count
Grab data by query: see the section on weeQL

weeQL

weeQL is the query language written for PDQdb. It was inspired by SQL and implements many of the features of the SQL language; however, it is a very lightweight implementation and does not include all the functionality of SQL. Pronounced "wee" + "quill". Get it? Wee SQL? Tiny SQL?

All weeQL queries are made by sending a json query object using the http POST method to http://127.0.0.1:38216/query

The json query object is structured as follows:

{
  "SELECT":  [ "COL1", "COL2", ... ], // SITUATIONAL
  "COUNT": "*", // SITUATIONAL
  "WHERE":   { "FIELD1": "VAL1", "FIELD2": "VAL2", ... }, // see the section on where
  "GROUP BY": "COL1" // OPTIONAL
}

The "WHERE" property supports multiple logical operators:

`"$OR": [ { "FIELD1": "VAL1" }, {"FIELD2": "VAL2" }, ... ]`
`"$NOT": { "FIELD1": "VAL1", "FIELD2": "VAL2", ... }`
`"$NOR": { "FIELD1": "VAL1", "FIELD2": "VAL2", ... }`; This is the same as `"$NOT": { "$OR": [ ... ] }`
`"FIELD1": "$IN": [ "VAL1", "VAL2", ... ]`
`"FIELD1": "$NIN": [ "VAL1", "VAL2", ... ]` This is the same as `"$NOT": { "$IN": [ ... ] }`

Exporting and aggregating is explained in more detail, below.

1. Exporting Data

Exporting data is done with the `"SELECT"` query parameter (omit `"COUNT"` and `"GROUP BY"`)

Properties

`"SELECT" : [ “FIELD 1”, “FIELD 2”, … ]`: SITUATIONAL. An array of strings that indicates the columns to be returned. Omitted if using `"COUNT"`!
`"WHERE": { "FIELD1": "VAL1", "FIELD2": "VAL2", ... }`: a subdocument of filters. Multiple filters are returned as the intersection of data that meet each criteria (i.e. "FIELD1" = "VAL1" AND "FIELD2" = "VAL2"). Other logical operators besides AND are available. See the section on `"WHERE"` for more info.

example

query: ``` { SELECT : [ “ID” ], WHERE: { “A”: “foo” } } ```

yields:

[
  {
    "ID": "1"
  },
  {
    "ID": "3"
  },
  {
    "ID": "4"
  }
]

2. Aggregation

Perform counts on data with query parameters and an optional `"GROUP BY"` command (omit `"SELECT"`)

Properties

`"COUNT": "*"`: SITUATIONAL. The only value currently supported is "*". Omitted if using `"SELECT"`!
`"WHERE": { "FIELD1": "VAL1", "FIELD2": "VAL2", ... }`: a subdocument of filters. Multiple filters are returned as the intersection of data that meet each criteria (i.e. "FIELD1" = "VAL1" AND "FIELD2" = "VAL2"). Other logical operators besides AND are available. See the section on `"WHERE"` for more info.
`"GROUP BY": "COL1"`: OPTIONAL. Only valid with `"COUNT"` queries. This is the string column name by which to group count results.

examples

query #1:

{
  “COUNT”: “*”,
  "WHERE": {
    “A”: “foo”
  }
}

yields:

{ "COUNT": 3 }

query #2 with "GROUP BY"

{
  “COUNT”: “*”,
  "WHERE": {
      “A”: “foo”
  },
  “GROUP BY”: “B”
}

yields:

{
  "bar": 0,
  "barbar": 0,
  "barbarbar": 0,
  "test": 1,
  "testtest": 1,
  "testtesttest": 1
}

query #3 with "GROUP BY" and "$OR":

{
  “COUNT”: “*”,
  "WHERE": {
    "A": "foo",
    "$OR": [
      { "B": "testtesttest" },
      { "B": "testtest" }
    ]
  },
  “GROUP BY”: “B”
}

yields:

{
  "bar": 0,
  "barbar": 0,
  "barbarbar": 0,
  "test": 0,
  "testtest": 1,
  "testtesttest": 1
}

Example data

###example.csv

ID,A,B
1,foo,test
2,foobar,testtest
3,foo,testtesttest
4,foo,testtest
5,foobarbar,bar
6,foofoo,barbar
7,foofoo,barbarbar

###example_config.json

{
  "header": [
    {"ID": "string"},
    {"A": "string"},
    {"B": "string"}
  ],
  "id_field": "ID",
  "index_fields": ["A", "B"],
  "start_at_line": 2
}

Name		Name	Last commit message	Last commit date
Latest commit History 63 Commits
Godeps		Godeps
cli		cli
data		data
error		error
index		index
input		input
query		query
server		server
.gitignore		.gitignore
LICENSE		LICENSE
Makefile		Makefile
README.md		README.md
main.go		main.go

License

jonathanmarvens/PDQdb

Folders and files

Latest commit

History

Repository files navigation

PDQdb

Instructions

weeQL

1. Exporting Data

Properties

example

2. Aggregation

Properties

examples

Example data

About

Resources

License

Stars

Watchers

Forks

Languages