Addition of PostgreSQL and individual tables in order to collect data from all Jovian runs #1

Open
DennisSchmitz opened this issue Apr 1, 2019 · 0 comments
Assignees
Labels
enhancement New feature or request

Comments


DennisSchmitz commented Apr 1, 2019

This feature request was originally posted by @florianzwagemaker on the internal GitLab repo. I am transferring it to GitHub.

To collect data from all Jovian runs, an SQL-based database is needed to store larger sets of information efficiently.

PostgreSQL is a (relatively) easy-to-use database that can handle medium to very large datasets without excessive resource usage.
Additionally, Postgres can scale to the data needs of the Jovian project while remaining flexible, which allows the database and its structure to be modified when necessary.

Other databases, such as MySQL/MariaDB or NoSQL systems (Cassandra, MongoDB, or Couchbase), do not provide this to the same degree, which makes them harder to use for larger datasets or less flexible.

In Jovian, the PostgreSQL database can be paired with csvkit, an easy-to-use command-line tool.
With csvkit, data can be inserted into the PostgreSQL database through shell commands in a Snakemake rule.
For example, csvkit's csvsql command can insert a CSV file: csvsql --db postgresql:/// --insert /path/to/unclassified_data.csv
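
A minimal Python sketch of how such an insertion step might be wrapped before calling it from a Snakemake rule. The function names (build_csvsql_command, insert_csv) and the table name are hypothetical and not part of Jovian. The connection string and CSV path are copied from the example above as-is. The sketch assumes csvkit is installed, and that runs after the first one append to an existing table, which is why it exposes csvsql's --no-create flag.

```python
import shlex
import subprocess
from typing import List, Optional


def build_csvsql_command(db_url: str, csv_path: str,
                         table: Optional[str] = None,
                         create_table: bool = True) -> List[str]:
    """Build the argument list for a csvsql insert; nothing is executed here."""
    cmd = ["csvsql", "--db", db_url, "--insert"]
    if table is not None:
        # Without --tables, csvsql derives the table name from the file name.
        cmd += ["--tables", table]
    if not create_table:
        # Assumption: the table already exists from an earlier run,
        # so only rows are inserted.
        cmd.append("--no-create")
    cmd.append(csv_path)
    return cmd


def insert_csv(db_url: str, csv_path: str,
               table: Optional[str] = None,
               create_table: bool = True) -> None:
    """Run csvsql and raise CalledProcessError if it exits non-zero."""
    subprocess.run(build_csvsql_command(db_url, csv_path, table, create_table),
                   check=True)


if __name__ == "__main__":
    # Print the command instead of running it, so the sketch works
    # without a database or csvkit being available.
    print(shlex.join(build_csvsql_command(
        "postgresql:///", "/path/to/unclassified_data.csv")))
```

Passing the command as a list rather than a single shell string avoids quoting problems when sample or file names contain spaces. Whether each Jovian run should create its own table or append to a shared one is a design decision this sketch leaves open.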

@DennisSchmitz DennisSchmitz added the enhancement New feature or request label Apr 1, 2019
@florianzwagemaker florianzwagemaker self-assigned this Apr 1, 2019
@florianzwagemaker florianzwagemaker changed the title Addition of Addition of PostgreSQL and individual tables in order to collect data from all PZN runs Addition of PostgreSQL and individual tables in order to collect data from all PZN runs Apr 2, 2019
@DennisSchmitz DennisSchmitz moved this from Issues to To do (long term) in general enhancements May 16, 2019
@DennisSchmitz DennisSchmitz changed the title Addition of PostgreSQL and individual tables in order to collect data from all PZN runs Addition of PostgreSQL and individual tables in order to collect data from all Jovian runs May 28, 2019