Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RFC: Adding an import method for fast import of CSV #1086

Open
wants to merge 29 commits into
base: master
Choose a base branch
from

Conversation

antonycourtney
Copy link

I'm the developer of Tad, a fast, free CSV file viewer and pivot-table based UI for exploring and analyzing tabular data, implemented in JavaScript. Tad works by immediately importing target CSV files into sqlite, and using sqlite (via node-sqlite3 of course) for all subsequent analytical operations.

In order to achieve high performance for large CSV files, I extracted the import code implemented in C in the sqlite3 shell, and wrapped it as a library routine in my fork of node-sqlite3. The result is an easy way to import CSV files into sqlite from JavaScript, with great performance.

Usage:

var db = new sqlite3.Database(':memory:');
...
var options = { columnIds: ['colA', 'colB', 'colC'], delimiter: ',' };
db.import(testPath,'staging', options, function (err, res) {
  ...
});

I'm happy with the functionality and implementation, but it's a bit unfortunate that I'm maintaining my own fork of node-sqlite3. I'd love to either upstream this code or move it to a separate library.

Since this may be generally useful to a large number of node-sqlite3 users, would you consider accepting this as a PR?

@kewde
Copy link
Collaborator

kewde commented Nov 26, 2018

Hi,
Thanks for the pull request, I think it is an interesting feature. However, this would require tests to be included before it can go through. Would you be kind enough add tests for this functionality?
Please remove the lock files also.

@antonycourtney
Copy link
Author

Hi @kewde - Thanks for the reply (and sorry for delay in mind). I'd be happy to add tests and remove the lock files. Will do.

@antonycourtney
Copy link
Author

Hi @kewde - I added a handful of tests and removed the lock files from this PR. Unfortunately I'm having no end of difficult getting this to pass CI. On Linux with NODE_VERSION="9" and earlier it seems to be crashing when running node test/support/createdb.js which is extremely perplexing since that shouldn't be executing any of my code. On the Mac builds, the build seems to fail with for node versions <= 6 because my new code uses std::regex.

Any ideas or suggestions on how I might resolve this? At this point I'm fairly tempted to just move this into its own package (that perhaps only works with more recent versions of node) or go back to maintaining a private fork...

Thanks in advance for any help!

@kewde
Copy link
Collaborator

kewde commented Dec 14, 2018

The createdb. js file is a pretest. It runs before the tests and set up the environment.
It's the first time the newly compiled sqlite3 lib is loaded in JS, it segfaulted and aborted.

@kewde
Copy link
Collaborator

kewde commented Jan 28, 2019

Still seeing core dumps..

terminate called after throwing an instance of 'std::regex_error'
  what():  regex_error
Aborted (core dumped)

@kewde kewde closed this Jan 28, 2019
@kewde kewde reopened this Jan 28, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants