csvParser

Running ./run.sh from the project root runs python csvParse.py ./csvParse.config. The steps listed in the processing_steps part of the config file are run in order. In all cases, the value of each step in processing_steps can be either a single target or a list of targets (e.g. 'col1' or ['col1','col2']). Running the project creates a SQLite database called testDb.db in the root directory.
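The "single target or list" rule can be sketched as a small normalizing helper. This is a minimal illustration, not the code in csvParse.py: the helper name as_list and the step names in the example are hypothetical, and the real config layout may differ.

```python
def as_list(target):
    """Return a list of column names for a single target or a list of targets."""
    # A bare string such as 'col1' is treated as one column, not as characters.
    if isinstance(target, str):
        return [target]
    return list(target)


# Hypothetical processing_steps mapping; the step names are made up for illustration.
processing_steps = {
    "to_float": "col1",
    "to_upper": ["col2", "col3"],
}

# Normalize every step so later code can always loop over a list of columns.
normalized = {step: as_list(target) for step, target in processing_steps.items()}
```

With this kind of normalization, each step can iterate over its columns the same way whether the config gave one name or several.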

ipython notebook

Also included is an IPython notebook (csvParser.ipynb), which can also be viewed here

dependencies

I recommend using the latest version of the Anaconda Python 3 distribution.

pandas
sqlalchemy
json (part of the Python standard library)

future

Comments in code
Headers are currently not stripped of whitespace before being compared to the config; it would be nice to fix this
More datatypes (ex datetime)
Separate loading, processing, and storing into separate classes for better modularity
More input types (ex load from db)
More output types (ex csv)
More transform types (ex round)
Save transformed cols as new cols instead of overwriting (ex col3_float)
Better error handling
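Two of the items above (stripping header whitespace, and saving a transformed column under a new name such as col3_float) could look roughly like this in pandas. This is only a sketch under assumptions: the inline CSV data is made up, and it does not reflect how csvParse.py is actually structured.

```python
import io

import pandas as pd

# Hypothetical CSV whose headers carry stray whitespace.
raw = io.StringIO(" col1 ,col3 \na,1.5\nb,2\n")
df = pd.read_csv(raw)

# Strip whitespace so the headers match the names used in the config.
df.columns = df.columns.str.strip()

# Store the converted values in a new column instead of overwriting col3.
df["col3_float"] = df["col3"].astype(float)
```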

bgrayburn/csvParser
