Open source data platform for data scientists
OpenDATUM provides a lightweight data platform that lets data scientists navigate data lakes directly. Data warehouses and data marts are constructed and updated implicitly as users navigate and explore data sources.
We are at the beginning of the project. For now, I can list the workload commitments gathered so far:
- Myself: 800h of work on this project from April 2017 to April 2018.
- You?: ...
Data preparation easily represents 75% of the time data scientists spend building an analysis. This part of the work is repetitive and hard to convert into "production ready" algorithms. We believe that a good part of it can be accelerated with an appropriate data platform. We also believe that such a project should be fully open source and managed as a community project.
My objective is to push forward a Liberal contribution model. However, if consensus proves too hard to reach, jeopardizes the project's progress, and kills the contribution dynamic, I may fall back to a model closer to Benevolent Dictator for Life.
Liberal contribution: Under a liberal contribution model, the people who do the most work are recognized as most influential, but this is based on current work and not historic contributions. Major project decisions are made based on a consensus seeking process (discuss major grievances) rather than pure vote, and strive to include as many community perspectives as possible.
BDFL: BDFL stands for “Benevolent Dictator for Life”. Under this structure, one person (usually the initial author of the project) has final say on all major project decisions. Python is a classic example. Smaller projects are probably BDFL by default, because there are only one or two maintainers. A project that originated at a company might also fall into the BDFL category.
See CONTRIBUTING.md and the Wiki for the project's technical documentation and code of conduct.