Ch5: Feedback #533

g4brielvs · 2020-12-08T02:02:30Z

We are thankful for the opportunity to share our feedback as part of the final review (#476), and we appreciate the effort DIME is putting into disseminating these valuable guidelines and resources.

Here are some ideas, especially coming from the angle of the Data Partnership. I'd be more than happy to collaborate.

Ideas

The chapter focuses on a project's most time-consuming phase, data preparation, and it offers good recommendations in that regard, considering the intended audience of Stata and R users. However, as datasets and research challenges grow more complex, new concerns arise even though most of the concepts in the book remain valid: for example, performance issues, out-of-memory errors, and running on the cloud or a distributed cluster.
Probably out of scope, but it would be great to have a section on cloud computing environments and resources, such as JupyterHub, AWS SageMaker, or Google Colab.
Probably out of scope, but Python is an indispensable part of a modern analytics stack, and there are considerations that might be useful when using Python or, more specifically, when working on a data science project.
Probably out of scope, but the same goes for containerization with Docker.

luizaandrade added the final review label Dec 8, 2020
