Ch5: Feedback #533

Open
g4brielvs opened this issue Dec 8, 2020 · 0 comments

Comments

@g4brielvs
Member

g4brielvs commented Dec 8, 2020

We are thankful for the opportunity to share our feedback as part of the final review (#476), and we appreciate the effort DIME is putting into disseminating these valuable guidelines and resources.

Here are some ideas, especially coming from the angle of the Data Partnership. I'd be more than happy to collaborate.

Ideas

  • The chapter focuses on a project's most time-consuming phase, data preparation, and offers good recommendations for it, given the intended audience of Stata and R users. However, as datasets and research challenges grow more complex, new problems arise even though most of the book's concepts remain valid: for example, performance issues, out-of-memory errors, and the need to run on the cloud or on a distributed cluster.
  • Probably out of scope, but it would be great to have a section on cloud computing environments and resources, such as JupyterHub, Amazon SageMaker, or Google Colab.
  • Probably out of scope, but Python is an indispensable part of a modern analytics stack, and there are considerations that might be useful when using Python or, more specifically, when working on a data science project.
  • Probably out of scope, but the same goes for containerization with Docker.