Pandas Notebooks #29

Open
4 tasks
cc7768 opened this issue Nov 17, 2016 · 0 comments

cc7768 commented Nov 17, 2016

Some useful thoughts from @mwaugh0328.

  • Start the notebooks with a conceptual and an applied question that we're going to answer. For example, for pandas cleaning the conceptual question could be, "What happens when the data you read in isn't in the format you want?" The applied version of that question would be, "Your boss wants you to do X with a certain dataset, but the dataset is all screwed up. He only cares about the answer and you have three hours to give him one... How do you clean up the data so that you can give him an answer?"
  • Fewer datasets per notebook. It gets really confusing to go back and forth between different datasets. If we want to highlight different things (that might not all appear in the original dataset), then take the dataset and add them. For example, if we wanted to talk about missing values and only wanted to use the Chipotle data, then we could just break that dataset by adding missing values.
  • More exercises. The per-student realization of attention paid follows a random walk, so its standard deviation increases at root t. The exercises play the part of an "ss rule" and draw everyone back to the center (except for students who have landed in the absorbing boundary of Facebook).
  • Related to the previous two boxes: The first exercise of a class might be to explore a dataset. Have the students figure out what it contains, what they might want to do to it, and what questions we might want to ask of the data. This gives them an idea of what is in the dataset and helps students who had drifted off start paying attention again, because they will always know what data we are working with.
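
The "break the dataset" idea from the second box could be sketched roughly as below. This is only an illustration, not something already in the notebooks: the helper `add_missing_values`, its parameters, and the toy `orders` table (a hand-made stand-in with Chipotle-style column names and numeric prices) are all assumptions.

```python
import numpy as np
import pandas as pd


def add_missing_values(df, frac=0.1, columns=None, seed=0):
    """Return a copy of df with roughly `frac` of the cells in `columns` set to NaN.

    Hypothetical helper for building a deliberately messy teaching dataset.
    """
    rng = np.random.default_rng(seed)
    columns = list(df.columns) if columns is None else list(columns)
    out = df.copy()
    for col in columns:
        # True marks a cell to blank out; each cell is hit with probability frac
        hit = rng.random(len(out)) < frac
        out[col] = out[col].mask(hit)
    return out


# Toy stand-in for the Chipotle orders data (values made up for illustration)
orders = pd.DataFrame(
    {
        "order_id": [1, 1, 2, 3, 3, 4],
        "quantity": [1, 2, 1, 1, 3, 1],
        "item_name": ["Chips", "Izze", "Chicken Bowl", "Chips", "Steak Burrito", "Side of Chips"],
        "item_price": [2.39, 6.78, 10.98, 2.39, 27.75, 1.69],
    }
)

# Break only the numeric columns, leaving identifiers and names intact
broken = add_missing_values(orders, frac=0.3, columns=["quantity", "item_price"])

# Count of missing values per column, a natural first thing for students to check
print(broken.isna().sum())
```

Fixing the seed keeps the broken dataset identical across class sections, and restricting `columns` lets one notebook stay on a single dataset while still covering missing-value handling.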