Slides: https://docs.google.com/presentation/d/111MeIrf5WzlFxnRDhxTD6UuAz3kCEXXrKQhGHL6A4Is/edit?usp=sharing
email: lucia.chen AT stanford.edu
Twitter/YouTube: ML_made_simple
If you are from a social science background and want to learn programming and data science step by step, this is the right place :) Previous programming knowledge is not required.
Before you come to the workshop, please install Python, Jupyter Notebook, Anaconda (optional), and the required packages listed below.
Setting up your environment
- NLTK (NLP)
- pandas (process tabular data)
- emoji
- matplotlib (for making plots)
- wordcloud
- jieba (for Chinese word segmentation)
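To confirm the setup before the workshop, a quick check like the one below may help. It is only a sketch: it assumes the import name of each package matches the name in the list above, and the helper name `missing_packages` is made up for illustration.

```python
import importlib.util

# Import names of the packages listed above.
# Assumption: each import name matches the name used in the list.
REQUIRED = ["nltk", "pandas", "emoji", "matplotlib", "wordcloud", "jieba"]


def missing_packages(names):
    """Return the names that cannot be found in the current environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_packages(REQUIRED)
if missing:
    print("Missing packages:", ", ".join(missing))
    # Suggested fix; run this in a terminal, not in Python.
    print("Try: pip install " + " ".join(missing))
else:
    print("All required packages are installed.")
```

You can paste this into a Jupyter notebook cell and run it. If anything is reported as missing, install it and run the cell again.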
In this workshop, you will learn how to clean text data, extract summary statistics from text, and build a word cloud by yourself. The workshop consists of the following four modules:
- Module 1 – Common Data Structures in Python
- Module 2 – Preprocessing Text Data
- Module 3 – Building a Word Cloud
- Module 4 – Summary Statistics with Pandas
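As a small taste of what the modules cover, here is a minimal sketch of cleaning a piece of text and counting its words using only the Python standard library. The function names and the sample sentence are invented for illustration, and the cleaning rule (keep only English letters, digits and spaces) is a simplifying assumption, not necessarily the approach used in the workshop.

```python
import re
from collections import Counter


def clean_text(text):
    """Lowercase the text and replace everything except a-z, 0-9 and spaces."""
    text = text.lower()
    # Assumption: English-only text; this rule would also drop Chinese characters.
    text = re.sub(r"[^a-z0-9\s]", " ", text)
    # Collapse repeated whitespace into single spaces.
    return re.sub(r"\s+", " ", text).strip()


def word_counts(text):
    """Return a Counter (a dictionary-like structure) mapping each word to its count."""
    return Counter(clean_text(text).split())


sample = "Text data is messy! Cleaning text data comes first."
counts = word_counts(sample)
print(counts.most_common(3))
```

Word counts like these are the kind of summary a word cloud is drawn from: the more often a word appears, the larger it is displayed.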