gesiscss/css_methods_python

Introduction to Computational Social Science methods with Python

This repository contains a full introductory course on Computational Social Science (CSS) methods with Python. The teaching materials meet the criteria of a gradable university course and are fully online, self-explanatory, and freely available. They combine coding tutorials with recommended readings, specific teaching lessons, and experience-based guidelines. They are housed here in a public GitHub repository, so everybody can study them. They take the form of Jupyter Notebooks, which have the look and feel of a manuscript yet contain Python code that is fully executable in a browser window, potentially without the need to install Python locally. They are available under a Creative Commons license that allows you to freely share and adapt them. The course consists of sessions that gradually build participants' Python skills.

Syllabus

The course consists of four sections. The first section teaches how to set up a computing infrastructure and conveys basic data management and scientific computing skills. The second section teaches students how to collect data using dedicated Python packages for Application Programming Interfaces (APIs) and web scraping. The third section focuses on data preprocessing methods from network analysis and Natural Language Processing (NLP) and includes applications of Large Language Models (LLMs). The fourth section covers data analysis methods and goes into depth on network analysis and modeling, unsupervised and supervised Machine Learning (ML), and topic modeling. Some datasets are used repeatedly throughout the course, among them a corpus of tweets on the topic of COVID (TweetsCOV19) from May 2020, social networks from the Copenhagen Networks Study (CNS), and the Varieties of Democracy (V-Dem) dataset on countries and principles of democracy. Whenever possible, sessions are interlinked and build on each other.

Read the syllabus here.

Execution

Notebooks are developed for the Anaconda distribution 2022.10, which can be downloaded here. For a complete guide on how to set up your computing infrastructure and execute the course materials locally or in the cloud, please consult Session A1: Computing infrastructure. Alternatively, click on this button to execute the materials in the Binder cloud:

Binder

Course structure

Section A: Introduction

Section B: Data collection methods

Section C: Data preprocessing methods

Section D: Data analysis methods

Permanent call for contributions

This course starts with 14 sessions, but more sessions can be added, and we invite contributions. Before you start developing, please contact us. When you develop, please use this template. Why should you contribute? Because science should be open, and we believe that this course has a future.

Who we are

This course is edited by Haiko Lietz, Sinemis Temel, M. Fuat Kina, and N. Gizem Bacaksizlar Turbic. Contact us here.

Acknowledgement

The initial 14 sessions were developed as part of the Social ComQuant project, which was funded until 2023 as a twinning project among Koç University (Istanbul, Turkey), GESIS – Leibniz Institute for the Social Sciences (Cologne, Germany), and the ISI Foundation (Torino, Italy) under the European Commission's Horizon 2020 funding line.
