anglimin/honours_thesis_code

GE4401: Honours Thesis

Title: Identifying Discrepancies within current transport initiatives: An Urban Informatics Approach

This repository contains the scripts used for my methodology.
The scripts should be executed in the following order:

  1. reddit.py and twitter.py
  2. topic_modelling.py
  3. Visualisation_Stationarity.py
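
The order above can be sketched as a small runner. This is a minimal sketch, not part of the repository: it assumes all four scripts sit in the same folder and take no command-line arguments, and the names `PIPELINE` and `run_pipeline` are hypothetical.

```python
import subprocess
import sys
from pathlib import Path

# Stage order taken from the list above; stage 1 lists two scrapers,
# which this sketch simply runs one after the other.
PIPELINE = [
    ["reddit.py", "twitter.py"],
    ["topic_modelling.py"],
    ["Visualisation_Stationarity.py"],
]


def run_pipeline(stages=PIPELINE, script_dir=".", run=subprocess.run):
    """Run every script in order and return the names that were executed.

    Assumption: each script can be run as `python <script>` with no arguments.
    """
    executed = []
    for stage in stages:
        for script in stage:
            path = Path(script_dir) / script
            # check=True raises CalledProcessError on a non-zero exit code,
            # so later stages never run on incomplete output.
            run([sys.executable, str(path)], check=True)
            executed.append(script)
    return executed
```

Calling `run_pipeline()` from the repository root would then run the three stages in sequence and stop at the first script that fails.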

The following diagram shows the methodology flow (scripting steps only), along with the outputs and their file names after each run.
[Diagram: github diagram]

Disclaimer 1: Raw data is not provided in this repository, to avoid breaching privacy. Refer to Appendix C for the data schema of the raw and processed data.

Disclaimer 2: reddit.py and twitter.py contain environment variables that users need to set to their own values.
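
One way to supply such values without editing the scripts is to read them from the shell environment. The sketch below is illustrative only: the variable names in `REQUIRED_VARS` are hypothetical, and the actual names used in reddit.py and twitter.py may differ.

```python
import os

# Hypothetical variable names, for illustration only; check reddit.py and
# twitter.py for the names they actually use.
REQUIRED_VARS = ["REDDIT_CLIENT_ID", "REDDIT_CLIENT_SECRET", "TWITTER_BEARER_TOKEN"]


def load_credentials(names=REQUIRED_VARS, environ=os.environ):
    """Return a dict of credentials, failing early if any value is unset or empty."""
    missing = [name for name in names if not environ.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {name: environ[name] for name in names}
```

Failing early with a list of the missing names is a deliberate choice here: it avoids a scraper starting a long run and only then hitting an authentication error.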

Disclaimer 3: I have also attached the script (reddit_locations.py) for identifying locations within Reddit comments through spaCy Named Entity Recognition (NER). However, these results are not used in the subsequent analysis, as explained in Section 3.7. Moreover, the trained NER model (spacy_sg) may not be the best identifier of Singapore's locations, for a myriad of reasons that are beyond the scope of this thesis.
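
For readers unfamiliar with spaCy NER, the general pattern looks roughly like the sketch below. It is not the code in reddit_locations.py. It assumes that spacy_sg can be loaded by name with `spacy.load` and that locations carry the labels `GPE`, `LOC` or `FAC`; a custom-trained model may use different labels. The helper names are hypothetical.

```python
def load_model(name="spacy_sg"):
    """Load a spaCy pipeline by name or path.

    spaCy is imported lazily so that extract_locations can be used and
    tested with any callable that returns an object with an `ents` attribute.
    """
    import spacy
    return spacy.load(name)


# Assumed entity labels; a custom model such as spacy_sg may define others.
LOCATION_LABELS = {"GPE", "LOC", "FAC"}


def extract_locations(nlp, comments, labels=LOCATION_LABELS):
    """Return, for each comment, the entity texts whose label is in `labels`."""
    results = []
    for text in comments:
        doc = nlp(text)
        results.append([ent.text for ent in doc.ents if ent.label_ in labels])
    return results
```

With the model installed, `extract_locations(load_model(), comments)` would return one list of location strings per comment.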

Initialising the package folder and dependencies.

  1. Clone the entire repository with git, using whichever CLI you are comfortable with
  2. pip install -r requirements.txt
  3. Run the scripts sequentially
