Scrap the statistics about RGS AARs

Installation

Create a conda environment with the following command.

conda env create --file environment.yml

Activate the environment to run code

Windows

activate scrap_websites

Linux

source activate scrap_websites

Update the data

Run the following command to scrap the RGS and Paradoxwebsite to get the new statistics about the AARs.

cd AAR
scrapy crawl rgs_aar -o ./Data/RGS_AAR.json
scrapy crawl paradox_aar -o ./Data/Paradox_AAR.json

Fix the file

In Data/RGS_AAR.json and Data/Paradox_AAR.json, search the line with ][, remove it and add a comma at the end of the previous line.

Write the top of replies and views between the 2 last scrapping

Run the following script to get top n threads according to new views and replies between the 2 latest scrapping dates.

python Analysis/Tops.py RGS_AAR.json <n>
python Analysis/Tops.py Paradox_AAR.json <n>

n is the size of the top (for example 10).

The results are written in the ARR/Results/RGS and ARR/Results/Paradox directories

Write the top of replies and views for the latest year

Run the following script to get top n threads according to new views and replies in latest year.

python Analysis/Tops.py RGS_AAR.json <n> --year
python Analysis/Tops.py Paradox_AAR.json <n> --year

n is the size of the top (for example 100).

The results are written in the ARR/Results/RGS and ARR/Results/Paradox directories

Name		Name	Last commit message	Last commit date
Latest commit History 33 Commits
AAR		AAR
.gitignore		.gitignore
README.md		README.md
environment.yml		environment.yml

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

AAR

AAR

.gitignore

.gitignore

README.md

README.md

environment.yml

environment.yml

Repository files navigation

Scrap the statistics about RGS AARs

Installation

Activate the environment to run code

Windows

Linux

Update the data

Fix the file

Write the top of replies and views between the 2 last scrapping

Write the top of replies and views for the latest year

About

Releases

Packages

Languages

NicolasGrosjean/ScrapWebsites

Folders and files

Latest commit

History

Repository files navigation

Scrap the statistics about RGS AARs

Installation

Activate the environment to run code

Windows

Linux

Update the data

Fix the file

Write the top of replies and views between the 2 last scrapping

Write the top of replies and views for the latest year

About

Topics

Resources

Stars

Watchers

Forks

Languages