rafaeelaudibert/UFRGS_scraper.js

UFRGS - Vestibular Scraper

A scraper written in JavaScript, using Node.js, that fetches all the freshmen listed in the UFRGS vestibular.

There is also a Shell scraper with fewer features, but it is a lot faster.

This code is tested to run on UFRGS's "Listão" from the 2022 and 2023 editions. There is no guarantee that it will run on future editions, as this is only a scraper and depends on the website layout, which UFRGS can change at any time.

NOTE: A previous version worked for the years between 2016 and 2021, but that version stopped working recently. You can find it by looking at previous commits.


Configuring

You must have Node.js installed on your computer to run this code. You can download it here.

You can clone this repository by running git clone https://github.com/rafaeelaudibert/UFRGS_scraper.js.git && cd UFRGS_scraper.js.

Afterwards, install the dependencies with npm install.

You should also set the year to be scraped in the .ENV file by writing a key/value pair, such as YEAR=2023.
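
As an illustration of how a key/value pair such as YEAR=2023 can be read and validated, here is a minimal sketch. The helpers parseEnv and resolveYear are hypothetical names, not part of this repository, and the project may well load the file through a library instead.

```javascript
// Hypothetical helper (not from this repository): parses KEY=VALUE lines,
// skipping blank lines and lines starting with '#'.
function parseEnv(text) {
  const entries = {};
  for (const rawLine of text.split(/\r?\n/)) {
    const line = rawLine.trim();
    if (line === '' || line.startsWith('#')) continue;
    const separator = line.indexOf('=');
    if (separator === -1) continue;
    const key = line.slice(0, separator).trim();
    const value = line.slice(separator + 1).trim();
    entries[key] = value;
  }
  return entries;
}

// Hypothetical helper: returns YEAR as an integer, or throws when the
// value is missing or is not a plain integer such as "2023".
function resolveYear(env) {
  const year = Number.parseInt(env.YEAR, 10);
  if (!Number.isInteger(year) || String(year) !== env.YEAR) {
    throw new Error(`YEAR must be an integer, got: ${env.YEAR}`);
  }
  return year;
}
```

Failing early on a missing or malformed YEAR avoids starting a scrape against a page that does not exist.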


Running the code

To run the code, simply run npm start.

The code erases any folder named ./json in the root of the project, so make sure it holds nothing important before you run the code and type YES when prompted.
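
The confirmation step can be pictured with the sketch below. It is an assumption about the general shape of such a guard, not the repository's actual code: resetOutputDir is a hypothetical name, and recreating the empty folder afterwards is a choice made for this example.

```javascript
const fs = require('node:fs');

// Hypothetical helper: wipes the output folder only when the answer is
// exactly "YES", then recreates it empty. Returns whether it was reset.
function resetOutputDir(dir, answer) {
  if (answer !== 'YES') return false;
  // force: true makes this a no-op when the folder does not exist yet.
  fs.rmSync(dir, { recursive: true, force: true });
  fs.mkdirSync(dir, { recursive: true });
  return true;
}
```

Requiring the exact string YES, rather than accepting y or an empty answer, makes an accidental deletion less likely.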


Understanding the data

The data generated by the code is pretty easy to understand. It will generate a folder tree like so:

./json
  |- course1
  |    |- freshmen.json
  |    \- freshmen.txt
  |- course2
  |- course3
  \- course4


Each course will have its own folder containing 2 files: freshmen.json and freshmen.txt. The former has the following structure:

[
    {
        "name": "First freshman name",
        "semester": "First freshman semester (1º or 2º)"
    },
    {
        "name": "Second freshman name",
        "semester": "Second freshman semester (1º or 2º)"
    },
    {
        ...
    },
    ...
]


The latter is a plain text file containing one freshman name per line, without the semester, as follows:

    First freshman name
    Second freshman name
    Third freshman name
    ...
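
To show how the generated files might be consumed, here is a minimal sketch that walks the output folder, reads each freshmen.json, and counts freshmen per semester. The function names are hypothetical, and the exact strings stored in the semester field are an assumption based on the structure described above.

```javascript
const fs = require('node:fs');
const path = require('node:path');

// Counts entries per semester for one parsed freshmen.json array.
function countBySemester(freshmen) {
  const counts = {};
  for (const { semester } of freshmen) {
    counts[semester] = (counts[semester] || 0) + 1;
  }
  return counts;
}

// Renders the array in the freshmen.txt layout described above:
// one name per line, without the semester.
function toPlainText(freshmen) {
  return freshmen.map(({ name }) => name).join('\n');
}

// Walks rootDir (for example ./json) and summarizes every course folder
// that contains a freshmen.json file.
function summarizeCourses(rootDir) {
  const summary = {};
  for (const entry of fs.readdirSync(rootDir, { withFileTypes: true })) {
    if (!entry.isDirectory()) continue;
    const file = path.join(rootDir, entry.name, 'freshmen.json');
    if (!fs.existsSync(file)) continue;
    const freshmen = JSON.parse(fs.readFileSync(file, 'utf8'));
    summary[entry.name] = countBySemester(freshmen);
  }
  return summary;
}
```

Calling summarizeCourses('./json') after a run would return an object keyed by course folder name, with one counter per semester value found in that course.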

Disclaimer

This program is not associated with the Universidade Federal do Rio Grande do Sul in any way. It was created only to make it easier to fetch the freshmen from the popular Listão do Vestibular.
