Update README.md

fronchetti · Aug 10, 2022 · 62930cb
1 parent 65ea210
Showing 1 changed file with 1 addition and 0 deletions.
diff --git a/scripts/README.md b/scripts/README.md
@@ -3,6 +3,7 @@
 This is probably the most complex folder in the whole repository, so I will try to be as detailed as possible.
 
 This folder is organized as follows:
+- If you want to run the code available in this folder, start by installing all the Python dependencies using [pip](https://pypi.org/project/pip/) and the `requirements.txt` file (in a terminal, the command should look like this: `pip install -r requirements.txt`).
 - If you are looking for how we extracted documentation data from GitHub, you should look at the `scraper` folder. The `api_scraper.py` file is the main file of this folder, containing the code that sends requests for custom URLs to the GitHub API. The file `main.py` presents the whole process of extracting a documentation file, `scrapy.py` shows how to make the URL requests through the `api_scraper.py` module, and `validate.py` shows how we checked whether a documentation file was valid for qualitative analysis. If you want to know how we converted the markdown files to spreadsheets, take a look at `export.py` (please note that we use cmark-gfm to convert the markdown content to plaintext and, if you want to run it, you will need to build cmark-gfm on your computer). More information about all these files is given in their docstrings.
 - Inside the `classifier` folder you will find how we performed all the classification steps until getting a final model. The subfolders are supposed to be as intuitive as possible. The `data_preparation` folder contains the code we used to prepare the data for classification, the `model_selection` folder shows how we selected the best estimator for our problem, the `results_report` folder should contain the scripts used to report our final model, and the `classification` folder contains the code used to perform classification. If you want to understand the whole process, I recommend starting with the `main.py` file, where I tried to split the stages of this process into clear methods.
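
The markdown-to-plaintext step mentioned for `export.py` could be sketched roughly as follows. This is only an illustration, not the code in `export.py`: the helper names `build_command` and `markdown_to_plaintext` are hypothetical, and the sketch assumes a `cmark-gfm` executable that you have already built and placed on your `PATH`.

```python
import shutil
import subprocess


def build_command(markdown_path, executable="cmark-gfm"):
    """Return the argument list that asks cmark-gfm for plaintext output.

    Hypothetical helper: the executable name is an assumption and may
    differ depending on how cmark-gfm was built on your machine.
    """
    return [executable, "--to", "plaintext", str(markdown_path)]


def markdown_to_plaintext(markdown_path, executable="cmark-gfm"):
    """Convert one markdown file to plaintext by calling cmark-gfm."""
    # Fail early with a clear message if the binary is not on the PATH.
    if shutil.which(executable) is None:
        raise FileNotFoundError(
            f"'{executable}' was not found; build cmark-gfm and add it to your PATH."
        )
    result = subprocess.run(
        build_command(markdown_path, executable),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if the conversion fails
    )
    return result.stdout


# Example usage (requires cmark-gfm to be installed):
# text = markdown_to_plaintext("README.md")
```

Calling the executable through `subprocess` keeps the sketch free of extra Python dependencies; the actual script may take a different approach.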