direct-assemblee/DirectAssemblee-scraper

Direct Assemblée - server side: the scraper

This application retrieves public data from the Assemblée Nationale website:

  • the deputies and their work: written questions, reports, law proposals, participation in committees and public sessions
  • the ballots of the current mandate, and all deputies' votes

The data is stored in a MySQL database and can be served with the API application.

Setup

Building the code

  1. Install the development environment:
npm install -g sails
  2. Clone the repository:
git clone <URL>
  3. Install the project dependencies:
npm install
  4. This project uses Mailgun to send an email when a new theme needs to be handled. See the Mailgun section below to configure the project.

  5. Create the directassemblee database on your local MySQL server.

  6. Launch the scraper. Use the -t argument to specify what you want to scrape:

  • Deputies:
sails lift -t deputies
  • Ballots:
sails lift -t ballots
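
The -t argument selects which dataset is scraped. How the project reads this option is not shown here, so the following is only a rough sketch: `parseTarget` is a hypothetical helper, and it assumes that `deputies` and `ballots` (the two values listed above) are the only accepted targets.

```javascript
// Hypothetical helper, not taken from the project: reads the value following "-t".
const VALID_TARGETS = ['deputies', 'ballots'];

function parseTarget(argv) {
  const index = argv.indexOf('-t');
  // No "-t" flag means no target; otherwise take the next argument.
  const target = index === -1 ? undefined : argv[index + 1];
  if (!VALID_TARGETS.includes(target)) {
    throw new Error('Use -t with one of: ' + VALID_TARGETS.join(', '));
  }
  return target;
}

// Example: parseTarget(['-t', 'ballots']) returns 'ballots'.
```

Rejecting unknown values early avoids starting a scraping run with a mistyped target.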

MySQL issues

If you get ER_ACCESS_DENIED_ERROR: Access denied for user 'root'@'localhost', set the database environment variables to match your MySQL credentials:

export DATABASE_HOST="localhost"
export DATABASE_PORT=3306
export DATABASE_USER="root"
# Careful: if you use root, both DATABASE_PASSWORD and DATABASE_ROOT_PASSWORD MUST be set to the correct value
export DATABASE_PASSWORD=""
export DATABASE_ROOT_PASSWORD=""
export DATABASE_NAME="directassemblee"
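
To check which values the application would pick up, the variables above can be read from `process.env`. This is a minimal sketch, not the project's actual datastore configuration: `databaseSettingsFromEnv` is a hypothetical helper, and the fallback values are assumptions that simply mirror the exports shown above.

```javascript
// Sketch only: the project's real datastore configuration may differ.
function databaseSettingsFromEnv(env) {
  return {
    host: env.DATABASE_HOST || 'localhost',          // assumed default
    port: Number(env.DATABASE_PORT || 3306),         // environment values are strings
    user: env.DATABASE_USER || 'root',               // assumed default
    password: env.DATABASE_PASSWORD || '',
    database: env.DATABASE_NAME || 'directassemblee' // assumed default
  };
}

// Print the settings that would be used, hiding the password.
const settings = databaseSettingsFromEnv(process.env);
console.log(Object.assign({}, settings, { password: '***' }));
```

Running this in the same shell as the scraper shows whether the exports were applied to the current session.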

Mailgun

This project uses Mailgun.

Works and ballots are associated with a theme. However, the Assemblée Nationale website does not provide a static list of themes, so we maintain our own list in the database, in the Theme table. We also have a ShortTheme table, which we use to shorten long theme names so that they display nicely on a mobile device.

Every time a new theme is scraped, the app sends us an email so we can manually add it to our database.
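
The lookup described above can be sketched as follows. This is an illustration under stated assumptions, not the project's code: the `themes` set and `shortThemes` map are hypothetical in-memory stand-ins for the Theme and ShortTheme tables, the theme names are placeholders, and `resolveTheme` is an invented helper.

```javascript
// Hypothetical in-memory stand-ins for the Theme and ShortTheme tables.
const themes = new Set(['Some long theme name', 'Plain theme']);
const shortThemes = new Map([['Some long theme name', 'Short name']]);

// Returns the display name of a scraped theme, or null for an unknown theme.
// Unknown themes are passed to notify() so that someone can add them manually.
function resolveTheme(scrapedName, notify) {
  if (!themes.has(scrapedName)) {
    notify(scrapedName); // for example, send the email described above
    return null;
  }
  // Fall back to the full name when no shortened version exists.
  return shortThemes.get(scrapedName) || scrapedName;
}
```

Passing the notification as a callback keeps the lookup independent of the mail provider.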

If you want to keep this feature, register your own Mailgun account and generate a domain and an API key. Store them in config/env/production.js and config/env/development.js:

module.exports = {
  mail: {
    apiKey: 'key-xxx',
    domain: 'sandboxxxx.mailgun.org',
    receiver: 'xxx@xxx.xxx'
  }
};
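
As an illustration of how the `mail` settings might be consumed, the sketch below builds the notification message from them. `buildNewThemeMail`, the sender address, and the wording of the message are all assumptions made for this example; the project's own code is not shown here. The `from`, `to`, `subject` and `text` fields are the standard fields accepted by Mailgun's message-sending API.

```javascript
// Sketch only: buildNewThemeMail is a hypothetical helper, not project code.
function buildNewThemeMail(mailConfig, themeName) {
  return {
    from: 'Scraper <scraper@' + mailConfig.domain + '>', // assumed sender address
    to: mailConfig.receiver,
    subject: 'New theme to add: ' + themeName,
    text: 'The theme "' + themeName + '" was scraped but is missing from the Theme table.'
  };
}

// Same placeholder values as in the configuration above.
const exampleConfig = {
  apiKey: 'key-xxx',
  domain: 'sandboxxxx.mailgun.org',
  receiver: 'xxx@xxx.xxx'
};
console.log(buildNewThemeMail(exampleConfig, 'Example theme'));
```

The actual sending call depends on the Mailgun client library in use, so it is left out of this sketch.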

Contribute

Pull requests are more than welcome! To contribute, use a feature branch and give your pull request a descriptive title and description.

Unfortunately, the project doesn't have any unit tests yet, so it can take a while to make sure there is no regression in your pull request.

License

Direct Assemblée scraper is under the AGPLv3.

See LICENSE for more license info.

Contact

If you have any questions or need help, you can contact us at contact@direct-assemblee.org.