hdo-transcript-search

Visualize language usage in the Norwegian parliament. See it in action at tale.holderdeord.no.

This project consists of two parts:

  • indexer/ - downloads and indexes Stortinget transcripts in Elasticsearch
  • webapp/ - web frontend that presents and visualizes the data

Running with docker-compose

$ docker-compose up -d es webapp
$ docker-compose run --rm indexer

Requirements

  • elasticsearch
  • node.js
  • ruby

indexer

Download and index transcripts (requires a local Elasticsearch instance):

$ cd indexer/
$ gem install bundler
$ bundle install
$ bundle exec ruby -Ilib bin/hdo-transcript-indexer

Re-create the index. This is necessary whenever a mapping changes:

$ bundle exec ruby -Ilib bin/hdo-transcript-indexer --create-index

Convert a single XML transcript to indexable JSON:

$ bundle exec ruby -Ilib bin/hdo-transcript-converter transcript.xml

webapp

Start the webapp in dev mode:

$ cd webapp
$ npm install
$ npm run dev
# open your browser at http://localhost:7575/

Caveats

  • Because of deficiencies in the transcripts, we don't know the correct time for all speeches. In these cases the "time" field is set to midnight.
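
When consuming the indexed data, it can help to separate speeches whose time is a midnight placeholder from those with a real timestamp. The following is a minimal sketch, not code from this project: it assumes the "time" field is an ISO 8601 string whose offset makes the placeholder appear as 00:00:00, and the helper name `midnight_placeholder?` and the sample documents are hypothetical.

```ruby
require "time"

# Hypothetical helper (not part of this project): returns true when an
# ISO 8601 timestamp falls exactly at midnight in its own UTC offset.
# Per the caveat above, such a value may mean the real time is unknown.
def midnight_placeholder?(timestamp)
  t = Time.iso8601(timestamp)
  t.hour.zero? && t.min.zero? && t.sec.zero?
end

# Made-up sample documents; only the "time" field name comes from this README.
speeches = [
  { "name" => "A", "time" => "2014-10-01T00:00:00+02:00" },
  { "name" => "B", "time" => "2014-10-01T10:15:30+02:00" }
]

# Split into speeches with a placeholder time and speeches with a real time.
unknown, known = speeches.partition { |s| midnight_placeholder?(s["time"]) }

puts "unknown time: #{unknown.map { |s| s['name'] }.join(', ')}"
puts "known time:   #{known.map { |s| s['name'] }.join(', ')}"
```

A speech that really was held at exactly midnight would also be flagged, so treat the result as "time possibly unknown" rather than a definitive marker.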