marottes-medias-back

media_scrap.py

This script scrapes the stories available in each medium's RSS feed, extracts the title, the intro and the publication date, then cleans the data and stores it in separate JSON files.
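
As a rough illustration of this step, the sketch below parses an RSS document with the standard library only. The sample feed, the field names (`title`, `intro`, `published`) and the helper names are assumptions made for the example, not taken from media_scrap.py.

```python
import json
import re
import xml.etree.ElementTree as ET
from email.utils import parsedate_to_datetime

# Hypothetical feed content; a real run would download each medium's RSS feed.
SAMPLE_FEED = """<rss version="2.0"><channel>
  <title>Example medium</title>
  <item>
    <title>  First headline </title>
    <description>&lt;p&gt;Intro with markup.&lt;/p&gt;</description>
    <pubDate>Mon, 06 Jan 2020 08:30:00 +0000</pubDate>
  </item>
</channel></rss>"""


def clean(text):
    """Strip HTML tags and collapse whitespace."""
    without_tags = re.sub(r"<[^>]+>", " ", text or "")
    return re.sub(r"\s+", " ", without_tags).strip()


def parse_feed(xml_text):
    """Return one dict per <item> with title, intro and ISO publication date.

    Assumes every item carries a <pubDate> in the usual RSS (RFC 822) format.
    """
    stories = []
    for item in ET.fromstring(xml_text).iter("item"):
        stories.append({
            "title": clean(item.findtext("title")),
            "intro": clean(item.findtext("description")),
            "published": parsedate_to_datetime(item.findtext("pubDate")).isoformat(),
        })
    return stories


def save_stories(stories, path):
    """Write the cleaned stories to a JSON file, keeping accented characters readable."""
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(stories, handle, ensure_ascii=False, indent=2)
```

Calling `save_stories(parse_feed(SAMPLE_FEED), "example_medium.json")` would write one JSON file for that medium; the file name here is only a placeholder.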

media_analyse.py

This script prepares a corpus of scraped stories (headline and intro) and sends it to TextRazor to perform named-entity recognition.
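
The following is a minimal sketch of what this step could look like, assuming stories are dicts with `title` and `intro` keys (a hypothetical layout) and that the official `textrazor` Python client is installed. The function names are illustrative and the actual script may proceed differently.

```python
def build_corpus(stories):
    """Join the headline and intro of every story into one text block."""
    parts = []
    for story in stories:
        text = " ".join(p for p in (story.get("title"), story.get("intro")) if p)
        if text:
            parts.append(text)
    return "\n".join(parts)


def detect_entities(corpus, api_key):
    """Send the corpus to TextRazor and return (entity id, relevance) pairs.

    Requires network access and a valid TextRazor API key.
    """
    import textrazor  # third-party client, imported lazily: pip install textrazor

    textrazor.api_key = api_key
    client = textrazor.TextRazor(extractors=["entities"])
    response = client.analyze(corpus)
    return [(entity.id, entity.relevance_score) for entity in response.entities()]
```

Only `build_corpus` runs offline; `detect_entities` is shown to make the hand-off to TextRazor concrete.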

media_prepare_data.py

This script gathers entity and media data to create the JSON file that the front end loads to render the main dashboard.
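
The structure of the dashboard file is not described here, so the sketch below only illustrates one plausible shape: for each medium, the most frequently mentioned entities with their counts. The input layout, the `top_n` parameter and the output keys are all assumptions.

```python
from collections import Counter


def build_dashboard(entities_by_medium, top_n=10):
    """Rank the entities of each medium by number of mentions.

    entities_by_medium maps a medium name to a flat list of entity names,
    one list element per detected mention (hypothetical input layout).
    """
    dashboard = {}
    for medium, mentions in entities_by_medium.items():
        ranked = Counter(mentions).most_common(top_n)
        dashboard[medium] = [{"entity": name, "count": count} for name, count in ranked]
    return dashboard
```

The returned dict can be passed to `json.dump` to produce the file served to the front end.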

media_gather_files.py

This script opens all the daily JSON files containing the detected entities and gathers their data into one large file, '3months.json', for each medium. This new file is used by media_graph.py to create the line chart data.
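
A possible way to perform this merge is sketched below. It assumes one directory per medium containing one file per day named after its date (for example `2020-01-01.json`), which is a guess about the layout rather than something stated above.

```python
import json
from pathlib import Path


def gather_daily_files(medium_dir, output_name="3months.json"):
    """Merge every daily JSON file of one medium into a single file keyed by date."""
    medium_dir = Path(medium_dir)
    gathered = {}
    for daily_file in sorted(medium_dir.glob("*.json")):
        if daily_file.name == output_name:
            continue  # do not re-read a previously gathered file
        with daily_file.open(encoding="utf-8") as handle:
            # The file stem (name without ".json") is used as the date key.
            gathered[daily_file.stem] = json.load(handle)
    with (medium_dir / output_name).open("w", encoding="utf-8") as handle:
        json.dump(gathered, handle, ensure_ascii=False)
    return gathered
```

Skipping the output file inside the loop keeps the function safe to run repeatedly in the same directory.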

media_graph.py

When requested, this script gathers and sends the data available for a given entity. The front end uses it to draw the line chart in the dashboard when the user clicks on an entity.
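
To make the idea concrete, here is a small sketch that turns gathered data into line chart points for one entity. It assumes the gathered data maps each date to a dict of entity counts; both that layout and the output keys (`date`, `count`) are hypothetical.

```python
def entity_series(gathered, entity):
    """Return one {date, count} point per day, sorted by date, for one entity.

    Days on which the entity was not detected get a count of 0, so the
    line chart has no gaps.
    """
    return [
        {"date": day, "count": counts.get(entity, 0)}
        for day, counts in sorted(gathered.items())
    ]


# Hypothetical gathered data for one medium.
EXAMPLE_GATHERED = {
    "2020-01-02": {"Paris": 1},
    "2020-01-01": {"Paris": 2, "Europe": 1},
}
```

ISO-formatted date strings sort chronologically, which is why a plain `sorted` on the keys is enough here.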

media_cache.py

This script calls media_graph.py for each of the first n entities displayed in the dashboard, so that the line chart loads faster when the user clicks on one of them. The cache is not prepared for all entities, in order to limit server costs.
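
The trade-off described above can be sketched as a partial cache: precompute the chart data for the first n entities and compute the rest on demand. The function names and the `compute` callback (standing in for the call to media_graph.py) are assumptions for illustration.

```python
def build_cache(entities, n, compute):
    """Precompute chart data for the first n entities only."""
    return {entity: compute(entity) for entity in entities[:n]}


def get_series(entity, cache, compute):
    """Serve chart data from the cache when possible, else compute it on demand."""
    if entity in cache:
        return cache[entity]
    return compute(entity)
```

A larger n makes more clicks fast at the price of more precomputation, which is the cost that the limit on cached entities keeps in check.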
