I work at the intersection of computational social science and journalism to tell stories with the help of data. My research focuses on natural language processing applied to investigative journalism.
- 💻 Data analysis at The Examination
- 🚀 NLP enthusiast
- 📌 Always interested in collaborating on data-driven projects
- 📫 How to reach me: feraguirre@riseup.net

Repository | Description |
---|---|
ai4foia | Proof-of-concept to recommend recipients for FOIA requests |
hackathon-somos-nlp-2023 | Fine-tuning LLMs for detecting hate speech categories in Spanish |
customized-headlines | Proof-of-concept to create customized headlines from news content based on demographic data |
explained-recommendations | API for a recommendation system that explains its recommendations using generative AI |
opportunities-db | Scraper to extract data from opportunity-related websites (e.g. funds, scholarships) and convert it into structured data |
cij-argentina | Scraper to convert PDF files from the CIJ website in Argentina into structured data |
pdf-2-ner | Web application to convert scanned PDF files to text-based data and apply Named Entity Recognition (NER) to extract entities in Spanish |
ner-spanish | Named Entity Recognition (NER) for extracting entities from Spanish text |
pmdm | Fine-tuned pre-trained language model that detects hate speech against women in Spanish and Portuguese |
attackdetector | Research on hate speech on Twitter against journalists and environmental activists in Mexico and Brazil |
topicos-discursos-amlo | Topic modeling analysis of AMLO's speeches |
bad-bunny | Analysis of Bad Bunny's songs |

Repository | Description |
---|---|
travesticidios-argentina | Data analysis of court decisions on transvesticides in Argentina from 2018 to 2023 |
elecciones-argentina-2023 | Data analysis of attacks against journalists on X during the 2023 elections in Argentina |
recomendaciones-escritoras | Recommendation system for Latin American women writers |
cancilleria-colombia | Data analysis of public servants at the Ministry of Foreign Affairs in Colombia |
gptzero-ai-articles | Data analysis of articles about ChatGPT that were written with generative AI models |
capir-transfronteriza2-2023 | Data analysis and topic modeling of anti-rights groups from Brazil, Ecuador and Colombia |
migrantes-desaparecidos-eeu | Data analysis of missing migrants en route to the U.S. |
covid19-venezuela | Data analysis of COVID-19 deaths in Venezuela |
comision-revision-bolivia | Map showing the rate of femicides in Bolivia per 100,000 women from 2013 to 2020 |
escritoras-latinas | Web scraping of Wikipedia entries for Latin American women writers and network graph visualization |
violencia-obstetrica-cuba | Data analysis of obstetric violence in Cuba |
wifi-gratuito-cdmx | Map showing locations of free public Wi-Fi in Mexico City |

Repository | Description |
---|---|
cookiecutter-data-analysis | My personal cookiecutter template for data analysis |
cookiecutter-data-journalism | A cookiecutter template for data journalism projects using Python |

Repository | Description |
---|---|
taller-cookiecutter | Workshop on how to create project templates for data analysis |
taller-python | Jupyter notebooks for learning the basics of Python |
maps-examples | Collection of map examples |
twitter-python | Examples for Twitter data collection with Tweepy in Python [ARCHIVED] |
learn-python | Collection of Python scripts organized by topics |
learn-react-d3 | Examples for data visualization with React and D3.js |
learn-scrollama | Examples for scrollytelling with scrollama |

Repository | Description |
---|---|
data-annotator | Web application for text-based data labeling [ARCHIVED] |
mapa-huertos | Map with locations of urban gardens in Mexico City [ARCHIVED] |
directorix-disidente | Digital directory of professions to build networks among the queer community of Mexico City [ARCHIVED] |