# news-crawler

Here are 30 public repositories matching this topic...

fhamborg / news-please

news-please - an integrated web crawler and information extractor for news that just works

Updated May 15, 2024
Python

adbar / trafilatura

Python & command-line tool to gather text on the Web: web crawling/scraping, extraction of text, metadata, comments

Updated May 24, 2024
Python

lumyjuwon / KoreaNewsCrawler

A Korean news crawler built to ingest large amounts of news data.

crawler scrapy-crawler news-crawler newscrawler koreanewscrawler

Updated Apr 27, 2024
Python

flairNLP / fundus

A very simple news crawler with a funny name

python nlp rss sitemap crawler scraper corpus text-extraction web-scraping news-crawler commoncrawl web-corpus news-scraping cc-news

Updated May 23, 2024
Python

lewisdonovan / google-news-scraper

Lightweight scraper for Google News

crawler news web-crawler web-scraper news-articles news-crawler google-news google-news-scraper news-scraper google-crawler

Updated May 13, 2024
JavaScript

LuChang-CS / news-crawler

A news crawler for BBC News, Reuters and New York Times.

crawler bbc reuters news-crawler nytimes

Updated Dec 8, 2022
Python

stardust95 / NewsFeeds

News feed website using Node.js as the server and MongoDB as the storage backend, including a simple recommendation system that suggests news based on user behavior.

nodejs mongodb recommendation-system newsfeed microsoft-cognitive-services news-crawler

Updated Sep 7, 2020
HTML

divkakwani / webcorpus

Generate large textual corpora for almost any language by crawling the web

multilingual nlp datasets news-crawler indic-languages nlp-datasets

Updated Nov 18, 2021
Python

nploi / news_crawler

News crawler is a tool that helps you crawl data from a news site.

python crawler news scrapy news-crawler vietnam-crawl

Updated Oct 24, 2019
Python

atulyakumar97 / news-sentiment-analysis

A spider that crawls moneycontrol.com and economictimes.com to fetch news about the input companies, then scores and classifies those companies to raise an early warning signal.

python crawler spider sentiment-analysis webscraper desktop-application classification scrapy data-analysis pyinstaller ews news-crawler early-warning-systems news-sentiment-analysis

Updated Aug 11, 2019

MoritzGoeckel / NodeJsNewsCrawler

📰 Search engine for news in Node.js

nodejs redis analysis rest-api cheerio facebook-bot expressjs facebook-graph-api web-frontend news-sources facebook-page news-crawler headline newsbot

Updated May 23, 2017
JavaScript

SecondDim / crawler-news

A crawler built with Python and Scrapy for real-time Taiwan news websites.

mysql python docker crawler circleci news database docker-compose taiwan scrapy news-crawler python-scrapy taiwan-news-website

Updated May 23, 2023
Python

johnbumgarner / newshound

This Python package can be used to systematically extract multiple data elements (e.g., title, keywords, text) from news sources around the world in over 50 languages.

data-science text-mining data-mining news news-aggregator python3 datascience web-scraping data-extraction webscraping news-crawler article-extracting article-extractor newspaper-crawler python-newspaper

Updated Mar 14, 2023

karimhabush / TheguardianScrapper

A Scrapy webscraper that can scrape and store articles of theguardian.com

scraper scrapy news-crawler

Updated May 1, 2023
Python

siristechnology / news-crawler

Config-based news crawler using Google Puppeteer

javascript chromium news-crawler puppeteer

Updated May 15, 2021
JavaScript

thinh-vu / vnnews

A Python package that helps capture news updates from top Vietnamese news sites

sentiment-analysis investment news-crawler vietnamese-language

Updated Apr 22, 2023
Jupyter Notebook

santhoshse7en / Alcoholics-Anonymous

Research project analysing public knowledge about Alcoholics Anonymous

python crawler web-scraping anonymous bs4 news-crawler data-extraction-and-pre-processing google-search-using-python the-hindu without-api aa-meetings newspaper3k alcoholics alcoholics-anonymous

Updated Nov 23, 2019
Jupyter Notebook

AndyTheFactory / article-extraction-dataset

Article title, authors, date and body extraction dataset.

text-mining news html-to-markdown scraping corpus news-aggregator text-extraction dataset web-scraping readability datasets scraping-websites html2text news-crawler corpus-builder corpus-tools article-extractor text-cleaning text-preprocessing

Updated Mar 26, 2024
HTML

fingeredman / teanaps-web-scraper

Provides web scraping tools for collecting data for text analysis.

python text-mining data-mining web-crawler selenium requests web-scraping beautifulsoup appstore playstore web-crawling news-crawler web-crawler-python naver-cafe naver-news news-scrapper naver-movie teanaps news-scrapping

Updated Sep 18, 2022
Jupyter Notebook

luhuadong / newscraper

🐞 A general news information crawler.

python crawler scrapy news-crawler news-scraper

Updated Mar 21, 2024
