Document Categorization using graph structuring

The paper is available here.

This project explores document categorization using a graph-based approach. The data used in this project can be downloaded here.

Dataset description

The dataset consists of 2225 documents from the BBC News website, corresponding to stories in five topical areas from 2004-2005. Class labels: 5 (business, entertainment, politics, sport, tech)
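As a rough illustration of how the documents could be read into memory, here is a minimal sketch. It assumes the downloaded archive extracts to one folder per class, each holding plain .txt files; that layout, the function name `load_documents`, and the `root` argument are assumptions for this example and are not taken from the project code.

```python
from pathlib import Path

# The five class labels listed in the dataset description.
CLASSES = ["business", "entertainment", "politics", "sport", "tech"]

def load_documents(root):
    """Return (texts, labels) for every .txt file under root/<class>/.

    Assumes a one-folder-per-class layout (hypothetical, check your copy).
    """
    texts, labels = [], []
    for label in CLASSES:
        folder = Path(root, label)
        if not folder.is_dir():
            continue  # skip classes that are missing from this copy
        for path in sorted(folder.glob("*.txt")):
            # latin-1 can decode any byte sequence, so stray non-UTF-8
            # bytes in a news file will not raise a decode error.
            texts.append(path.read_text(encoding="latin-1"))
            labels.append(label)
    return texts, labels
```

Calling `load_documents("bbc")` (with "bbc" standing in for wherever the data was extracted) would return two parallel lists, which is a convenient shape for a later train/test split.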

Text Files

There are five text files, including basic_tech.txt, basic_business.txt, basic_poltics.txt, and basic_sports.txt.

They contain the five most important words for each class.
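The keyword files could be read with something like the sketch below. It assumes the words in each file are separated by whitespace (spaces or newlines); the actual file format is not documented here, and `read_keywords` is a hypothetical helper name, not a function from this repository.

```python
from pathlib import Path

def read_keywords(path):
    """Return the words in a keyword file, lowercased, in file order.

    Assumes whitespace-separated words (an assumption about the format).
    """
    text = Path(path).read_text(encoding="utf-8")
    # split() with no arguments treats spaces, tabs, and newlines alike.
    return [word.lower() for word in text.split()]
```

For example, `read_keywords("basic_tech.txt")` would give the list of important words for the tech class, provided the file follows the assumed format.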

This is the flowchart

Usage

To run the code on the test set:

  1. Load the files in load_all manually in Spyder. PS: We are trying to provide a better method 😛