Classifying-20-news-group-dataset-using-support-vector-machine and NLP

This repository contains code for classifying the 20 Newsgroups dataset using a support vector machine (SVM) and natural language processing (NLP).
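As a rough illustration of the approach, the sketch below wires a text vectorizer to an SVM with scikit-learn. It is not taken from the notebook: the TF-IDF features, the linear kernel, `C=1.0`, and the function names `build_text_classifier` and `train_on_20newsgroups` are all assumptions chosen for the example.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC


def build_text_classifier():
    """Return an untrained text -> label pipeline (hypothetical helper).

    Assumption: TF-IDF features and a linear-kernel SVC; the notebook
    may use different features or hyperparameters.
    """
    return make_pipeline(
        TfidfVectorizer(stop_words="english"),
        SVC(kernel="linear", C=1.0),
    )


def train_on_20newsgroups(categories=None):
    """Fit on the 20 Newsgroups training split and score on the test split.

    fetch_20newsgroups downloads the dataset on first use, so this
    function needs network access the first time it runs.
    """
    from sklearn.datasets import fetch_20newsgroups

    train = fetch_20newsgroups(subset="train", categories=categories)
    test = fetch_20newsgroups(subset="test", categories=categories)
    model = build_text_classifier()
    model.fit(train.data, train.target)
    # Pipeline.score delegates to SVC.score, which returns mean accuracy.
    return model, model.score(test.data, test.target)
```

Calling `train_on_20newsgroups(categories=["sci.space", "rec.sport.hockey"])` would restrict training to two of the twenty groups, which keeps a first experiment quick.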

Overview

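We create a data-cleaning function using nltk. It removes non-alphabetic words (like abc1234) and characters, punctuation, and popular first names (like John and James, taken from nltk.corpus.names). Each remaining word is then lemmatized using WordNetLemmatizer. The intent is to map related forms to one base form ({close, closely, closed, closer} => close), although the exact result depends on the part of speech passed to the lemmatizer.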
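The cleaning steps described above can be sketched as follows. This is a minimal version under stated assumptions, not the notebook's own function: the names `load_nltk_resources` and `clean_documents`, the whitespace tokenization, and the lowercasing step are hypothetical choices made for the example.

```python
import string
from typing import Callable, Iterable, List, Set, Tuple


def load_nltk_resources() -> Tuple[Set[str], Callable[[str], str]]:
    """Load the first-name list and a lemmatizer from NLTK (hypothetical helper)."""
    # Imported lazily so clean_documents can be used without NLTK data present.
    import nltk
    from nltk.corpus import names
    from nltk.stem import WordNetLemmatizer

    nltk.download("names", quiet=True)
    nltk.download("wordnet", quiet=True)
    all_names = {name.lower() for name in names.words()}
    # lemmatize(word) treats the word as a noun unless a pos argument is given.
    return all_names, WordNetLemmatizer().lemmatize


def clean_documents(
    docs: Iterable[str],
    names_to_drop: Set[str],
    lemmatize: Callable[[str], str],
) -> List[str]:
    """Drop non-alphabetic tokens and names, then lemmatize what remains."""
    cleaned = []
    for doc in docs:
        kept = []
        for token in doc.split():
            # Strip leading/trailing punctuation; lowercasing is an assumption.
            word = token.strip(string.punctuation).lower()
            # isalpha() rejects empty strings and mixed tokens like "abc1234".
            if word.isalpha() and word not in names_to_drop:
                kept.append(lemmatize(word))
        cleaned.append(" ".join(kept))
    return cleaned
```

A typical call would be `all_names, lemmatize = load_nltk_resources()` followed by `clean_documents(raw_texts, all_names, lemmatize)`, where `raw_texts` is a list of newsgroup posts. Passing the name set and lemmatizer as arguments keeps the cleaning logic easy to try with stand-ins.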

Dependencies

nltk
scikit-learn (imported as sklearn)
numpy

Usage

Open Support+Vector+Machine.ipynb in Jupyter Notebook and run its cells in your browser.

Rohit9314/Classifying-20-news-group-dataset-using-support-vector-machine
