Wordcorrector

A Python module that finds the closest matching words for a given input, based on a dictionary of words.

Note: The dictionary currently in use was generated from movie review data.

Features

A model can be generated from any list of words (English words, names, places, products, you name it)
Fast, simple and clean
Uses a fuzzy matching method
Takes into account the relative usage frequency of words
The number of suggestions can be varied
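
To illustrate the idea of fuzzy matching combined with word frequency, here is a minimal sketch. It is not the module's actual implementation: it uses the standard library's difflib as a stand-in for the matching method, and the function name suggest, its parameters n and cutoff, and the sample frequencies are all assumptions made for this example.

```python
import difflib


def suggest(word, freq, n=3, cutoff=0.6):
    """Return up to n words from freq that resemble word (hypothetical helper)."""
    total = float(sum(freq.values()))
    scored = []
    for candidate, count in freq.items():
        # Similarity in [0, 1] from difflib; a stand-in for the real matcher.
        similarity = difflib.SequenceMatcher(None, word, candidate).ratio()
        if similarity >= cutoff:
            # Sort key: similarity first, relative frequency breaks ties.
            scored.append((similarity, count / total, candidate))
    scored.sort(reverse=True)
    return [candidate for _, _, candidate in scored[:n]]


# Made-up frequencies, in the same word -> count shape as the JSON source file.
sample_freq = {"movie": 50, "move": 20, "moving": 10, "film": 40}
matches = suggest("movei", sample_freq)
print(matches)
```

Changing n varies the number of suggestions returned, and raising cutoff makes the matching stricter.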

Usage

Install the dependencies (listed under Dependencies below)
Run word_corrector.py
Input a word when prompted. The program will display a list of the top matches.

Building a new dataset

1. Currently, the dictionary of words is built from NLTK's movie review data.
2. To use another list of words, create a JSON file in the process/source/ folder in the following format:

  {
    "word1" : 1,
    "word2" : 2
  }

That is, a set of key-value pairs where each word is a key and its frequency/importance is the value. Order does not matter. See the movie_review_data.json file for reference.
3. Open process/make_model.py, specify the name of the source file created above in the main function, and run the file.
4. The word_corrector program will now use the new dataset.
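
The JSON source file described above could be produced from a plain list of words along these lines. This is only a sketch: the helper name build_source, the file name city_names.json, and the sample words are hypothetical, and lowercasing the words is an assumption rather than something the project requires.

```python
import json
import os
from collections import Counter


def build_source(words, path):
    """Count the words and write them to path as a word -> frequency JSON object."""
    # Lowercasing and skipping non-alphabetic tokens are assumptions for this sketch.
    counts = Counter(w.lower() for w in words if w.isalpha())
    folder = os.path.dirname(path)
    if folder and not os.path.isdir(folder):
        os.makedirs(folder)
    with open(path, "w") as f:
        json.dump(counts, f, indent=2)
    return counts


# Hypothetical input and file name; the folder is the one named in the steps above.
words = ["Paris", "London", "Paris", "Tokyo", "london", "Paris"]
out_path = os.path.join("process", "source", "city_names.json")
counts = build_source(words, out_path)
```

The resulting file name is what would then be specified in the main function of process/make_model.py.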

Dependencies

Both Python 2 and Python 3 are supported.
numpy: http://docs.scipy.org/doc/numpy-1.10.1/user/install.html
sklearn: Not required to run the program; needed only if you want to generate a new dataset model. http://scikit-learn.org/stable/install.html
