mishal23/spell-check

Spell-Checker

  • Inspired by the spelling checkers found in search engines, office packages and many other tools, this is an attempt to implement a spelling corrector in Erlang.
  • In 2007, Peter Norvig (Director of Research at Google Inc) released a toy spelling corrector in Python (only 21 lines) that achieves 80 to 90% accuracy at a processing speed of at least 10 words per second, in about half a page of code.
  • He wrote it after two friends, Dean and Bill, were amazed at Google's spelling correction and, despite being highly accomplished engineers and mathematicians, did not have good intuitions about how the process works.

Implementation

  • The reference vocabulary comes from big.txt, which contains about a million words (the same file Norvig used in his spelling corrector).
  • All the words in big.txt are split out and saved as a list.
  • A second list of candidate words is built by applying four edit functions to the input word (deletion_edits, transposition_edits, alteration_edits, insertion_edits).
  • The candidate list is then filtered against the word list from big.txt, and the candidates that appear in both lists are returned.
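
The steps above can be sketched roughly as follows. This is a minimal illustration and not the repository's actual code: the module name spell_sketch, the helper splits/1, the argument shapes of the four edit functions, and the two-argument known/2 (which takes the dictionary as a plain list instead of reading big.txt) are all assumptions made for the example.

```erlang
-module(spell_sketch).
-export([words/1, edits1/1, known/2]).

%% Lowercase a text and extract its alphabetic words as a list of strings.
words(Text) ->
    case re:run(string:lowercase(Text), "[a-z]+",
                [global, {capture, first, list}]) of
        {match, Matches} -> [W || [W] <- Matches];
        nomatch -> []
    end.

%% Every way of cutting Word into a {Left, Right} pair.
splits(Word) ->
    [lists:split(N, Word) || N <- lists:seq(0, length(Word))].

alphabet() -> lists:seq($a, $z).

%% Remove one letter.
deletion_edits(Splits) ->
    [L ++ T || {L, [_ | T]} <- Splits].

%% Swap two adjacent letters.
transposition_edits(Splits) ->
    [L ++ [B, A | T] || {L, [A, B | T]} <- Splits].

%% Replace one letter with another.
alteration_edits(Splits) ->
    [L ++ [C | T] || {L, [_ | T]} <- Splits, C <- alphabet()].

%% Insert one letter at any position.
insertion_edits(Splits) ->
    [L ++ [C | R] || {L, R} <- Splits, C <- alphabet()].

%% All distinct strings one edit away from Word, sorted.
edits1(Word) ->
    S = splits(Word),
    lists:usort(deletion_edits(S) ++ transposition_edits(S) ++
                alteration_edits(S) ++ insertion_edits(S)).

%% Keep only the candidates that occur in Dictionary (a list of words).
known(Word, Dictionary) ->
    Dict = sets:from_list(Dictionary),
    [W || W <- edits1(Word), sets:is_element(W, Dict)].
```

With a toy dictionary, spell_sketch:known("helo", ["hello","help","world","held"]) keeps the three words that are one edit away from "helo" and drops "world". Putting the dictionary in a set keeps each lookup cheap, which matters because a single word generates several hundred candidates.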

Steps to Run

  • Fork the repository, clone it, and then open the Erlang shell.
  • Change the directory to the cloned repository.
  • Compile the module.
  • Pass a word in double quotes to check:known and look at the recommendations it returns.
  • On my system, the session in the Erlang shell looks like this:
1> cd("C:/Users/Mishal Shah/Desktop/Erlang").
C:/Users/Mishal Shah/Desktop/Erlang
ok
2> c(check).                                  
{ok,check}
3> check:known("helo").
Did you mean?
["felo","halo","held","hell","hello","helm","help","hero"]
4>  check:known("seach").
Did you mean?
["beach","each","reach","search","teach"]
5>  check:known("somthing").
Did you mean?
["something","soothing"]

Timer

  • Each time listed is the average of 6 outputs of the timer function.
Word      3rd release (s)  2nd release (s)  1st release (s)
somthing  3.09             4.525            13.3
seach     3.05             4.46             9.5
helo      2.8              4.5              8.2
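
An average like the ones above could be collected with a small helper such as the one below. It is only a sketch: the README does not say which timer function was used, so the use of timer:tc/1 is an assumption, and the module name timing_sketch and the function average_seconds/2 are hypothetical.

```erlang
-module(timing_sketch).
-export([average_seconds/2]).

%% Run Fun the given number of times and return the mean elapsed
%% time in seconds. timer:tc/1 reports microseconds, hence the division.
average_seconds(Fun, Runs) when is_function(Fun, 0), Runs > 0 ->
    Micros = [element(1, timer:tc(Fun)) || _ <- lists:seq(1, Runs)],
    lists:sum(Micros) / Runs / 1000000.
```

For example, timing_sketch:average_seconds(fun() -> check:known("helo") end, 6) would time six calls to the compiled check module and return their mean in seconds.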

Future Scope

  • Improve run-time speed.
  • Improve accuracy.
  • Extend the spell-checker to handle input of more than one word.

License

FootNotes

  • Norvig's original post here
