Frequent Phrase Extraction: This module extracts the most frequently occurring phrases in a corpus, using rule-based NLP extraction. All corpus processing is done out of main memory.

yardstick17/extract_phrase

Join the chat at https://gitter.im/yardstick17/extract_phrase

How to get it:

Setup:
# cd to the repository root directory
bash setup.sh
Command to use:
PYTHONPATH='.' python nlp/main.py --input_file test_input.txt --output_file=test_output.txt
> [2017-07-07 08:34:19,275] INFO : Evaluating file: test_input.txt for extracting frequent tags
> [2017-07-07 08:34:19,473] INFO : Got total 87 frequent phrases.
> [2017-07-07 08:34:19,473] INFO : Frequent phrases:[('mushroom duplex', 4), ('bar exchange', 4), ('vapou bar grill', 3), ('list serving great food', 3), ('palak patta chaat', 3)]
> [2017-07-07 08:40:20,486] INFO : Output file: test_output.txt is written with most frequent phrases updated
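
For readers who want a feel for what rule-based frequent phrase extraction involves, here is a minimal sketch. It is not the code in this repository: the stop-word rule, the `STOP_WORDS` list, and the function names `candidate_phrases` and `frequent_phrases` are all hypothetical, chosen only for illustration. It also assumes that "out of the main memory" means the corpus is read line by line rather than loaded whole.

```python
import re
from collections import Counter
from typing import Iterable, Iterator, List, Tuple

# Hypothetical stop-word list; a real rule set would be much larger.
STOP_WORDS = frozenset({
    "a", "an", "the", "and", "or", "of", "in", "on", "at", "to", "for",
    "with", "is", "are", "was", "were", "it", "this", "that", "we", "i",
})

# Punctuation ends a phrase, so each line is first cut into segments.
SEGMENT_RE = re.compile(r"[^\w\s]+")


def candidate_phrases(line: str, min_words: int = 2,
                      max_words: int = 4) -> Iterator[str]:
    """Yield runs of consecutive non-stop-words as candidate phrases."""
    for segment in SEGMENT_RE.split(line.lower()):
        run: List[str] = []
        # The trailing "" is a sentinel that flushes the last run.
        for token in segment.split() + [""]:
            if not token or token in STOP_WORDS:
                if min_words <= len(run) <= max_words:
                    yield " ".join(run)
                run = []
            else:
                run.append(token)


def frequent_phrases(lines: Iterable[str],
                     top_n: int = 5) -> List[Tuple[str, int]]:
    """Count candidate phrases one line at a time and return the top_n."""
    counts: Counter = Counter()
    for line in lines:  # only the current line and the counts are held
        counts.update(candidate_phrases(line))
    return counts.most_common(top_n)


if __name__ == "__main__":
    sample = [
        "The mushroom duplex was great.",
        "The mushroom duplex and the palak patta chaat were tasty.",
        "Palak patta chaat is a must!",
        "Try the mushroom duplex.",
    ]
    print(frequent_phrases(sample, top_n=2))
    # -> [('mushroom duplex', 3), ('palak patta chaat', 2)]
```

Because `frequent_phrases` accepts any iterable of lines, an open file handle can be passed directly, so the corpus itself never has to fit in memory. The phrase counts are still kept in memory in this sketch.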
