Rajspeaks/Machine-Learning-approach-to-Bengali-POS-Tagging-using-NLTK

Machine Learning approach to Bengali POS Tagging using NLTK on Indian-Corpus

The Indian corpus bundled with NLTK contains data in four Indian languages: Bengali, Hindi, Marathi, and Telugu. NLTK is the Natural Language Toolkit, a Python library for natural language processing.

Methodology

  • Imported NLTK (Natural Language Toolkit).
  • Imported the Indian corpus from NLTK.
  • Selected the Bengali data file 'bangla.pos' from the Indian corpus.
  • Stored 'bangla.pos' in the variable 'tagged_set'.
  • Stored the Bengali sentences from the corpus in the variable 'word_set'.
  • Used a for loop to count the number of sentences in the corpus.
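
The steps above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: it assumes that 'tagged_set' holds the file name 'bangla.pos' and that 'word_set' is built with `indian.sents`. The helper names `load_bangla_sentences` and `count_sentences` are hypothetical and only serve to keep the example self-contained.

```python
def load_bangla_sentences():
    """Load the Bengali sentences from the NLTK Indian corpus.

    Assumption: 'tagged_set' is the file name and 'word_set' the sentences.
    """
    import nltk
    from nltk.corpus import indian

    nltk.download("indian", quiet=True)  # fetch the corpus data if it is missing
    tagged_set = "bangla.pos"            # Bengali file inside the Indian corpus
    word_set = indian.sents(tagged_set)  # each sentence is a list of words
    return tagged_set, word_set


def count_sentences(sentences):
    """Count sentences with an explicit for loop, as in the last step."""
    count = 0
    for _ in sentences:
        count += 1
    return count
```

Calling `count_sentences(load_bangla_sentences()[1])` would then give the number of Bengali sentences. If the (word, tag) pairs are needed for tagging, `indian.tagged_sents('bangla.pos')` returns the same sentences with their POS tags.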

Tools & Library requirements:

  • Google Colab/Jupyter
  • Language: Python
  • NLTK Library

Mentor:

Prof. Sandipan Ganguly

Developer:

Rajdeep Das

Reference:

Click here to read the source article.