#

multinoulli-distribution

Here are 2 public repositories matching this topic...

Classifies emails as spam or non-spam (ham) using several probability distributions. Uses the NLTK library to tokenize the text and to remove stop words and non-alphabetic characters. Computes the mean, variance, and other parameters for each word, conditioned on the label (spam or ham).

  • Updated Dec 5, 2023
  • Jupyter Notebook
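
The per-word statistics described above can be sketched roughly as follows. This is a minimal illustration, not the repository's code: the function names `tokenize` and `word_stats`, the tiny `STOP_WORDS` set (standing in for NLTK's stop-word list), and the sample emails are all assumptions made for the example.

```python
import re
from collections import defaultdict
from statistics import mean, pvariance

# Hypothetical stop-word list; the repository description says NLTK is used instead.
STOP_WORDS = {"a", "an", "the", "to", "is", "you", "for", "at", "and"}

def tokenize(text):
    """Lowercase, keep alphabetic runs only, and drop stop words."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOP_WORDS]

def word_stats(emails):
    """Return {label: {word: (mean, variance)}} of per-email word counts."""
    by_label = defaultdict(list)
    for text, label in emails:
        by_label[label].append(tokenize(text))
    # Shared vocabulary so every label has an entry for every word.
    vocab = {w for docs in by_label.values() for doc in docs for w in doc}
    stats = {}
    for label, docs in by_label.items():
        stats[label] = {}
        for w in vocab:
            counts = [doc.count(w) for doc in docs]
            # Population variance of the word's count across this label's emails.
            stats[label][w] = (mean(counts), pvariance(counts))
    return stats

# Made-up sample data for illustration only.
emails = [
    ("Win a free prize now, free entry!", "spam"),
    ("Free offer: claim the prize", "spam"),
    ("Meeting at noon to review the report", "ham"),
    ("Lunch at noon?", "ham"),
]
stats = word_stats(emails)
```

Statistics like these could then parameterize a per-label distribution for each word (for example a Gaussian from the mean and variance), which is one plausible reading of "different probability distributions" in the description.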

Classifies emails as spam or non-spam (ham) using several probability distributions. Uses the NLTK library to tokenize the text and to remove stop words and non-alphabetic characters. Computes the mean, variance, and other parameters for each word, conditioned on the label (spam or ham).

  • Updated Dec 5, 2023
  • Jupyter Notebook
