brianspiering/bayesian-text

Naive Bayes for Text Classification

This is an introduction to using Naive Bayes for text classification. We will learn how to code a Naive Bayes classifier that assigns text documents to categories, for example deciding whether a news article is "sports" or "business".

What You’ll Learn:

  • What Bayes' Theorem is and why it's useful
  • How the Naive Bayes algorithm follows from Bayes' Theorem
  • How to build a Naive Bayes algorithm to classify text
  • How to evaluate our classifier’s performance
  • Which resources to use to keep developing your skills

We’ll be doing hands-on coding in Python. You’ll leave this session equipped to apply this classification technique to other text datasets.
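To give a flavor of what that looks like, here is a minimal sketch of a Naive Bayes text classifier written with only the Python standard library. It is not the workshop's own code: the tiny training set, the whitespace tokenizer, and the function names `tokenize`, `train`, `predict`, and `accuracy` are all assumptions made up for illustration. It uses word counts with add-one (Laplace) smoothing and compares classes by log probability.

```python
import math
from collections import Counter, defaultdict

# Tiny made-up training set (hypothetical examples, not the workshop's data).
TRAIN = [
    ("the team won the game", "sports"),
    ("the striker scored a late goal", "sports"),
    ("the coach praised the players", "sports"),
    ("the company reported strong profits", "business"),
    ("shares fell after the merger news", "business"),
    ("the bank raised interest rates", "business"),
]


def tokenize(text):
    """Lowercase and split on whitespace; real pipelines usually do more."""
    return text.lower().split()


def train(examples):
    """Count documents per class and word occurrences per class."""
    class_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in examples:
        class_counts[label] += 1
        for word in tokenize(text):
            word_counts[label][word] += 1
            vocab.add(word)
    return class_counts, word_counts, vocab


def predict(text, model):
    """Return the class with the highest log P(class) + sum of log P(word | class)."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    scores = {}
    for label in class_counts:
        # Prior: fraction of training documents in this class.
        score = math.log(class_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for word in tokenize(text):
            if word not in vocab:
                continue  # skip words never seen in training
            # Likelihood with add-one (Laplace) smoothing.
            count = word_counts[label][word] + 1
            score += math.log(count / (total_words + len(vocab)))
        scores[label] = score
    return max(scores, key=scores.get)


def accuracy(examples, model):
    """Fraction of examples whose predicted label matches the true label."""
    correct = sum(predict(text, model) == label for text, label in examples)
    return correct / len(examples)


model = train(TRAIN)
print(predict("the players won the game", model))
print(predict("profits rose after the merger", model))
```

The "naive" part is the assumption that words are independent given the class, which is what lets the per-word log probabilities simply be added. Calling `accuracy(TRAIN, model)` scores the classifier on its own training data, which is optimistic; a fair evaluation would use documents held out from training.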

Prerequisites:

This is intended for beginner to intermediate data science students: people who have done a little machine learning before and want to add Natural Language Processing to their data science toolbox. It does not require in-depth knowledge of statistics or probability.

If you have never seen Python before, never fear! Beginners are welcome to come to listen, learn, and observe. If you've already got some familiarity with Python, you’ll get more out of the workshop! Participants who are familiar with concepts of data analysis, statistics, and probability will be better equipped to apply their skills after the conclusion of this meetup.

Setup

  • Binder Jupyter Notebooks in a Docker container
  • Colaboratory Jupyter Notebooks on Google Drive
  • Local setup via Anaconda

About Me

Dr. Brian Spiering is a Professor of Computer Science at the University of San Francisco and a freelance consultant. He teaches humans the languages of computers (primarily Python) and teaches computers the languages of humans (through Natural Language Processing and Artificial Intelligence). He is active in the San Francisco tech community through volunteering and mentoring.

About

A hands-on workshop applying Naive Bayes to text classification
