Project work as part of the E0-334 Deep Learning for Natural Language Processing course at IISc, Bengaluru. We had proposed a graph-based model for text classification.
Updated Sep 28, 2021 - Python
Generic script for cleaning a text corpus, built with the tm package
This repository contains notebooks that explore the t-SNE algorithm by applying it to various datasets
Naive Bayes classifier and Boolean retrieval on the 20 Newsgroups dataset, written from scratch. Extremely lightweight and produces decent results. Classification using word embeddings is in progress.
Classified human- and machine-generated text using (1) a single-score threshold classifier and (2) a neural network classifier, based on perplexities and probability scores generated from n-grams. Best results are 77% for the single-score classifier and 80% for the ANN classifier.
This project offers advanced techniques in text preprocessing, word embeddings, and text classification. Explore methods like Word2Vec and GloVe, and master Multinomial Naive Bayes for accurate predictions. Dive into the world of text clustering and conquer challenges like unbalanced data.
K-means and SOM (self-organizing map) clustering on the 20 Newsgroups dataset
Assignment 2 – Dimensionality reduction and text classification: converted news text into a machine readable representation, reduced the dimensions of the text representation and trained classifiers to decide which of 20 news groups a sample belongs to.
NLP Topic Modeling Techniques (LDA, LSA & BERTopic)
Implemented a Naive Bayes text classifier for the 20 Newsgroups dataset
FEMDA: Robust classification with Flexible Discriminant Analysis in heterogeneous data. Flexible EM-Inspired Discriminant Analysis is a robust supervised classification algorithm that performs well in noisy and contaminated datasets.
This project generates sentences using n-grams
This repository contains code for our project work as part of the E0-334 Deep Learning for Natural Language Processing course at IISc, Bengaluru. We had proposed a graph-based model for text classification.
Cluster labelling using Wikipedia search
Parse documents, compute pairwise similarity matrices, then train and test a KNN classifier
Experimentation with novelty detection
Some hidden knowledge found in the 20 Newsgroups dataset
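Several of the projects listed above describe Naive Bayes text classification written from scratch. As a rough illustration of that general technique (not the code of any listed repository), here is a minimal sketch of multinomial Naive Bayes with add-one smoothing. The function names `train_naive_bayes` and `predict`, the whitespace tokenizer, and the four-sentence toy corpus are all assumptions made for this example; only the two label strings are borrowed from real 20 Newsgroups group names.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(docs, labels):
    """Count documents per class and words per class (hypothetical helper)."""
    class_counts = Counter(labels)
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in zip(docs, labels):
        tokens = text.lower().split()  # naive whitespace tokenizer (assumption)
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return class_counts, word_counts, vocab


def predict(model, text):
    """Return the class with the highest log posterior under add-one smoothing."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, n_docs in class_counts.items():
        score = math.log(n_docs / total_docs)  # log prior
        total_words = sum(word_counts[label].values())
        for token in text.lower().split():
            if token not in vocab:
                continue  # ignore words never seen in training
            count = word_counts[label][token]
            score += math.log((count + 1) / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Toy corpus invented for illustration; it is not taken from the dataset.
docs = [
    "the team won the hockey game",
    "the goalie made a great save",
    "the shuttle reached orbit today",
    "nasa launched a new rocket",
]
labels = ["rec.sport.hockey", "rec.sport.hockey", "sci.space", "sci.space"]

model = train_naive_bayes(docs, labels)
print(predict(model, "rocket launch to orbit"))
print(predict(model, "hockey game tonight"))
```

A real experiment would swap the toy corpus for the actual dataset and use a more careful tokenizer; working in log space, as here, avoids numeric underflow on long documents.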