Arabic is spoken in many countries, but each country has its own dialect. This project aims to build a model that predicts the dialect of a given text.
Using text analytics to understand cultural patterns in philosophical texts. Exploring gender, author, region, and time-period differences, and extracting key philosophical concepts.
Predict emotions (happiness, anger, sadness) from WhatsApp chat data using machine learning and deep learning models. Includes text normalization, vectorization (TF-IDF, BoW, Word2Vec, GloVe), and model evaluation.
Information Retrieval is the process of accessing relevant information from data sources using techniques like indexing and ranking. It is crucial for search engines, databases, and digital libraries for efficient information access.
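As a rough illustration of the indexing and ranking mentioned above, here is a minimal sketch of an inverted index with a simple match-count ranking. The function names `build_index` and `search`, the whitespace tokenizer, and the sample documents are all assumptions made for this example, not code from the repository.

```python
from collections import defaultdict

def build_index(docs):
    """Map each lowercase token to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in enumerate(docs):
        for token in text.lower().split():
            index[token].add(doc_id)
    return index

def search(index, query):
    """Rank documents by how many distinct query tokens they contain."""
    scores = defaultdict(int)
    for token in set(query.lower().split()):
        for doc_id in index.get(token, ()):
            scores[doc_id] += 1
    # Highest score first; ties broken by document id for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))

# Hypothetical sample documents.
docs = ["the cat sat on the mat", "dogs chase the cat", "birds sing at dawn"]
index = build_index(docs)
ranked = search(index, "cat mat")
```

Real systems typically replace the match count with a weighted score such as TF-IDF or BM25, but the index-then-rank shape stays the same.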
Data and code for the machine learning exam assignment of MA Digital Text Analysis (2023).
A book recommendation system based on book rating data from the GoodReads_100k dataset, which contains 100k books.
Implemented online learning algorithms which enable model updates with new data without full retraining.
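The description does not say which online algorithms were implemented, so the following is only a sketch of the general idea using a perceptron, one of the simplest online learners. The class name `OnlinePerceptron` and the sample stream are assumptions for illustration. Each call to `update` adjusts the model from a single example, with no retraining on earlier data.

```python
class OnlinePerceptron:
    """Binary classifier with labels in {-1, +1}, updated one example at a time."""

    def __init__(self, n_features, lr=1.0):
        self.w = [0.0] * n_features
        self.b = 0.0
        self.lr = lr

    def predict(self, x):
        score = sum(wi * xi for wi, xi in zip(self.w, x)) + self.b
        return 1 if score > 0 else -1

    def update(self, x, y):
        """Adjust the weights only when the current model misclassifies (x, y)."""
        if self.predict(x) != y:
            self.w = [wi + self.lr * y * xi for wi, xi in zip(self.w, x)]
            self.b += self.lr * y

# Hypothetical data stream: examples arrive one by one.
model = OnlinePerceptron(n_features=2)
stream = [([2.0, 1.0], 1), ([-1.0, -2.0], -1), ([1.5, 0.5], 1), ([-2.0, -1.0], -1)]
for x, y in stream:
    model.update(x, y)
```

New examples can keep arriving after this loop, and each one costs a single `update` call.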
Implementation of the Term Frequency-Inverse Document Frequency (TF-IDF) algorithm from scratch in two different ways, accompanied by text generation methods. TF-IDF is a widely used technique in natural language processing and information retrieval that represents the importance of a term within a document relative to a collection of documents.
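A minimal from-scratch sketch of the weighting described above, assuming the plain variant tf(t, d) = count / document length and idf(t) = ln(N / df(t)). The repository's two implementations are not shown here and may differ; libraries also often use smoothed variants. The function name `tf_idf` and the sample documents are assumptions for this example.

```python
import math
from collections import Counter

def tf_idf(docs):
    """Return one {term: weight} dict per document."""
    tokenized = [d.lower().split() for d in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents containing each term.
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        vectors.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return vectors

# Hypothetical sample documents.
docs = ["spam spam offer", "meeting at noon", "offer ends at noon"]
vectors = tf_idf(docs)
```

In the first document, "spam" is frequent locally and appears nowhere else, so it gets a higher weight than "offer", which also occurs in the third document. A term that appears in every document gets weight zero under this variant.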
"Is it about money?" Prompt-powered, ML-based money sentiment detection.
ML model for spam detection using Naive Bayes and TF-IDF, achieving 0.98 accuracy. Built with scikit-learn, NumPy, and NLTK.
Movie review classification with a TF-IDF vectorizer and SVC, given a set of text movie reviews labeled negative or positive.
Various projects employing a range of natural language processing techniques to generate insight: tokenization, stemming, a TF-IDF vectorizer, sentiment analysis, and latent Dirichlet allocation.
A simple Django-based resume ranker website where recruiters post jobs and candidates apply for their desired vacancies. The system computes the document similarity between the job description and the candidate resumes, generates similarity scores using a KNN model, and ranks or shortlists the resumes.
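The similarity-and-rank step can be sketched as follows. This example swaps in plain cosine similarity over term counts rather than the KNN model the project describes, and the names `cosine` and `rank_resumes` plus the sample texts are assumptions for illustration only.

```python
import math
from collections import Counter

def cosine(a, b):
    """Cosine similarity between two term-count mappings."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def rank_resumes(job_description, resumes):
    """Return (score, resume_index) pairs, most similar first."""
    job_vec = Counter(job_description.lower().split())
    scored = [
        (cosine(job_vec, Counter(text.lower().split())), i)
        for i, text in enumerate(resumes)
    ]
    return sorted(scored, key=lambda s: (-s[0], s[1]))

# Hypothetical job description and resumes.
job = "python developer with django experience"
resumes = [
    "java developer with spring experience",
    "python django developer",
    "graphic designer",
]
ranking = rank_resumes(job, resumes)
```

A shortlist is then just the first few entries of `ranking`, or every entry above a chosen score threshold.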
Implementation & analysis of various NLP techniques in Python: 4 projects on tokenization, text classification, sequence labeling, and more
Building a basic spam classifier with a TF-IDF vectorizer and a Naive Bayes model.
A spam classifier is a software or machine learning model that categorizes incoming messages or content as either "spam" (unwanted or irrelevant) or "ham" (legitimate or relevant), using automated techniques.
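One common automated technique for this task is multinomial Naive Bayes. The sketch below assumes word counts with add-one (Laplace) smoothing and a whitespace tokenizer; the function names `train` and `classify` and the tiny training set are hypothetical, and this is not the code of any repository listed here.

```python
import math
from collections import Counter

LABELS = ("spam", "ham")

def train(messages):
    """messages: list of (text, label) pairs with label in LABELS."""
    word_counts = {label: Counter() for label in LABELS}
    label_counts = Counter()
    for text, label in messages:
        label_counts[label] += 1
        word_counts[label].update(text.lower().split())
    vocab = set(word_counts["spam"]) | set(word_counts["ham"])
    return word_counts, label_counts, vocab

def classify(model, text):
    """Return the label with the highest log-probability score."""
    word_counts, label_counts, vocab = model
    total_messages = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label in LABELS:
        # Log prior plus smoothed log likelihood of each known token.
        score = math.log(label_counts[label] / total_messages)
        denom = sum(word_counts[label].values()) + len(vocab)
        for token in text.lower().split():
            if token in vocab:  # tokens never seen in training are skipped
                score += math.log((word_counts[label][token] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Hypothetical training data; both labels must be present.
model = train([
    ("win a free prize now", "spam"),
    ("free offer click now", "spam"),
    ("lunch at noon tomorrow", "ham"),
    ("see you at the meeting", "ham"),
])
```

For example, `classify(model, "free prize")` scores higher under the spam counts, while `classify(model, "meeting at noon")` scores higher under ham.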
A stacked LSTM to categorize textual news feeds.