A search engine / data processor on two datasets from ir-datasets.com
Updated Jun 3, 2024 - Jupyter Notebook
This repository contains deep learning projects. The code for each project is provided, and explanations can be found in each project's README.md file.
The task involves developing a system capable of translating text from Arabic to English, serving as a tool to facilitate understanding and communication between Arabic speakers and English speakers.
NLP
The project researches sentiment analysis on Twitter, with the goal of classifying comments as positive, negative, or neutral. Using word embeddings, an advanced method in natural language processing, our model achieved a high accuracy of 96.61%. The model was trained on Twitter data and tested on a comment dataset from Binance.
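One common way to classify sentiment with word embeddings is to average the vectors of a comment's words and feed the result to a linear classifier. The sketch below illustrates the idea with a hypothetical two-dimensional embedding table and scikit-learn; a real system would load pretrained vectors (e.g. GloVe or Word2Vec) and far more training data.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Hypothetical toy embeddings; a real pipeline would load pretrained vectors.
emb = {
    "good": np.array([1.0, 0.2]), "great": np.array([0.9, 0.1]),
    "bad": np.array([-1.0, 0.1]), "awful": np.array([-0.9, 0.3]),
}

def embed(text):
    """Represent a text as the mean of its known word vectors."""
    vecs = [emb[w] for w in text.lower().split() if w in emb]
    return np.mean(vecs, axis=0) if vecs else np.zeros(2)

train_texts = ["good great", "great good good", "bad awful", "awful bad bad"]
train_labels = [1, 1, 0, 0]  # 1 = positive, 0 = negative
X = np.vstack([embed(t) for t in train_texts])
clf = LogisticRegression().fit(X, train_labels)

print(clf.predict([embed("good"), embed("awful")]))
```

Mean pooling is crude but fast; the averaged vector preserves enough of the embedding geometry for a linear model to separate clearly polar comments.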
Explore text classification with Logistic Regression and Naive Bayes models. Implemented from scratch, the project compares feature engineering techniques such as Bag-of-Words, TF-IDF, and word embeddings for accurate labeling.
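The kind of comparison this project describes can be sketched in a few lines with scikit-learn: swap the vectorizer (Bag-of-Words vs. TF-IDF) and the model (Naive Bayes vs. Logistic Regression) over the same toy corpus. The documents and labels below are illustrative stand-ins, not the project's data.

```python
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.linear_model import LogisticRegression

docs = ["cheap meds buy now", "limited offer buy cheap",
        "meeting agenda attached", "project status report attached"]
labels = [1, 1, 0, 0]  # 1 = spam, 0 = ham (toy labels)

# Cross each feature representation with each classifier.
for name, vectorizer in [("Bag-of-Words", CountVectorizer()),
                         ("TF-IDF", TfidfVectorizer())]:
    X = vectorizer.fit_transform(docs)
    for model in (MultinomialNB(), LogisticRegression()):
        model.fit(X, labels)
        pred = model.predict(vectorizer.transform(["buy cheap meds"]))[0]
        print(f"{name} + {type(model).__name__}: {pred}")
```

On real data, each vectorizer/model pair would be scored on a held-out split rather than a single probe query.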
👚 Speedy Word Embedding Association Test & Extras using R
Arabic word embedding models (SkipGram and GloVe) are trained from scratch on the Arabic Wiki Dump 2018 dataset using the Gensim and GloVe Python libraries. The models are then evaluated on three NLP tasks, and their results are visualized with t-SNE.
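The t-SNE visualization step mentioned above typically projects the high-dimensional word vectors down to 2-D for plotting. Here is a minimal sketch with scikit-learn, using random vectors as a stand-in for the trained SkipGram/GloVe embeddings; the word list is hypothetical.

```python
import numpy as np
from sklearn.manifold import TSNE

# Stand-in for a trained embedding matrix: 6 words, 50 dimensions.
rng = np.random.default_rng(0)
words = ["king", "queen", "man", "woman", "paris", "cairo"]
vectors = rng.normal(size=(6, 50))

# Project to 2-D; perplexity must be smaller than the number of samples.
coords = TSNE(n_components=2, perplexity=2, random_state=0).fit_transform(vectors)
for w, (x, y) in zip(words, coords):
    print(f"{w}: ({x:.2f}, {y:.2f})")
```

With real embeddings, semantically related words (e.g. country/capital pairs) tend to land near each other in the 2-D plot.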
Compute Sentence Embeddings Fast!
Topic detection to identify the main topics on MIT management papers
Showcase of Natural Language Processing (NLP) for sentiment analysis of survey text
D3-Network Visualization with WordEmbedding Space
Name verification model using TensorFlow v2 and word embeddings that classifies an input name as real or fake with an accuracy of 99%
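The repository uses TensorFlow with a learned embedding layer; as a lighter-weight illustration of the same classification idea, the sketch below swaps in character n-gram counts and scikit-learn's Logistic Regression. The eight names and their labels are invented for the example.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression

# Tiny illustrative dataset (hypothetical); label 1 = real name, 0 = fake.
names = ["alice", "maria", "david", "sarah", "xqzpt", "bbbbb", "zzxqv", "qqqqq"]
labels = [1, 1, 1, 1, 0, 0, 0, 0]

# Character n-grams stand in for the learned character embedding layer.
vec = CountVectorizer(analyzer="char", ngram_range=(1, 2))
X = vec.fit_transform(names)
clf = LogisticRegression().fit(X, labels)

print(clf.predict(vec.transform(["anna", "xqqzv"])))
```

A TensorFlow version would map characters to trainable embedding vectors and pool them before a dense classification head; the n-gram version above just makes the plausible/implausible character statistics explicit.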
Polysemy Embedding - an iterative approach to sense-based embeddings
In this project, the authors propose using a contextual Word2Vec model to handle out-of-vocabulary (OOV) words. OOV candidates are extracted using left-right entropy and point information entropy. Word2Vec is used to construct the word vector space, and CBOW (continuous bag of words) is used to obtain the contextual information of the words.
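The left-right entropy criterion can be computed without any library: for a candidate string, measure the Shannon entropy of the characters appearing immediately to its left and right in the corpus; high entropy on both sides suggests the candidate is a free-standing word. This is a minimal character-level sketch of that idea, with an invented corpus; the project's actual extraction pipeline may differ in granularity and thresholds.

```python
import math
from collections import Counter

def neighbor_entropy(corpus, candidate):
    """Shannon entropy (bits) of the characters adjacent to `candidate`.
    Returns (left_entropy, right_entropy)."""
    left, right = Counter(), Counter()
    start = corpus.find(candidate)
    while start != -1:
        if start > 0:
            left[corpus[start - 1]] += 1
        end = start + len(candidate)
        if end < len(corpus):
            right[corpus[end]] += 1
        start = corpus.find(candidate, start + 1)

    def entropy(counter):
        total = sum(counter.values())
        if total == 0:
            return 0.0
        return -sum(c / total * math.log2(c / total) for c in counter.values())

    return entropy(left), entropy(right)

# "abc" occurs four times, each with different neighbours -> high entropy.
corpus = "xabcy zabcw mabcq abck"
print(neighbor_entropy(corpus, "abc"))
```

A candidate whose neighbours are always the same character (low entropy) is likely a fragment of a longer word rather than a word boundary.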
This is the final project of the Information Retrieval course: an implementation of a search engine.
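The core of such a search engine can be sketched as TF-IDF vectors plus cosine-similarity ranking. The three documents below are invented placeholders; a full implementation would add an inverted index, tokenization/stemming, and a much larger collection.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

docs = [
    "information retrieval ranks documents by relevance",
    "word embeddings map words to dense vectors",
    "inverted indexes speed up document retrieval",
]
vec = TfidfVectorizer()
doc_matrix = vec.fit_transform(docs)

def search(query, k=2):
    """Return the indices of the top-k documents by cosine similarity."""
    sims = cosine_similarity(vec.transform([query]), doc_matrix)[0]
    return sorted(range(len(docs)), key=lambda i: -sims[i])[:k]

print(search("document retrieval"))
```

TF-IDF downweights terms that occur in many documents, so rare query terms dominate the ranking; that is why the document containing both query words ranks first here.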
A Jupyter Notebook containing methods for common tasks related to the field of Natural Language Processing
We have implemented, expanded, and reviewed the paper “Sense2Vec - A Fast and Accurate Method for Word Sense Disambiguation in Neural Word Embeddings” by Andrew Trask, Phil Michalak, and John Liu.
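Sense2Vec's central trick is to disambiguate tokens before training: each word is tagged with a sense label (such as its part of speech), so each sense receives its own embedding. The sketch below shows just that preprocessing step with hand-written tags; a real pipeline would obtain tags from a tagger such as spaCy and then train Word2Vec on the tagged corpus.

```python
# Hand-tagged sentences (hypothetical tagger output).
tagged = [
    [("I", "PRON"), ("bank", "VERB"), ("the", "DET"), ("plane", "NOUN")],
    [("the", "DET"), ("bank", "NOUN"), ("holds", "VERB"), ("money", "NOUN")],
]

# Fuse each word with its tag so distinct senses become distinct tokens.
corpus = [[f"{w}|{t}" for w, t in sent] for sent in tagged]
vocab = sorted({tok for sent in corpus for tok in sent})
print(vocab)  # "bank|NOUN" and "bank|VERB" are now separate embedding targets
```

After this step, a standard Word2Vec run over `corpus` yields one vector per sense rather than one conflated vector per surface form.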