azizbarank/README.md

Hey there 👋, I'm Aziz Baran 👨‍💻

I'm a Linguistics graduate pursuing a master's in Machine Learning and Natural Language Processing.

I'm passionate about continually improving my skills in Data Science and Machine Learning, with the aim of bringing effective solutions to real-world business problems.

During my master's studies, I realized I would enjoy leveraging AI to drive business impact in BI environments. So, alongside my studies, I'm learning the widely used BI tools Power BI and Tableau to extract insights from data and support business decision-making.

In my spare time, I write posts about my personal experience in NLP and publish them on my GitHub profile under the NLP Tutorials repository. Very soon, I'm going to add the dashboards I have built so far with Tableau and Power BI as well.

🛠️ Skills

Languages

Python

Business Intelligence

Power BI Tableau

Natural Language Processing

Hugging Face Transformers

Machine Learning

scikit-learn NumPy Pandas

IDEs & Notebooks

Visual Studio Code Jupyter Google Colab Jupyter Notebook

Other Technologies & Tools

Anaconda GitHub Microsoft Excel

📃 Projects

  • Turkish Sentiment Analyser - Hugging Face - Web App

    Fine-tuned the distilled Turkish BERT model on a review classification dataset for sentiment analysis. The final model achieved 86% accuracy and was deployed to Hugging Face Spaces using Streamlit as an interactive web app. The app provides a no-code way for people to see whether a particular review is "positive" or "negative".

  • Toxic Comment Detector - Web App

    Binary classification project to predict whether a comment is toxic or not. Three machine learning models were compared: Multinomial Naive Bayes, Logistic Regression, and Support Vector Machine. The best model was a Naive Bayes classifier with a TF-IDF vectorizer, with F1 and recall scores of 0.85 and 0.88, respectively. The application uses this model to predict the toxicity of comments.

  • cst5 - Hugging Face

    cst5 is a tiny T5 model for the Czech language, based on a reduced version of Google's mT5 model. It is meant to help people run experiments on Czech with a lightweight model rather than the massive mT5, which covers 101 languages. cst5 was obtained by retaining only the Czech and English embeddings of the mT5 model: shrinking the original "sentencepiece" vocabulary from 250K to 30K tokens reduced the parameter count from 582M to 244M and the total size from 2.2GB to 0.9GB. cst5 thus allows fine-tuning on downstream Czech tasks with a smaller footprint and without any loss in quality relative to the original multilingual model.

  • Financial Sentiment Analysis with Machine Learning, LSTM, and BERT Transformer

    Financial sentiment analysis project to predict whether a given financial text is positive, negative, or neutral. Classical machine learning models, an LSTM, and a BERT transformer were compared. BERT gave the best result, with an accuracy of 0.77.
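
The F1 and recall scores reported for the Toxic Comment Detector also imply a precision, because F1 is the harmonic mean of precision and recall. Below is a minimal sketch of that arithmetic in plain Python. It assumes both scores refer to the same class and are rounded to two decimals, so the result is approximate. The function names are illustrative and are not taken from the project.

```python
def precision_from_f1_recall(f1: float, recall: float) -> float:
    """Solve F1 = 2PR / (P + R) for the precision P."""
    denominator = 2 * recall - f1
    if denominator <= 0:
        raise ValueError("F1 must be smaller than 2 * recall")
    return f1 * recall / denominator


def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Scores reported above for the Naive Bayes + TF-IDF model.
reported_f1, reported_recall = 0.85, 0.88

implied_precision = precision_from_f1_recall(reported_f1, reported_recall)
print(f"implied precision: {implied_precision:.2f}")

# Round trip: plugging the precision back in recovers the reported F1.
print(f"recomputed F1:     {f1_score(implied_precision, reported_recall):.2f}")
```

Under these assumptions the implied precision is about 0.82, so the model favours catching toxic comments (recall) slightly over avoiding false alarms (precision).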

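The parameter reduction reported for cst5 is consistent with trimming only the embedding tables. The rough check below relies on two figures that are not stated above and should be read as assumptions: a hidden size of 768 (as in google/mt5-base) and two separate, untied embedding matrices (input embeddings and output head). The vocabulary sizes are the rounded 250K and 30K from the description, so the estimate is approximate.

```python
HIDDEN_SIZE = 768        # assumption: hidden size of google/mt5-base
EMBEDDING_MATRICES = 2   # assumption: untied input embeddings and output head


def embedding_params(vocab_size: int) -> int:
    """Number of parameters held in the embedding matrices."""
    return vocab_size * HIDDEN_SIZE * EMBEDDING_MATRICES


# Rounded vocabulary sizes from the project description.
original_vocab, reduced_vocab = 250_000, 30_000

saved = embedding_params(original_vocab) - embedding_params(reduced_vocab)
reported_saving = 582_000_000 - 244_000_000  # parameter counts quoted above

print(f"estimated saving: {saved / 1e6:.0f}M parameters")
print(f"reported saving:  {reported_saving / 1e6:.0f}M parameters")
```

Both figures come out at roughly 338M, which suggests that almost all of the removed parameters were embedding rows for unused tokens, while the transformer layers were left untouched.
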
💻 My Posts about NLP

Pinned

  1. azizbarank.github.io Public

    My personal website where I share my NLP experience through blog posts.

    HTML

  2. Turkish-Sentiment-Analyser Public

    This project fine-tunes the distilled Turkish BERT model on a review dataset for doing sentiment analysis. After the fine-tuning, Hugging Face Spaces and Streamlit are used to deploy the final mode…

    Jupyter Notebook

  3. distilroberta-base-sst-2-distilled Public

    Using Task Specific Knowledge Distillation to obtain DistilRoBERTa model fine-tuned on SST-2 part of the GLUE dataset for sentiment analysis.

    Jupyter Notebook 1

  4. Toxic-Comment-Detector Public

    This project applies classification models with the aim of automating the detection of toxic comments on social media. After choosing the model with the best performance, HuggingFace + Streamlit ar…

    Jupyter Notebook 1

  5. Czech-T5-Base-Model Public

    This is the t5 base model for the Czech that is based on the smaller version of the google/mt5-base model. To make this model, I retained only the Czech and some of the English embeddings from the …

    Jupyter Notebook 2

  6. Financial-Sentiment-Analysis-with-Machine-Learning-LSTM-and-BERT-Transformer Public

    This project applies three main methods to make sentiment analysis on financial data: Machine Learning, LSTM using TensorFlow with Keras API, and BERT Transformer using the "simpletransformers" lib…

    Jupyter Notebook