NitzanBarzilay/README.md

💫 About Me

🔭 NLP geek, Computer Science MSc candidate researching NLP (multi-document summarization) at HebrewU, and Data Scientist at PayPal
💬 Feel free to ask me about any of my repos; I love getting messages about my work! LinkedIn Twitter
☕ If my code or my notes helped you, you can buy me a coffee on Ko-Fi if you'd like

💻 Tech Stack

Python NumPy Pandas Plotly scikit-learn MySQL Postgres C++ Java

๐Ÿ‘ฉโ€๐Ÿ’ป NLP / Data Science Public Projects

Corpify (2023)

A language model that rephrases spoken English into workplace-appropriate language!

In this project, we introduce the novel NLP task of corpy textual style-transfer, which involves the transformation of casual English text into a style suited for a professional workplace setting. We constructed an original parallel corpus comprising 634 sentences in casual English and their corporate-style paraphrases.

This project includes the dataset itself, the code for fine-tuning the style transfer models, two of the best-performing fine-tuned models, and code for fine-tuning a style detection model that detects corpy style in text.

Methods used in this NLP project: textual style transfer, text classification.

Python


KnessetTopicClassification

An independent multi-phase NLP project for classifying parliamentary quotes in Hebrew into 8 topics. Also includes the annotated dataset.

In this project, I started with a raw dataset of quotes (in Hebrew) gathered from protocols of the Knesset (the Israeli parliament). In the first phase, I used unsupervised topic modeling methods to cluster quotes by topic. The topic assignments created in this phase were used to prioritize quotes for manual tagging: quotes with the highest confidence scores were sent for manual tagging. This process produced ~2,700 quotes that were manually tagged into 8 topics (in addition to a "no topic" tag). Then, in the second phase, I trained a supervised classifier to predict quote topics.

Methods used in this NLP project: topic modeling (unsupervised), topic classification (supervised).
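The prioritization step described above (sending the most confident topic assignments to manual tagging first) can be illustrated with a minimal sketch. This is not the project's actual code: the function name `prioritize_for_tagging`, the `TopicGuess` type, and the toy probabilities are all hypothetical, and it assumes the topic model yields one probability distribution per quote.

```python
from typing import NamedTuple


class TopicGuess(NamedTuple):
    quote_id: int
    topic: int         # index of the most probable topic
    confidence: float  # probability assigned to that topic


def prioritize_for_tagging(topic_probs: dict[int, list[float]], budget: int) -> list[TopicGuess]:
    """Return the `budget` quotes whose top topic has the highest probability."""
    guesses = []
    for quote_id, probs in topic_probs.items():
        best_topic = max(range(len(probs)), key=probs.__getitem__)
        guesses.append(TopicGuess(quote_id, best_topic, probs[best_topic]))
    # Most confident assignments first; ties broken by quote id for determinism.
    guesses.sort(key=lambda g: (-g.confidence, g.quote_id))
    return guesses[:budget]


# Hypothetical per-quote topic distributions from an unsupervised topic model.
toy_probs = {
    101: [0.10, 0.85, 0.05],
    102: [0.40, 0.35, 0.25],
    103: [0.05, 0.05, 0.90],
    104: [0.60, 0.30, 0.10],
}
queue = prioritize_for_tagging(toy_probs, budget=2)
print([(g.quote_id, g.topic) for g in queue])  # [(103, 2), (101, 1)]
```

Quotes 103 and 101 come first because their top topic has the highest probability; the suggested topic can then be shown to the annotator as a starting point.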

Python


PickUsLunch

AI assistant that helps groups of friends or co-workers find a restaurant to order from together, that best matches the group members' dining preferences.

In this project, we used restaurant menus gathered via Wolt's API and created a smart system that helps groups of friends or co-workers find a single restaurant that matches everyone's needs and preferences (such as vegetarianism, price limits, preferred cuisines, etc.). We examined several different algorithms (none of them ML-based), all of which provided solutions that were incredibly close to the optimal solution (which could be found by iterating over the entire dataset of 30M combinations) in a fraction of the time (up to 11K times faster)!

Methods used in this AI project: local search, genetic algorithms.
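To give a feel for the local search idea, here is a minimal hill-climbing sketch on a toy problem. It is an illustration only, not the project's implementation: the menu, the group members, the scoring weights, and the names `score` and `hill_climb` are all hypothetical. A solution assigns one dish to each member, and the search repeatedly applies the best single-dish change until nothing improves; the result is compared against an exhaustive scan.

```python
from itertools import product

# Hypothetical toy menu: (dish name, price, is_vegetarian).
MENU = [
    ("falafel", 30, True),
    ("shawarma", 45, False),
    ("salad", 35, True),
    ("steak", 90, False),
]

# Hypothetical group members: (price limit, requires vegetarian, favorite dish).
MEMBERS = [
    (50, True, "salad"),
    (100, False, "steak"),
    (40, False, "shawarma"),
]


def score(order: tuple[int, ...]) -> int:
    """Higher is better: reward favorites, penalize violated constraints."""
    total = 0
    for (limit, needs_veg, favorite), dish_idx in zip(MEMBERS, order):
        name, price, is_veg = MENU[dish_idx]
        total += 10 if name == favorite else 0
        total -= 20 if price > limit else 0
        total -= 50 if needs_veg and not is_veg else 0
    return total


def hill_climb(start: tuple[int, ...]) -> tuple[int, ...]:
    """Move to the best single-dish change until none improves the score."""
    current = start
    while True:
        neighbors = [
            current[:i] + (d,) + current[i + 1:]
            for i in range(len(current))
            for d in range(len(MENU))
            if d != current[i]
        ]
        best = max(neighbors, key=score)
        if score(best) <= score(current):
            return current  # local optimum reached
        current = best


found = hill_climb(start=(1, 1, 1))  # everyone starts with shawarma
optimum = max(product(range(len(MENU)), repeat=len(MEMBERS)), key=score)
print(found, score(found), "exhaustive best:", score(optimum))
```

The exhaustive scan here covers only 64 combinations, so it is cheap; on a space of millions of combinations, a search that evaluates only a handful of neighbors per step is where the time savings would come from.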

Python


📄 Detailed NLP Notes and Resources (Hebrew)

I have collected all of the detailed notes I wrote during my studies at HebrewU as well as courses I studied independently, and published them as a part of my goal to make Data Science & NLP topics more accessible to Hebrew speakers.

This collection contains detailed notes in Hebrew on subjects such as Math (Calculus, Linear Algebra, Probability, Discrete Math), foundations of Computer Science (Data Structures, Algorithms, Complexity), as well as advanced Data Science (Machine Learning, NLP).

This includes my recent detailed notes (90 pages) for Stanford's CS224N (NLP with DL) course, which gained more than 1K likes across Israeli DS & ML communities and were featured in the MDLI newsletter as "If you need to read only one post this week, make it this one".

Recently I decided to share my private Notion hub where I organize all of my NLP knowledge (mostly in Hebrew). This hub is meant for my personal use, but since many people found it useful I decided to share it. It contains some notes by topic (such as NLP tasks, architectures or uses) that have significant overlaps with my CS224N notes, as well as notes I wrote for a few dozen NLP papers I have read in the past year for my studies and my research.


If you want to use this resource, it is highly recommended to download Notion Enhancer and enable its right-to-left feature, since Notion currently does not support RTL.

I shared my simple-but-useful system for queueing and reviewing papers I read (or plan to read). This Notion template is free to use, and also contains tips on how to personalize it to work for your needs.

Pinned

  1. Notes (Public)

    Detailed notes of various Computer Science courses - mostly courses from HebrewU but also a variety of other courses I studied independently.

    8

  2. KnessetTopicClassification (Public)

    A project for building a classification model that classifies quotes from Knesset session protocols into eight topics.

    Jupyter Notebook 7

  3. PickUsLunch (Public)

    AI assistant that helps groups of friends or co-workers find a restaurant to order from together, that best matches the group members' dining preferences.

    Python 3

  4. The_Chase_Risk_Analysis (Public)

    Analysis of the participants in 3 seasons of the Israeli TV game show "The Chase".

    Jupyter Notebook 1