BahasaRojakSentimentAnalysis 😸😑😾

Handling out-of-vocabulary (OOV) words in Bahasa Rojak (Malaysian code-mixed language) and performing sentiment analysis with a fine-tuned cross-lingual model, XLM-RoBERTa (XLM-T).

The Jupyter notebooks cover:

Text Preprocessing
Model Fine Tuning
New Data Inference Pipeline

Further project resources are available here: https://drive.google.com/drive/folders/12Uir9KE4B1VL6oQWdj2BWvCUZOC0vWa2

Ablation Settings:

Preprocessing Method	Model 1 (V1)	Model 2 (V2)	Model 3 (V3)	Model 4 (V4)
Remove URLs	✔	✔	✔	✔
Convert to Lowercase	✔	✔	✔	-
Remove Punctuation	✔	✔	✔	-
Remove Irregular Spaces	✔	✔	✔	✔
Handle OOV	✔	✔	✔	✔
Remove Stopwords	✔	✔	-	-
Chinese Character Segmentation	-	✔	✔	-
Remove Rare Words	-	-	✔	-
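
The toggles in the table above can be expressed as a small configurable pipeline. The following is a minimal sketch, not the code from the project notebooks: `PreprocessConfig`, `preprocess`, `STOPWORDS`, and `OOV_MAP` are hypothetical names, the stopword and OOV entries are placeholder examples, and Chinese segmentation is assumed here to mean putting spaces around each CJK character. Rare-word removal (V3 only) is left out because it needs corpus-level frequency counts.

```python
import re
import string
from dataclasses import dataclass

URL_RE = re.compile(r"https?://\S+|www\.\S+")
CJK_RE = re.compile(r"([\u4e00-\u9fff])")  # common CJK ideographs

# Placeholder entries for illustration only (assumptions, not the project's lists).
STOPWORDS = {"the", "a", "yang", "dan"}
OOV_MAP = {"x": "tidak", "sgt": "sangat"}


@dataclass(frozen=True)
class PreprocessConfig:
    """Steps that differ between the ablation settings."""
    lowercase: bool = True
    remove_punctuation: bool = True
    remove_stopwords: bool = False
    segment_chinese: bool = False


def preprocess(text: str, cfg: PreprocessConfig) -> str:
    # URL removal runs first so punctuation stripping cannot break URLs apart.
    text = URL_RE.sub(" ", text)
    if cfg.lowercase:
        text = text.lower()
    if cfg.remove_punctuation:
        # string.punctuation covers ASCII punctuation only.
        text = text.translate(str.maketrans("", "", string.punctuation))
    if cfg.segment_chinese:
        text = CJK_RE.sub(r" \1 ", text)
    # Splitting and re-joining collapses irregular whitespace.
    tokens = text.split()
    # OOV handling is sketched as a simple dictionary lookup (an assumption).
    tokens = [OOV_MAP.get(t.lower(), t) for t in tokens]
    if cfg.remove_stopwords:
        tokens = [t for t in tokens if t.lower() not in STOPWORDS]
    return " ".join(tokens)


# Settings mirroring the columns for V1, V2 and V4 in the table.
V1 = PreprocessConfig(remove_stopwords=True)
V2 = PreprocessConfig(remove_stopwords=True, segment_chinese=True)
V4 = PreprocessConfig(lowercase=False, remove_punctuation=False)
```

With these settings, `preprocess("Makan  sgt sedap!! https://t.co/abc", V4)` keeps casing and punctuation, while the same call with `V1` strips both.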

Model Results:

Model	Precision (class 0)	Precision (class 1)	Recall (class 0)	Recall (class 1)	F1-Score (class 0)	F1-Score (class 1)	Accuracy
Model V1	0.716	0.830	0.840	0.702	0.773	0.760	0.767
Model V2	0.768	0.771	0.735	0.801	0.751	0.786	0.770
Model V3	0.794	0.703	0.691	0.802	0.739	0.749	0.744
Model V4	0.861	0.833	0.802	0.884	0.831	0.858	0.845
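
As a sanity check on the table, F1 is the harmonic mean of precision and recall, so each F1 entry can be recomputed from the two columns before it. The sketch below copies the per-class figures from the table; `f1_score`, `REPORTED`, and `max_f1_gap` are hypothetical names introduced for illustration, and the columns labeled 0 and 1 are assumed to be the two sentiment classes.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F1) per class, copied from the results table.
REPORTED = {
    "V1": {0: (0.716, 0.840, 0.773), 1: (0.830, 0.702, 0.760)},
    "V2": {0: (0.768, 0.735, 0.751), 1: (0.771, 0.801, 0.786)},
    "V3": {0: (0.794, 0.691, 0.739), 1: (0.703, 0.802, 0.749)},
    "V4": {0: (0.861, 0.802, 0.831), 1: (0.833, 0.884, 0.858)},
}


def max_f1_gap(reported: dict) -> float:
    """Largest absolute difference between recomputed and reported F1."""
    return max(
        abs(f1_score(p, r) - f1)
        for per_class in reported.values()
        for p, r, f1 in per_class.values()
    )


if __name__ == "__main__":
    for model, per_class in REPORTED.items():
        for label, (p, r, f1) in per_class.items():
            print(f"{model} class {label}: recomputed {f1_score(p, r):.3f}, reported {f1:.3f}")
```

The recomputed values differ from the reported ones by less than 0.001, which is consistent with rounding of the three-decimal inputs.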

Web Application for Testing the Sentiment Analysis Model (with Twitter Web Scraping):

Scrape tweets related to "britneyspears":

Inference Results:
