datquocnguyen/README.md

Dat Quoc Nguyen is a Senior Research Scientist and the Head of the Natural Language Processing department at VinAI Research, Vietnam. He was an Honorary Fellow in the School of Computing and Information Systems at the University of Melbourne, Australia, where he had previously been a Research Fellow. Before that, he received his Ph.D. in Computer Science from Macquarie University, Australia.

Dat Quoc Nguyen is the author of 70 peer-reviewed publications covering core NLP problems, ML methods for NLP, and their applications to low-resource languages and specific domains, with over 5000 citations and an h-index of 33 (Google Scholar). He has released many ML/NLP toolkits and datasets that are widely used in both academia and industry. He has also created large language models and other foundation models, including PhoGPT, PhoBERT, BARTpho, XPhoneBERT and BERTweet, with millions of downloads.

Pinned

  1. vncorenlp/VnCoreNLP (Public)

    A Vietnamese natural language processing toolkit (NAACL 2018)

    Java, 562 stars, 139 forks

  2. VinAIResearch/PhoBERT (Public)

    PhoBERT: Pre-trained language models for Vietnamese (EMNLP-2020 Findings)

    628 stars, 91 forks

  3. VinAIResearch/BERTweet (Public)

    BERTweet: A pre-trained language model for English Tweets (EMNLP-2020)

    Python, 561 stars, 51 forks

  4. VinAIResearch/PhoNLP (Public)

    PhoNLP: A BERT-based multi-task learning model for part-of-speech tagging, named entity recognition and dependency parsing (NAACL 2021)

    Python, 131 stars, 17 forks

  5. VinAIResearch/XPhoneBERT (Public)

    XPhoneBERT: A Pre-trained Multilingual Model for Phoneme Representations for Text-to-Speech (INTERSPEECH 2023)

    Python, 287 stars, 36 forks

  6. VinAIResearch/PhoGPT (Public)

    PhoGPT: Generative Pre-training for Vietnamese (2023)

    Python, 726 stars, 62 forks