
Curriculum Vitae

1. Who am I?

      

  • Name: Hyunwoong Ko (Kevin Ko)
  • Birth: 1995.09.12
  • Job: Software Engineer / AI Researcher

2. Skill Sets

  • Languages
    • Python: Excellent
    • Java: Excellent
    • C++: Working knowledge
  • Topics
    • Language model pre-training
    • Language model alignment
    • Language model optimization
    • Prompt programming
    • Korean language processing
    • Data processing and crawling
    • Distributed programming
    • DevOps / MLOps
    • Backend development
  • Libraries
    • Language model training: PyTorch, Transformers, Megatron-LM, DeepSpeed, TRL, and OpenRLHF
    • Language model evaluation: LM-evaluation-harness
    • Language model optimization: TorchScript, Triton Inference Server, ONNX, and TensorRT
    • DevOps: Docker, Kubernetes, ECS, EKS, GKE, GitHub Actions, and AWS CodePipeline
    • Backend development: Flask, FastAPI and Spring Boot

3. Biography

[2014.03 ~ 2020.08] BS in Software Engineering, Chonbuk National University

  1. GPA
    • 4.15 (major) / 4.07 (total)
    • Ranked 1st.
  2. Open-source projects
    • Transformer: PyTorch implementation of "Attention Is All You Need"
    • KoChat: The first Korean open-source chatbot framework
  3. Awards
  4. ETC

[2019.08 ~ 2020.08] Undergraduate Researcher at Autonomous Robot Lab, Chonbuk National University

  1. Research
  2. ETC

[2020.08 ~ 2021.02] Machine Learning Engineer at Kakao Brain

  1. BrainQA
    • I researched a Korean quiz-generation model.
    • The model was integrated into the Pororo library.
  2. Pororo
    • Pororo is an open-source multilingual NLP toolkit.
    • I developed almost all of the generative models in Pororo, such as Question Generation, Text Summarization, and Machine Translation.
  3. ETC
    • I hosted Jiphyeonjeon, a natural language processing paper review group.

[2021.03 ~ 2023.05] Co-Founder & Machine Learning Engineer at TUNiB

  1. Coco & Mas
    • Coco & Mas are Korean persona chatbots with dog personas.
    • We collected a Korean chatbot dataset with crowdworkers and tested crowdsourcing methods to improve data quality and yield.
    • I pre-trained a 1.3B-parameter Korean model to create these chatbots and fine-tuned it on the data we collected.
    • I researched the impact of pre-training and continual learning techniques.
    • I deployed these models with Triton Inference Server on an AWS ECS cluster.
  2. BLOONY
    • BLOONY is an English chatbot powered by OpenAI GPT-3.
    • I developed the backend server using Java Spring Boot.
    • I researched the instruction-following ability of OpenAI GPT-3 and developed an innovative prompting methodology. At the time there were no instruction fine-tuned models, so the models' instruction-following ability was poor. The methodology I devised became widely known a year later as chain-of-thought (CoT) prompting; a minimal illustration appears after this list.
    • I deployed the overall service on an AWS ECS cluster.
  3. TUNiBridge
    • TUNiBridge is a service that provides APIs for various natural language models.
    • I improved the safety-check module's throughput from 7 TPS to 240 TPS using NVIDIA TensorRT, Triton Inference Server, and AWS ECS.
    • The TUNiB N행시 (Korean acrostic poem) service was very popular in Korean online communities for about two weeks, receiving roughly 2 million requests per day. I improved the existing system so it could serve that traffic reliably.
    • I designed and developed the overall system and deployed 20+ APIs.
  4. Open-source projects
    • OpenChat: An easy-to-use open-source chatbot framework based on neural networks
    • Kss: The most popular Korean sentence segmentation toolkit.
    • Pecab: A pure-Python morpheme analyzer based on Mecab-ko-dic.
    • Large-scale LM Tutorials: Large-scale language modeling tutorials with PyTorch
    • Parallelformers: An easy-to-use transformer model deployment toolkit based on Hugging Face Transformers. Its core mechanism was integrated into Microsoft's DeepSpeed Inference.
  5. Awards
  6. ETC
    • I became a manager of Chatbot Korea, a Korean Facebook group for chatbot research.
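
The prompting methodology mentioned under BLOONY can be illustrated with a minimal sketch of chain-of-thought style few-shot prompting. Everything in the snippet (the example question, the template text, and the build_prompt helper) is hypothetical and is not the actual production prompt; it only shows the general pattern of prepending a worked example whose intermediate reasoning is spelled out, so that a completion-only model imitates "reason first, then answer" on a new question.

```python
# Minimal sketch of chain-of-thought style few-shot prompting (illustrative only).
# A worked example with explicit intermediate reasoning is prepended so that a
# completion-only model continues the same "Reasoning: ... A: ..." pattern.

FEW_SHOT_TEMPLATE = """\
Q: Plan a one-day trip to Busan on a small budget.
Reasoning: The user cares about cost, so prefer free attractions,
street food, and public transport, and order the stops geographically.
A: Start at Haeundae Beach, have lunch at Gukje Market, and finish
with the sunset over Gwangan Bridge.

Q: {question}
Reasoning:"""


def build_prompt(question: str) -> str:
    """Fill the few-shot template with the incoming user question."""
    return FEW_SHOT_TEMPLATE.format(question=question)


if __name__ == "__main__":
    # The completion model is expected to continue with its own reasoning,
    # then produce an "A: ..." answer for the new question.
    print(build_prompt("Recommend a rainy-day activity in Seoul."))
```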

[2022.02 ~ 2023.09] Lead Machine Learning Scientist at EleutherAI

  1. Polyglot
    • Polyglot is EleutherAI's multilingual modeling project.
    • We collected 1.2 TB of Korean text for language model pre-training.
    • We released 1B, 3B, 6B, and 13B models trained on this large-scale Korean dataset.
    • We published a technical report on Polyglot-Ko.
    • I managed all members of the Polyglot team and developed the dataset preprocessing, training, and evaluation pipelines.
  2. Japanese StableLM
    • We released Japanese StableLM models in collaboration with Stability AI Japan.
    • We released a 7B foundation language model and an instruction fine-tuned model.
  3. OSLO
    • OSLO is a framework that provides various GPU-based optimization techniques for large-scale modeling.
    • Its key features, such as 3D parallelism and kernel fusion, are useful when training large models.
    • I developed the tensor parallelism, pipeline parallelism, kernel fusion, and activation checkpointing engines.

[2023.05 ~ present] Machine Learning Researcher at Kakao Brain

  1. Foundation Model Pre-training
    • I pre-trained KoGPT2, the foundation model at Kakao Brain.
    • Among its various sizes, the largest exceeds 60B parameters.
  2. Long Context Fine-tuning
    • I conducted research to extend KoGPT2 for long contexts.
    • I tested various existing techniques and internal ideas.
    • I successfully extended the model's context length from 4k to 32k and 65k.
  3. Multimodal Fine-tuning
    • I conducted research on extending KoGPT2 to multimodal tasks, primarily image-and-text input with text output.
    • I combined vision encoders with auto-regressive language models to handle image inputs effectively.
  4. Codebase Management
    • I am managing various internal codebases, notably Megatron-LM for pre-training and OpenRLHF for alignment.
  5. ETC
    • I developed GKE (Google Kubernetes Engine) based evaluation pipelines.
    • I improved the data deduplication pipeline, doubling its speed.
    • I improved monitoring and planning methods for language model pre-training.

4. How to contact me?