junhua/IPOD

Industrial and Professional Occupations Dataset (IPOD)

License: CC BY 4.0

This repo includes:

  • A gazetteer of tokens and named-entity (NE) tags, annotated by 3 domain experts
  • A corpus of 475,085 job titles crawled from LinkedIn, with NE tags prefixed according to the BIOES scheme
  • Title2Vec, a pre-trained job title embedding fine-tuned from ELMo. A checkpoint is available for download.
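
For readers unfamiliar with BIOES prefixes, the following is a minimal sketch of how per-token entity labels map to BIOES tags (Begin, Inside, Outside, End, Single). It is an illustration only, not code from this repo: the function name `bioes_tags`, the example title, and the label names `RES` and `FUN` are assumptions chosen for the example, and the corpus files may be laid out differently.

```python
from itertools import groupby


def bioes_tags(labels):
    """Convert per-token entity labels (None = no entity) into BIOES tags.

    Consecutive tokens with the same label are treated as one entity span.
    """
    tags = []
    for label, run in groupby(labels):
        n = len(list(run))
        if label is None:
            # Tokens outside any entity
            tags.extend(["O"] * n)
        elif n == 1:
            # Single-token entity
            tags.append(f"S-{label}")
        else:
            # Multi-token entity: Begin, zero or more Inside, End
            tags.extend([f"B-{label}"] + [f"I-{label}"] * (n - 2) + [f"E-{label}"])
    return tags


# Hypothetical job title and labels, for illustration only
tokens = ["senior", "software", "engineer"]
labels = ["RES", "FUN", "FUN"]
print(list(zip(tokens, bioes_tags(labels))))
```

Note that this sketch merges adjacent tokens sharing a label into a single span, so two back-to-back entities of the same type would need explicit span boundaries instead.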

Citing IPOD

Please cite the following paper when using IPOD:

@inproceedings{liu2020ipod,
  title={IPOD: A Large-scale Industrial and Professional Occupation Dataset},
  author={Liu, Junhua and Ng, Yung Chuen and Wood, Kristin L. and Lim, Kwan Hui},
  booktitle={Proceedings of the 2020 ACM Conference on Computer Supported Cooperative Work and Social Computing Companion (CSCW'20)},
  pages={323--328},
  year={2020}
}

About

A Corpus of 475,000 Industrial Occupations
