Classifying companies into one of 41 industries/genres based on their text description. Uses sentence-transformer embeddings and sklearn's NearestCentroid model.

ekohrt/company-description-classifier

Company Description Classifier

Classifies companies into one of 41 industries based on a text description. Each description is embedded with a sentence-transformer model, and the embeddings are classified with scikit-learn's NearestCentroid model.
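As a rough illustration of the nearest-centroid idea, here is a minimal, dependency-free sketch. It is not the code from this repo: the helper names `fit_centroids` and `predict`, the two labels, and the 2-D vectors are all made up for the example. In the actual pipeline the vectors would be sentence-transformer embeddings of the descriptions, and scikit-learn's NearestCentroid would do the fitting and prediction.

```python
from collections import defaultdict
from math import dist


def fit_centroids(vectors, labels):
    """Average the vectors of each class into one centroid per label."""
    grouped = defaultdict(list)
    for vec, label in zip(vectors, labels):
        grouped[label].append(vec)
    return {
        label: [sum(col) / len(members) for col in zip(*members)]
        for label, members in grouped.items()
    }


def predict(centroids, vector):
    """Return the label whose centroid is closest (Euclidean distance)."""
    return min(centroids, key=lambda label: dist(centroids[label], vector))


# Toy 2-D vectors standing in for description embeddings (made-up values).
train_vectors = [[0.9, 0.1], [0.8, 0.2], [0.1, 0.9], [0.2, 0.8]]
# Illustrative labels only; the real dataset has 41 industry categories.
train_labels = ["software", "software", "biotech", "biotech"]

centroids = fit_centroids(train_vectors, train_labels)
print(predict(centroids, [0.7, 0.3]))  # prints: software
```

Each class is reduced to a single mean vector, so training and prediction stay cheap even with 41 classes. The trade-off is that a class whose descriptions form several distinct clusters is represented by only one point.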

The dataset is adapted from a 2013 dump of Crunchbase company info, which contained each company's name, industry category, and Crunchbase permalink. The short description for each company was then retrieved through the free Crunchbase API.

Written in Google Colab.
