marcschulder/svmlight_named2indexed

svmlight_named2indexed

Tested for Python 2.7

Short version

If you have generated data for svmlight, but haven't changed your feature names to indices yet, let this script take care of it for you.

Long version

The SVM tool svmlight requires features to be represented by unique, sorted integer indices rather than by their names. This script takes a data file in which features are still given as strings and converts those names into such indices. Optionally, the mapping from index to feature name can also be saved.

Input files must follow the regular svmlight format:

<line> .=. <target> <feature>:<value> <feature>:<value> ... <feature>:<value> # <info>

with the exception that instead of

<feature> .=. <integer> | "qid"

you provide

<feature> .=. <string> | "qid"
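
As an illustration of the conversion described above, here is a minimal sketch in Python. It is not the script's actual code: the function name `convert_lines` is hypothetical, and it assumes that feature names contain no whitespace and no `#`, that indices are assigned from 1 in order of first appearance, and that `qid` is passed through unchanged.

```python
def convert_lines(lines):
    """Convert svmlight lines with named features to indexed features.

    Hypothetical helper for illustration only, not the script's real API.
    Returns (converted_lines, name_to_index).
    """
    name_to_index = {}
    converted = []
    for line in lines:
        line = line.rstrip("\n")
        # Split off the optional trailing "# <info>" part.
        body, sep, info = line.partition("#")
        tokens = body.split()
        if not tokens:
            # Blank or comment-only line: keep it unchanged.
            converted.append(line)
            continue
        target, pairs = tokens[0], tokens[1:]
        qid = None
        indexed = []
        for pair in pairs:
            # Split on the last colon so names may themselves contain ":".
            name, _, value = pair.rpartition(":")
            if name == "qid":
                qid = value  # "qid" is not a feature name; keep it as-is
                continue
            if name not in name_to_index:
                # Assumption: indices start at 1, in order of first appearance.
                name_to_index[name] = len(name_to_index) + 1
            indexed.append((name_to_index[name], value))
        # svmlight expects the features on each line in increasing index order.
        indexed.sort(key=lambda p: p[0])
        parts = [target]
        if qid is not None:
            parts.append("qid:" + qid)
        parts.extend("%d:%s" % (i, v) for i, v in indexed)
        out = " ".join(parts)
        if sep:
            out += " #" + info
        converted.append(out)
    return converted, name_to_index


example = [
    "1 color_red:1 size:0.5 # first",
    "-1 size:2 weight:3 color_red:1",
]
new_lines, mapping = convert_lines(example)
for new_line in new_lines:
    print(new_line)
```

With the two example lines, this sketch prints `1 1:1 2:0.5 # first` and `-1 1:1 2:2 3:3`, and `mapping` holds `color_red` as 1, `size` as 2 and `weight` as 3. Inverting that dictionary gives the index-to-name mapping that can optionally be saved.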
