spyysalo/s800

S800 corpus tools

Tools for working with the S800 corpus (http://species.jensenlab.org/).

Quickstart: conversion to .ann standoff

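# Download the S800 1.0 release
wget http://species.jensenlab.org/files/S800-1.0.tar.gz
# Unpack it into original-data/
mkdir -p original-data
tar xzf S800-1.0.tar.gz -C original-data
# Convert the original data to .ann standoff in standoff/
./convert_s800.sh original-data standoff
# Split the corpus (the CoNLL step below reads split-standoff/{train,devel,test})
./split_s800.sh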
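
As a quick sanity check after conversion, the following minimal sketch reports text files that have no matching annotation file. It assumes the output directory holds brat-style pairs (`doc.txt` next to `doc.ann`) in a single flat directory. The function name `check_standoff_pairs` is hypothetical and not part of this repository.

```shell
#!/bin/sh
# Hypothetical helper (not part of the repository): report .txt files in a
# directory that lack a matching .ann file.
# Assumption: brat-style pairs (doc.txt + doc.ann) in one flat directory.
check_standoff_pairs() {
    dir=$1
    missing=0
    for txt in "$dir"/*.txt; do
        # If the glob matched nothing, the unexpanded pattern is returned; skip it.
        [ -e "$txt" ] || continue
        ann="${txt%.txt}.ann"
        if [ ! -e "$ann" ]; then
            echo "missing annotation: $ann"
            missing=$((missing + 1))
        fi
    done
    echo "files without .ann: $missing"
    # Succeed only when every .txt file has its .ann counterpart.
    [ "$missing" -eq 0 ]
}
```

It could be called as `check_standoff_pairs standoff`. A non-zero exit status indicates at least one unpaired file.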

Convert standoff to CoNLL format

mkdir -p conll
# Fetch the standoff2conll converter
git clone https://github.com/spyysalo/standoff2conll.git
# Convert each split to a TSV file under conll/
for i in train devel test; do
    python3 standoff2conll/standoff2conll.py "split-standoff/$i" > "conll/$i.tsv"
done
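
To get a rough idea of what the conversion produced, the sketch below counts sentences and tokens per split. It assumes the usual CoNLL layout: one token per non-empty line, with blank lines separating sentences. The exact columns written by standoff2conll are not checked here. The function name `conll_stats` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: count sentences and tokens in a CoNLL-style file.
# Assumption: one token per non-empty line, blank lines between sentences.
conll_stats() {
    awk '
        NF > 0 { tokens++; in_sent = 1; next }   # token line
        in_sent { sentences++; in_sent = 0 }     # blank line closes a sentence
        END {
            if (in_sent) sentences++             # file may lack a trailing blank line
            printf "%d sentences, %d tokens\n", sentences, tokens
        }
    ' "$1"
}

# Print statistics for each split that exists.
for i in train devel test; do
    if [ -f "conll/$i.tsv" ]; then
        printf '%s: %s\n' "$i" "$(conll_stats "conll/$i.tsv")"
    fi
done
```

Counting in `awk` keeps the check free of dependencies beyond a POSIX shell. A split that reports zero tokens would suggest the conversion step failed for that directory.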
