schopra6/WebNLG2022

Preparation of French Corpus

The French corpus is a translated version of the WebNLG release 3.0 English dataset. We used an English-to-French [NMT model](https://storage.googleapis.com/samanantar-public/V0.3/models/en-indic.zip) provided by https://pytorch.org/hub/pytorch_fairseq_translation/ to generate the French sentences.
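
As a rough illustration of the translation step, the sketch below loads an English-to-French model through `torch.hub`, following the usage shown on the PyTorch Hub fairseq page. The checkpoint name `transformer.wmt14.en-fr`, the `moses` and `subword_nmt` options, and the helper names `load_en2fr` and `translate_sentences` are assumptions for illustration; this README does not state which checkpoint or decoding options were actually used.

```python
from typing import Callable, Iterable, List


def load_en2fr():
    """Load an English-to-French fairseq model from PyTorch Hub.

    Assumption: the WMT'14 transformer checkpoint. Weights are downloaded
    on first use, and the moses/subword_nmt dependencies must be installed.
    """
    import torch  # imported lazily so the helper below works without torch

    model = torch.hub.load(
        "pytorch/fairseq",
        "transformer.wmt14.en-fr",
        tokenizer="moses",
        bpe="subword_nmt",
    )
    model.eval()  # disable dropout for inference
    return model


def translate_sentences(
    sentences: Iterable[str], translate: Callable[[str], str]
) -> List[str]:
    """Translate each non-empty sentence; blank strings pass through unchanged."""
    return [translate(s) if s.strip() else s for s in sentences]


# Possible usage (not executed here, since it downloads the model):
# en2fr = load_en2fr()
# french = translate_sentences(["Hello world!"], lambda s: en2fr.translate(s, beam=5))
```

Passing the translator in as a callable keeps the helper independent of any particular model, so a different checkpoint can be swapped in without touching the rest of the pipeline.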

To generate the French corpus

Install the required packages

pip install -r requirements.txt

Generate files for the train, dev and test folders

python3 run.py <path to the folder containing english xml files>
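
The command above takes a folder of English XML files. As a minimal sketch of how such a folder might be scanned, assuming it contains `train`, `dev` and `test` subfolders with `.xml` files nested at any depth, the hypothetical helper `find_xml_files` below collects the files per split. It is not taken from `run.py` and may differ from what that script does.

```python
from pathlib import Path
from typing import Dict, List

# Assumed split folder names, based on the train/dev/test folders mentioned above.
SPLITS = ("train", "dev", "test")


def find_xml_files(root) -> Dict[str, List[Path]]:
    """Return the sorted .xml files found under each split folder of `root`."""
    root = Path(root)
    found: Dict[str, List[Path]] = {}
    for split in SPLITS:
        split_dir = root / split
        # A missing split folder yields an empty list rather than an error.
        found[split] = sorted(split_dir.rglob("*.xml")) if split_dir.is_dir() else []
    return found
```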

In our case, we used the English-language data path, since it is easy to replace the English lex entries with French ones. The WebNLG corpus can be downloaded from this repository.
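
The lex replacement could look roughly like the sketch below, which walks a WebNLG-style XML document and overwrites the text of each `lex` element with its translation. The function name `replace_lex_with_french`, the `translate` callable, and setting a `lang="fr"` attribute are assumptions for illustration and are not taken from `run.py`.

```python
import xml.etree.ElementTree as ET
from typing import Callable


def replace_lex_with_french(xml_text: str, translate: Callable[[str], str]) -> str:
    """Return the XML with every non-empty <lex> text replaced by its translation."""
    root = ET.fromstring(xml_text)
    for lex in root.iter("lex"):
        if lex.text and lex.text.strip():
            lex.text = translate(lex.text.strip())
            lex.set("lang", "fr")  # assumption: mark the entry as French
    return ET.tostring(root, encoding="unicode")


# Tiny hand-written sample in the spirit of a WebNLG entry (not real corpus data).
SAMPLE = (
    '<benchmark><entries><entry eid="Id1">'
    '<lex lid="Id1">Hello</lex>'
    "</entry></entries></benchmark>"
)
```

Because only the `lex` text changes, the triples and entry attributes of the English file carry over untouched, which is what makes starting from the English data path convenient.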
