Releases: RUCAIBox/TextBox

TextBox 2.0 Release

28 Dec 02:06
ebcef12

TextBox 2.0 is an up-to-date text generation library built on Python and PyTorch. It focuses on building a unified and standardized pipeline for applying pre-trained language models (PLMs) to text generation:

  • From a task perspective, we consider 13 common text generation tasks such as translation, story generation, and style transfer, and their corresponding 83 widely-used datasets.
  • From a model perspective, we incorporate 47 pre-trained language models/modules covering the categories of general, translation, Chinese, dialogue, controllable, distilled, prompting, and lightweight models (modules).
  • From a training perspective, we support 4 pre-training objectives and 4 efficient and robust training strategies, such as distributed data parallel and efficient generation.

Compared with the previous version of TextBox, this release mainly focuses on building a unified, flexible, and standardized framework to better support PLM-based text generation models. TextBox 2.0 has three advantages:

  • It is a significant update that covers a comprehensive set of tasks and PLMs.
  • It is designed to be unified in implementation and interface.
  • It can faithfully reproduce the results reported in existing work.

TextBox v0.2.1

15 Apr 13:39
d04499f

TextBox v0.2.1 Release Notes

The TextBox v0.2.1 release includes a number of new features, several bug fixes, and code refactoring. Highlights include:

  • We add 6 new models: HRED, CVAE, T5, ProphetNet, Context2Seq and Attribute2Seq.
  • We add 3 new datasets: Persona Chat for dialogue systems, Amazon Electronic for attribute-to-text generation, and Chinese Classical Poetry Corpus for poem generation.
  • We support Distributed Data Parallel (DDP) for convenient multi-GPU training.
  • We refactor the code for pretrained language models (PLMs) to improve performance.
  • We refactor the dataset and dataloader to provide a unified and convenient interface.
  • We unify and simplify the generate function for each model.
  • We unify the config parameters of different models and datasets.

TextBox v0.1.5

11 Jan 03:05
d9567de

TextBox is an open-source library for building text generation systems. It is built on Python and PyTorch.