Integrate IndicTrans2 models and tokenizer into HF Transformers #30818

Open
2 tasks done
VarunGumma opened this issue May 15, 2024 · 4 comments


VarunGumma commented May 15, 2024

Model description

IndicTrans2 is a multilingual transformer model developed by AI4Bharat, and is available in 3 flavors: indic-en, en-indic and indic-indic. Each flavor has 2 versions, a large 1B model and a distilled 200M model. The architecture is a standard transformer, very similar to the NLLB and M2M models. However, the major difference is that the vocabularies of the encoder and decoder are not shared, as they cover different sets of languages.

Unlike the NLLB and M2M models, IndicTrans2 requires specific preprocessing of the inputs. Hence, a custom processor class has been developed and is required for training/inference. More examples can be found in the aforementioned repository.
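
For reference, a minimal usage sketch under some assumptions: the hub checkpoint id (`ai4bharat/indictrans2-en-indic-1B`) and the `IndicProcessor` helper with its `preprocess_batch`/`postprocess_batch` methods are taken from the authors' toolkit as I understand it, so the exact names and import paths may differ.

```python
import torch
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer
# IndicProcessor is the custom pre/postprocessing helper mentioned above;
# the package and import path below are an assumption based on the authors' toolkit.
from IndicTransToolkit import IndicProcessor

ckpt = "ai4bharat/indictrans2-en-indic-1B"  # assumed hub checkpoint id
tokenizer = AutoTokenizer.from_pretrained(ckpt, trust_remote_code=True)
model = AutoModelForSeq2SeqLM.from_pretrained(ckpt, trust_remote_code=True)
ip = IndicProcessor(inference=True)

sentences = ["This is a test sentence."]
# Preprocess: add language tags, normalize/transliterate scripts, etc.
batch = ip.preprocess_batch(sentences, src_lang="eng_Latn", tgt_lang="hin_Deva")
inputs = tokenizer(batch, padding="longest", truncation=True, return_tensors="pt")

with torch.inference_mode():
    generated = model.generate(**inputs, num_beams=5, max_length=256)

decoded = tokenizer.batch_decode(generated, skip_special_tokens=True)
# Postprocess: strip tags and restore the target-language text.
translations = ip.postprocess_batch(decoded, lang="hin_Deva")
print(translations)
```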

Open source status

  • The model implementation is available
  • The model weights are available

Provide useful links for the implementation

Authors: @AI4Bharat @jaygala24 @PranjalChitale @oneraghavan @VarunGumma @sumanthd17 @prajdabre @anoopkunchukuttan

Official GitHub Repository: AI4Bharat/IndicTrans2

The HF-compatible models and tokenizer are available here as of now:

@amyeroberts
Collaborator

Hi @VarunGumma, thanks for opening this model request!

This looks like a great candidate for adding the model on the hub. This is the easiest and recommended way to make a model available in transformers and means, once working, the model can be found and used immediately without having to go through the PR process. We find this is a lot quicker as the bar for adding code into the library is high due to the maintenance cost of every new model, and so reviews take quite a while.

We'll provide as much support as we can for this - let us know if there are any issues in the implementation. Here is a tutorial if that sounds good to you!
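
For readers following this route, here is a rough sketch of what the "code on the hub" path looks like, based on the custom-model tutorial; the module/class names for IndicTrans2 and the repo id below are hypothetical placeholders, not the authors' actual layout.

```python
from transformers import AutoModelForSeq2SeqLM

# Custom config/model classes live in the repo alongside the weights; the module
# and class names here are hypothetical placeholders for the IndicTrans2 code.
from configuration_indictrans import IndicTransConfig
from modeling_indictrans import IndicTransForConditionalGeneration

# Register the custom classes so Auto* loading works from the hub repo.
IndicTransConfig.register_for_auto_class()
IndicTransForConditionalGeneration.register_for_auto_class("AutoModelForSeq2SeqLM")

config = IndicTransConfig()
model = IndicTransForConditionalGeneration(config)  # or load converted weights here
model.push_to_hub("my-org/indictrans2-en-indic-1B")  # hypothetical repo id

# Downstream users can then load it directly, with no transformers PR needed:
# AutoModelForSeq2SeqLM.from_pretrained("my-org/indictrans2-en-indic-1B", trust_remote_code=True)
```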

@VarunGumma
Author

Hi @amyeroberts,

Thank you for your reply. We also need some help adding flash_attention_2 to our model. We were able to modify the modeling script for it, but it throws an error that our model class IndicTransForConditionalGeneration itself is not supported. How can we proceed in this case?

@amyeroberts
Collaborator

@VarunGumma Could you share the error message and full traceback?

@VarunGumma
Author

@amyeroberts , thank you. We were able to resolve it on our end.
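
For anyone hitting the same error: the "model does not support Flash Attention 2.0" check in transformers is gated on a class attribute of the custom PreTrainedModel subclass. A minimal sketch of the kind of change involved, assuming the custom modeling file follows the standard transformers layout (this is not necessarily the exact fix applied here):

```python
from transformers import PretrainedConfig, PreTrainedModel

class IndicTransConfig(PretrainedConfig):  # stand-in for the custom config class
    model_type = "IndicTrans"

class IndicTransPreTrainedModel(PreTrainedModel):
    config_class = IndicTransConfig
    # Declares FlashAttention-2 support; without this flag, loading with
    # attn_implementation="flash_attention_2" raises the "not supported" error.
    _supports_flash_attn_2 = True

# With the flag set (and a flash-attention attention class wired into the model),
# FA2 can then be requested at load time, e.g.:
# model = AutoModelForSeq2SeqLM.from_pretrained(
#     ckpt, trust_remote_code=True,
#     attn_implementation="flash_attention_2", torch_dtype=torch.float16,
# )
```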
