
Classification with LoRA and LLAMA/Mistral example #1592

Open
2 of 4 tasks
bjayakumar opened this issue May 13, 2024 · 5 comments
Labels: question (Further information is requested), triaged (Issue has been triaged by maintainers)

Comments

@bjayakumar

System Info

TensorRT-LLM 0.8
NVIDIA A100
CUDA 12.2

Who can help?

I can run LLAMA with LoRA weights for generative text tasks such as completion and summarization. However, I have also trained a model with a classification LoRA. When I run it with run.py, I can only do generative tasks, whereas I want to do classification.

Can you post an example where an LLM such as LLAMA or Mistral is fine-tuned with a classification head using LoRA, the checkpoints are converted, and classification is run through run.py?

It appears that model_runner only does decoding:

outputs = self.session.decode(

Information

  • The official example scripts
  • My own modified scripts

Tasks

  • An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
  • My own task or dataset (give details below)

Reproduction

.

Expected behavior

A working example of classification with LoRA and an LLM.

Actual behavior

Unavailable

Additional notes

.

bjayakumar added the bug (Something isn't working) label on May 13, 2024
@byshiue
Collaborator

byshiue commented May 15, 2024

The requirement is not very clear to me. Do you want to get the context logits of the model instead of generating tokens? If so, you can get the context logits by adding --gather_context_logits when building the engine.
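
For reference, a minimal sketch of what reading those context logits could look like from the Python runtime. This is not an official example: the paths are placeholders, and it assumes an engine built with trtllm-build --gather_context_logits plus the 0.8-era ModelRunner API (from_dir, generate(..., return_dict=True)); names and tensor layouts may differ in other releases.

import torch
from transformers import AutoTokenizer
from tensorrt_llm.runtime import ModelRunner

tokenizer = AutoTokenizer.from_pretrained("/path/to/base_model")  # placeholder path
runner = ModelRunner.from_dir(engine_dir="/path/to/engine")       # placeholder path

input_ids = tokenizer("i want to pay for my book",
                      return_tensors="pt").input_ids[0].int()

with torch.no_grad():
    out = runner.generate(batch_input_ids=[input_ids],
                          max_new_tokens=1,
                          end_id=tokenizer.eos_token_id,
                          pad_id=tokenizer.eos_token_id,
                          return_dict=True)

# 'context_logits' is only populated when the engine was built with
# --gather_context_logits; its exact layout can vary across versions.
ctx = out.get("context_logits")
print(ctx[0].shape if ctx is not None else "engine lacks context logits")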

byshiue self-assigned this on May 15, 2024
byshiue added the question (Further information is requested) and triaged (Issue has been triaged by maintainers) labels and removed the bug (Something isn't working) label on May 15, 2024
@bjayakumar
Author

bjayakumar commented May 15, 2024

Sorry, I was not clear.

I took an HF LLM such as Mistral, fine-tuned it for a classification task using LoRA, and saved the model.

I can build an engine from the foundation model plus the classification LoRA.

However, when I run run.py with that engine, I only get generation output, not classification:

python ../run.py --engine_dir "$ENGINE_DIR" \
              --max_output_len 10 \
              --tokenizer_dir ${BASE_LLAMA_MODEL} \
              --input_text "i want to pay for my book" \
              --lora_task_uids 0 \
              --use_py_session --top_p 0.5 --top_k 0

I would like to know how to get classification output.

@byshiue
Collaborator

byshiue commented May 17, 2024

TensorRT-LLM only provides examples for generation. For classification, you would need to modify the example script.
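
In the absence of an official example, one possible shape for that change (an editor's sketch, not anything byshiue outlined): keep the engine as a generator, gather context logits as suggested above, and reduce classification to comparing the next-token logits of a small set of label (verbalizer) tokens. The label names, paths, LoRA uid, and logits layout below are all assumptions, and if the LoRA was trained with a separate classification head (e.g. AutoModelForSequenceClassification), that head is not part of the engine and would have to be applied outside TensorRT-LLM.

import torch
from transformers import AutoTokenizer
from tensorrt_llm.runtime import ModelRunner

LABELS = ["billing", "refund", "shipping"]            # hypothetical class names

tokenizer = AutoTokenizer.from_pretrained("/path/to/base_model")   # placeholder
runner = ModelRunner.from_dir(engine_dir="/path/to/engine",        # placeholder
                              lora_dir="/path/to/lora")            # placeholder

# Use the first token id of each label string as its verbalizer token.
label_ids = [tokenizer(l, add_special_tokens=False).input_ids[0] for l in LABELS]

input_ids = tokenizer("i want to pay for my book",
                      return_tensors="pt").input_ids[0].int()

with torch.no_grad():
    out = runner.generate(batch_input_ids=[input_ids],
                          lora_uids=["0"],            # same uid as --lora_task_uids
                          max_new_tokens=1,
                          end_id=tokenizer.eos_token_id,
                          pad_id=tokenizer.eos_token_id,
                          return_dict=True)

# Assumed layout: context_logits[batch][prompt_len, vocab]; the last row
# scores the token that would follow the prompt.
next_token_logits = out["context_logits"][0][-1]
print(LABELS[int(torch.argmax(next_token_logits[label_ids]))])

A prompt template that asks the model to answer with one of the label words, mirroring how the LoRA was trained, would typically be prepended to the input text.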

@badrinath89

Can you please share a pointer or some direction for solving this?

What changes should I make in the code?

@bjayakumar
Author

> TensorRT-LLM only provides examples for generation. For classification, you would need to modify the example script.

How do you change the example script? Can you suggest pseudocode?
