Can you give an example of a Hugging Face model such as Phi-2, Yi, etc.? #21
Comments
Thank you for the nice suggestion. We are going to evaluate the speed and throughput of Inferflow when serving models of different sizes on different devices, and compare it with some other inference engines. Regarding examples of serving models like Phi-2 and Yi with Inferflow: we have predefined model specification files for Phi-2, Yi-6B-200K, Yi-34B-Chat, and a list of other models. Below are the steps for serving Phi-2 with Inferflow:
Steps 3 and 4 are for testing and validation purposes only; you can skip them in real serving. The steps for serving other models (including the Yi models you mentioned) are similar. We are going to add more documentation to README.md to help users.
Earlier versions of the phi-2 model may not be applicable, because the parameter
@MonadKai Thank you for pointing this out. We will support reading the value of n_embd, since it seems to be a reasonable abbreviation.
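A loader that accepts both key names can be sketched as follows. This is an illustration, not Inferflow's actual implementation: the assumption is that older phi-2 checkpoints expose the embedding width as `n_embd` in config.json, while newer ones use the standard Hugging Face key `hidden_size`, so the reader checks both aliases.

```python
import json


def read_hidden_size(config: dict) -> int:
    """Return the embedding width from a Hugging Face-style config dict.

    Checks "hidden_size" first (the current convention), then falls
    back to "n_embd" (used by older phi-2 checkpoints), so the same
    loader handles both config layouts.
    """
    for key in ("hidden_size", "n_embd"):
        if key in config:
            return int(config[key])
    raise KeyError("config defines neither 'hidden_size' nor 'n_embd'")


# Example: minimal configs resembling an older and a newer checkpoint.
old_style = json.loads('{"n_embd": 2560, "n_layer": 32}')
new_style = {"hidden_size": 2560, "num_hidden_layers": 32}

print(read_hidden_size(old_style))  # resolved via the n_embd fallback
print(read_hidden_size(new_style))  # resolved via hidden_size
```

Keeping the alias list in one place makes it easy to extend if further key variants turn up in other checkpoints.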
Good job!
Hope to see comparisons with other frameworks on some models, covering throughput, first-token latency, etc.