Skip to content

Commit

Permalink
release v0.2.32
Browse files Browse the repository at this point in the history
  • Loading branch information
merrymercy committed Nov 1, 2023
1 parent af4dfe3 commit dd84d16
Show file tree
Hide file tree
Showing 3 changed files with 6 additions and 6 deletions.
8 changes: 4 additions & 4 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -66,7 +66,7 @@ pip3 install -e ".[model_worker,webui]"

## Model Weights
### Vicuna Weights
[Vicuna](https://lmsys.org/blog/2023-03-30-vicuna/) is based on LLaMA and should be used under LLaMA's [model license](https://github.com/facebookresearch/llama/blob/main/LICENSE).
[Vicuna](https://lmsys.org/blog/2023-03-30-vicuna/) is based on Llama 2 and should be used under Llama's [model license](https://github.com/facebookresearch/llama/blob/main/LICENSE).

You can use the commands below to start chatting. It will automatically download the weights from Hugging Face repos.
See more command options and how to handle out-of-memory in the "Inference with Command Line Interface" section below.
Expand All @@ -84,7 +84,7 @@ See more command options and how to handle out-of-memory in the "Inference with
**Old weights**: see [docs/vicuna_weights_version.md](docs/vicuna_weights_version.md) for all versions of weights and their differences.

### LongChat
We release [LongChat](https://lmsys.org/blog/2023-06-29-longchat/) models under LLaMA's [model license](https://github.com/facebookresearch/llama/blob/main/LICENSE).
We release [LongChat](https://lmsys.org/blog/2023-06-29-longchat/) models under Llama's [model license](https://github.com/facebookresearch/llama/blob/main/LICENSE).

| Size | Chat Command | Hugging Face Repo |
| --- | --- | --- |
Expand Down Expand Up @@ -276,7 +276,7 @@ MT-bench is the new recommended way to benchmark your models. If you are still l
## Fine-tuning
### Data

Vicuna is created by fine-tuning a LLaMA base model using approximately 125K user-shared conversations gathered from ShareGPT.com with public APIs. To ensure data quality, we convert the HTML back to markdown and filter out some inappropriate or low-quality samples. Additionally, we divide lengthy conversations into smaller segments that fit the model's maximum context length. For detailed instructions to clean the ShareGPT data, check out [here](docs/commands/data_cleaning.md).
Vicuna is created by fine-tuning a Llama base model using approximately 125K user-shared conversations gathered from ShareGPT.com with public APIs. To ensure data quality, we convert the HTML back to markdown and filter out some inappropriate or low-quality samples. Additionally, we divide lengthy conversations into smaller segments that fit the model's maximum context length. For detailed instructions to clean the ShareGPT data, check out [here](docs/commands/data_cleaning.md).

We will not release the ShareGPT dataset. If you would like to try the fine-tuning code, you can run it with some dummy conversations in [dummy_conversation.json](data/dummy_conversation.json). You can follow the same format and plug in your own data.

Expand All @@ -295,7 +295,7 @@ We use similar hyperparameters as the Stanford Alpaca.
pip3 install -e ".[train]"
```

- You can use the following command to train Vicuna-7B with 4 x A100 (40GB). Update `--model_name_or_path` with the actual path to LLaMA weights and `--data_path` with the actual path to data.
- You can use the following command to train Vicuna-7B with 4 x A100 (40GB). Update `--model_name_or_path` with the actual path to Llama weights and `--data_path` with the actual path to data.
```bash
torchrun --nproc_per_node=4 --master_port=20001 fastchat/train/train_mem.py \
--model_name_or_path meta-llama/Llama-2-7b-hf \
Expand Down
2 changes: 1 addition & 1 deletion fastchat/__init__.py
Original file line number Diff line number Diff line change
@@ -1 +1 @@
__version__ = "0.2.31"
__version__ = "0.2.32"
2 changes: 1 addition & 1 deletion pyproject.toml
Original file line number Diff line number Diff line change
Expand Up @@ -4,7 +4,7 @@ build-backend = "setuptools.build_meta"

[project]
name = "fschat"
version = "0.2.31"
version = "0.2.32"
description = "An open platform for training, serving, and evaluating large language model based chatbots."
readme = "README.md"
requires-python = ">=3.8"
Expand Down

0 comments on commit dd84d16

Please sign in to comment.