Does InfLLM support quantized versions of Qwen1.5-7B? #32
Comments
#16 I previously verified Qwen1.5-72B-Chat-GPTQ-Int4; other quantized models should work similarly.
Thanks! How are the results? How is the speed?
The model is large, so it runs rather slowly. I ran a quick test on two LongBench datasets: Evaluating on: ['narrativeqa.jsonl', 'qasper.jsonl', 'result.json']
May I ask how the calibration dataset for quantization was selected? Was it sampled from LongBench? @ChuanhongLi
@ChuanhongLi Could you explain which file to modify, and how?
For the quantized models, we used the open-source ones directly; we did not quantize the models ourselves.
Adjust n_local, topk, max_cached_block, chunk_size, etc.; see #11 for reference
config/qwen-inf-llm.yaml |
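For orientation, here is a minimal sketch of what tuning that config might look like. The parameter names (n_local, topk, max_cached_block, chunk_size, block_size, max_len) are taken from this thread; the layout and the specific values other than block_size and max_len are illustrative assumptions, not a verified working configuration.

```yaml
# Hypothetical sketch only -- values are illustrative assumptions,
# not a config verified against the repo.
model:
  block_size: 128        # size of each memory unit
  n_local: 4096          # local attention window; reduce to save GPU memory
  topk: 16               # number of memory blocks retrieved per step
  max_cached_block: 32   # cap on blocks cached on GPU; lower this if you hit OOM
  chunk_size: 512        # chunk length used when streaming the long input
max_len: 2147483647      # effectively no limit on input length
```

Lowering n_local and max_cached_block is the usual first step when running out of GPU memory, since both directly bound what is resident on the device.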
@ChuanhongLi Thank you very much!
@ChuanhongLi Hello, I ran out of GPU memory on an A100 80G with both the Qwen1.5-72B-chat-AWQ and GPTQ versions. Could you tell me which config parameters need to be changed, and to what values? The repo's original Qwen config also runs out of memory for me. Would you mind pasting the config you got working? Thanks.
block_size: 128
max_len: 2147483647
Great work! Does InfLLM support quantized versions of Qwen1.5-7B?