
Continue quantization from history.snapshot #1778

Open
oyazdanb opened this issue May 8, 2024 · 3 comments
Labels: aitce (AI TCE to handle it firstly), help wanted (Extra attention is needed)

Comments


oyazdanb commented May 8, 2024

I was wondering if there is a way to resume quantization from history.snapshot?

I am using onnx and onnxrt_cuda_ep.

I can quantize the model, but the code crashes before the model is saved (not related to INC). Is there a way to continue from history.snapshot instead of running the whole process from the beginning?

Applying AWQ clip
Progress: [####################] 100.00%2024-05-07 14:56:05 [INFO] |Mixed Precision Statistics|
2024-05-07 14:56:05 [INFO] +------------+---------+---------------+
2024-05-07 14:56:05 [INFO] | Op Type | Total | A32W4G32 |
2024-05-07 14:56:05 [INFO] +------------+---------+---------------+
2024-05-07 14:56:05 [INFO] | MatMul | 193 | 193 |
2024-05-07 14:56:05 [INFO] +------------+---------+---------------+
2024-05-07 14:56:05 [INFO] Pass quantize model elapsed time: 6294630.87 ms
2024-05-07 14:56:05 [INFO] Save tuning history to C:\llm\quantization\nc_workspace\2024-05-07_13-10-57./history.snapshot.
2024-05-07 14:56:05 [INFO] [Strategy] Found the model meets accuracy requirements, ending the tuning process.
2024-05-07 14:56:05 [INFO] Specified timeout or max trials is reached! Found a quantized model which meet accuracy goal. Exit.
2024-05-07 14:56:05 [INFO] Save deploy yaml to C:\llm\quantization\nc_workspace\2024-05-07_13-10-57\deploy.yaml

@xiguiw xiguiw self-assigned this May 9, 2024

xiguiw commented May 9, 2024

Hi @oyazdanb,

Welcome to neural-compressor!

Yes, there is a function to resume quantization from history.snapshot.

I'll check the function and get back to you ASAP.

@xiguiw xiguiw added the help wanted Extra attention is needed label May 9, 2024

xiguiw commented May 10, 2024

@oyazdanb the recover function is broken for some models (not all).
The development team is working on a fix.

In the meantime, here is a way to recover from history.snapshot; you can try it to check whether it works for your model.

If it does not work, you can:

1. Wait a few days. I'll notify you once it is fixed.
2. Install neural-compressor 2.0 and recover with 2.0. We do not recommend rolling back to an earlier version, though.

Here is the way you can try to recover. Not sure it works for your model yet.

    from neural_compressor.utils.utility import recover
    recover_qmodel = recover(fp32_onnx_model, "./nc_workspace/2024-05-10_19-16-32/history.snapshot", 0)

Here is the definition of recover:

    def recover(fp32_model, tuning_history_path, num, **kwargs):
        """Get offline recover tuned model.

        Args:
            fp32_model: Input model path
            tuning_history_path: The tuning history path, which needs user to assign
            num: tune index
        """
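The recover call above replays a tuning trial already recorded in the snapshot instead of re-running the whole search. As a self-contained illustration of that checkpoint/replay pattern (a toy sketch only: `save_snapshot`, `recover_config`, and the pickled-list format are assumptions for illustration, not the actual neural-compressor snapshot format or API):

```python
import os
import pickle
import tempfile

# Toy sketch of the snapshot/replay idea: persist each tuning trial's
# config to a "history.snapshot"-style file, then reload one config by
# index rather than repeating the tuning search from the beginning.
# save_snapshot/recover_config are hypothetical helpers, NOT the real
# neural-compressor utilities.

def save_snapshot(path, history):
    """Pickle the list of tuning-trial configs to disk."""
    with open(path, "wb") as f:
        pickle.dump(history, f)

def recover_config(path, num):
    """Load the snapshot and return the config of trial `num`."""
    with open(path, "rb") as f:
        history = pickle.load(f)
    return history[num]

snapshot = os.path.join(tempfile.mkdtemp(), "history.snapshot")
save_snapshot(snapshot, [
    {"op_type": "MatMul", "dtype": "A32W4G32"},   # trial 0
    {"op_type": "MatMul", "dtype": "A16W8G128"},  # trial 1
])
cfg = recover_config(snapshot, 0)
print(cfg["dtype"])  # prints "A32W4G32"
```

In the real `recover`, the third argument (`num`) plays the same role as the index here: it selects which recorded tuning configuration to re-apply to the FP32 model.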


xiguiw commented May 11, 2024

Fixed the broken recover. PR:
#1788

@xiguiw xiguiw added the aitce AI TCE to handle it firstly label May 15, 2024