I was wondering if there is a way to resume quantization from history.snapshot?
I am using ONNX with onnxrt_cuda_ep.
I can quantize the model, but the code crashes before the model is saved (the crash is not related to INC). Is there a way to continue from history.snapshot instead of rerunning the code from the beginning?
Applying AWQ clip
Progress: [####################] 100.00%
2024-05-07 14:56:05 [INFO] |Mixed Precision Statistics|
2024-05-07 14:56:05 [INFO] +------------+---------+---------------+
2024-05-07 14:56:05 [INFO] | Op Type | Total | A32W4G32 |
2024-05-07 14:56:05 [INFO] +------------+---------+---------------+
2024-05-07 14:56:05 [INFO] | MatMul | 193 | 193 |
2024-05-07 14:56:05 [INFO] +------------+---------+---------------+
2024-05-07 14:56:05 [INFO] Pass quantize model elapsed time: 6294630.87 ms
2024-05-07 14:56:05 [INFO] Save tuning history to C:\llm\quantization\nc_workspace\2024-05-07_13-10-57./history.snapshot.
2024-05-07 14:56:05 [INFO] [Strategy] Found the model meets accuracy requirements, ending the tuning process.
2024-05-07 14:56:05 [INFO] Specified timeout or max trials is reached! Found a quantized model which meet accuracy goal. Exit.
2024-05-07 14:56:05 [INFO] Save deploy yaml to C:\llm\quantization\nc_workspace\2024-05-07_13-10-57\deploy.yaml