
add mx quant #1728

Merged
merged 17 commits into master from mengni/torch_mx on May 20, 2024

Conversation

mengniwang95 (Collaborator) commented:

Type of Change

feature

Description

Support MX (microscaling) quantization for PyTorch.
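MX quantization groups tensor elements into small blocks that share a single power-of-two scale, while each element is stored in a narrow dtype such as FP4 (E2M1). The sketch below is a simplified illustration of that idea only; block size, rounding mode, and dtype handling in the PR's actual code under neural_compressor/torch/algorithms/mx_quant/ differ:

```python
import math

# Magnitudes representable by FP4 (E2M1), the element dtype behind --w_dtype fp4.
FP4_GRID = [0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0]

def quantize_element(x, scale):
    """Round x/scale to the nearest FP4 magnitude, keeping the sign of x."""
    mag = abs(x) / scale
    q = min(FP4_GRID, key=lambda g: abs(g - mag))
    return math.copysign(q * scale, x)

def mx_quantize_block(block):
    """Fake-quantize one block with a single shared power-of-two scale (MX-style)."""
    amax = max(abs(x) for x in block)
    if amax == 0.0:
        return list(block)
    # Pick the scale so the block maximum lands inside FP4's range [0, 6];
    # 2 == floor(log2(6)) is the exponent of FP4's largest magnitude.
    shared_exp = math.floor(math.log2(amax)) - 2
    scale = 2.0 ** shared_exp
    return [quantize_element(x, scale) for x in block]
```

Values that happen to sit on the scaled FP4 grid round-trip exactly; everything else snaps to the nearest representable point, and out-of-range magnitudes clip to 6 × scale.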

Expected Behavior & Potential Risk

Models can be quantized to MX data types through the PyTorch 3.x quantization API.

How has this PR been tested?

Unit tests (UT).

Dependency Change?

No library dependency is introduced or removed in the core package; the example additionally requires intel-extension-for-transformers >= 1.4.1 and lm-eval == 0.4.2.
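The example script's flags (`--w_dtype fp4 --woq`) suggest the following 3.x usage pattern. This is a pseudocode-level sketch: the names `MXQuantConfig`, `prepare`, and `convert` are assumed from the PR's changed files (`neural_compressor/torch/quantization/config.py`, `algorithm_entry.py`) and the 3.x API convention, not confirmed from this page:

```python
# Sketch only: API names assumed from the PR's file list and the 3.x convention.
from neural_compressor.torch.quantization import MXQuantConfig, prepare, convert

model = ...  # an ordinary torch.nn.Module, e.g. a Hugging Face causal LM

# weight_only=True mirrors the example's --woq flag; w_dtype selects the
# MX element dtype for weights (e.g. "fp4").
quant_config = MXQuantConfig(w_dtype="fp4", weight_only=True)
model = prepare(model, quant_config=quant_config)
model = convert(model)
```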


github-actions bot commented Apr 11, 2024

⛈️ Required checks status: Has failure 🔴

Warning
If you do not have access to re-run the Probot, please contact XuehaoSun for help. If you push a new commit, all of the workflows will be re-triggered.

Groups summary

🟢 Code Scan Tests workflow
Check ID Status Error details
Code-Scan success
Code-Scan (Bandit Code Scan Bandit) success
Code-Scan (DocStyle Code Scan DocStyle) success
Code-Scan (Pylint Code Scan Pylint) success

These checks are required after the changes to neural_compressor/common/utils/constants.py, neural_compressor/torch/algorithms/mx_quant/__init__.py, neural_compressor/torch/algorithms/mx_quant/mx.py, neural_compressor/torch/algorithms/mx_quant/utils.py, neural_compressor/torch/quantization/__init__.py, neural_compressor/torch/quantization/algorithm_entry.py, neural_compressor/torch/quantization/config.py.

🟢 Model Tests 3x workflow
Check ID Status Error details
Model-Test-3x success
Model-Test-3x (Generate Report GenerateReport) success
Model-Test-3x (Run PyTorch Model opt_125m_woq_gptq_int4) success
Model-Test-3x (Run PyTorch Model opt_125m_woq_gptq_int4_dq_bnb) success
Model-Test-3x (Run PyTorch Model opt_125m_woq_gptq_int4_dq_ggml) success

These checks are required after the changes to neural_compressor/common/utils/constants.py, neural_compressor/torch/algorithms/mx_quant/__init__.py, neural_compressor/torch/algorithms/mx_quant/mx.py, neural_compressor/torch/algorithms/mx_quant/utils.py, neural_compressor/torch/quantization/__init__.py, neural_compressor/torch/quantization/algorithm_entry.py, neural_compressor/torch/quantization/config.py.

🟢 Unit Tests 3x-TensorFlow workflow
Check ID Status Error details
UT-3x-TensorFlow success
UT-3x-TensorFlow (Coverage Compare CollectDatafiles) success
UT-3x-TensorFlow (Unit Test 3x TensorFlow Unit Test 3x TensorFlow) success
UT-3x-TensorFlow (Unit Test 3x TensorFlow baseline Unit Test 3x TensorFlow baseline) success

These checks are required after the changes to neural_compressor/common/utils/constants.py.

🔴 Unit Tests 3x-PyTorch workflow
Check ID Status Error details
UT-3x-Torch failure
UT-3x-Torch (Coverage Compare CollectDatafiles) failure download
UT-3x-Torch (Unit Test 3x Torch Unit Test 3x Torch) success
UT-3x-Torch (Unit Test 3x Torch baseline Unit Test 3x Torch baseline) success

These checks are required after the changes to neural_compressor/common/utils/constants.py, neural_compressor/torch/algorithms/mx_quant/__init__.py, neural_compressor/torch/algorithms/mx_quant/mx.py, neural_compressor/torch/algorithms/mx_quant/utils.py, neural_compressor/torch/quantization/__init__.py, neural_compressor/torch/quantization/algorithm_entry.py, neural_compressor/torch/quantization/config.py, test/3x/torch/quantization/test_mx_quant.py.

🔴 Unit Tests 3x-ONNXRT workflow
Check ID Status Error details
UT-3x-ONNXRT failure
UT-3x-ONNXRT (Coverage Compare CollectDatafiles) no_status
UT-3x-ONNXRT (Unit Test 3x ONNXRT Unit Test 3x ONNXRT) failure download
UT-3x-ONNXRT (Unit Test 3x ONNXRT baseline Unit Test 3x ONNXRT baseline) failure download

These checks are required after the changes to neural_compressor/common/utils/constants.py.


Thank you for your contribution! 💜

Note
This comment is automatically generated and will be updated every 180 seconds within the next 6 hours. If you have any other questions, contact chensuyue or XuehaoSun for help.

xin3he (Collaborator) left a comment:

Could you please add a document to introduce the MX workflow and demo the usage?

mengniwang95 force-pushed the mengni/torch_mx branch 2 times, most recently from 4235c72 to 988e9a8, on April 11, 2024 06:50
chensuyue (Contributor) commented:

@mengniwang95 any update?

chensuyue (Contributor) commented:

Does this feature target v2.6?

mengniwang95 and others added 5 commits May 9, 2024 12:06
Signed-off-by: Mengni Wang <mengni.wang@intel.com>
Signed-off-by: Mengni Wang <mengni.wang@intel.com>
Signed-off-by: Mengni Wang <mengni.wang@intel.com>
Signed-off-by: Mengni Wang <mengni.wang@intel.com>
Signed-off-by: Mengni Wang <mengni.wang@intel.com>
mengniwang95 referenced this pull request May 9, 2024
Signed-off-by: Mengni Wang <mengni.wang@intel.com>
mengniwang95 and others added 4 commits May 9, 2024 14:01
Signed-off-by: Mengni Wang <mengni.wang@intel.com>
Signed-off-by: Mengni Wang <mengni.wang@intel.com>
Signed-off-by: Mengni Wang <mengni.wang@intel.com>


Not able to run the example; here is what I did in a fresh venv:

git clone https://github.com/intel/neural-compressor.git
cd neural-compressor
gh pr checkout 1728
pip install -r requirements.txt
python setup.py install
cd examples/3.x_api/pytorch/nlp/huggingface_models/language-modeling/quantization/mx
pip install -r requirements.txt
python3 -u run_clm_no_trainer.py --model bigscience/bloom-560m --quantize --accuracy --tasks lambada_openai --w_dtype fp4 --woq

then I get the error:

2024-05-09 13:28:52 [INFO][algorithm_entry.py:512] Quantize model with the mx quant algorithm.
2024-05-09 13:29:01 [INFO][run_clm_no_trainer.py:61] Quantization end.
Traceback (most recent call last):
  File "/home/anthony/venv_pr1728/neural-compressor/examples/3.x_api/pytorch/nlp/huggingface_models/language-modeling/quantization/mx/run_clm_no_trainer.py", line 66, in <module>
    from intel_extension_for_transformers.transformers.llm.evaluation.lm_eval import evaluate
  File "/home/anthony/venv_pr1728/lib/python3.10/site-packages/intel_extension_for_transformers/transformers/llm/evaluation/lm_eval/__init__.py", line 17, in <module>
    from .accuracy import cli_evaluate as evaluate
  File "/home/anthony/venv_pr1728/lib/python3.10/site-packages/intel_extension_for_transformers/transformers/llm/evaluation/lm_eval/accuracy.py", line 42, in <module>
    from intel_extension_for_transformers.transformers.llm.evaluation.lm_eval import evaluator
  File "/home/anthony/venv_pr1728/lib/python3.10/site-packages/intel_extension_for_transformers/transformers/llm/evaluation/lm_eval/evaluator.py", line 29, in <module>
    import lm_eval.api.metrics
ModuleNotFoundError: No module named 'lm_eval.api'

mengniwang95 (Collaborator, Author) replied:


Thank you for the reminder; we will update the requirements.txt. Please install intel-extension-for-transformers >= 1.4.1 and lm-eval == 0.4.2.


worked!

Signed-off-by: Mengni Wang <mengni.wang@intel.com>
chensuyue merged commit 4a24a6a into master on May 20, 2024
30 of 36 checks passed
chensuyue deleted the mengni/torch_mx branch on May 20, 2024 12:13
5 participants