Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[blockwise] GOI packing routine for qb4w #6373

Merged
merged 1 commit into from May 16, 2024

Conversation

digantdesai
Copy link
Contributor

Test packing-test --gtest_filter="PACK_QD8_F32_QB4W_GEMM_GOI_W.*"

@digantdesai
Copy link
Contributor Author

cc @alankelly - here is the first PR to start building blockwise support i.e. qb4w. I closed the Draft PR #6030

@@ -202,6 +202,38 @@ XNN_INTERNAL void xnn_pack_qs8_qc4w_gemm_goi_w(
size_t extra_bytes,
const struct xnn_qs8_qc4w_packing_params* params);

typedef void (*xnn_pack_qs8_qb4w_gemm_fn)(
size_t g,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Traditionally these parameters were undocumented and cryptic. Could you please break this tradition and give them better names?!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated. I feel nc/kc/nr/kr are quite well "baked" in the code and we should document is somewhere like in the docs dir. It might be more meaningful for readability. I can put up a doc change tomorrow.

packing-test --gtest_filter="PACK_QD8_F32_QB4W_GEMM_GOI_W.*"
@@ -930,6 +930,45 @@ void xnn_init_qs8_to_qs8_qc8w_scale_fp32_params(
}
}

void xnn_init_qs8_qb8w_scale_fp32_params(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should this be named qb4w?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Though you may be coming from consistency point of view with-in qb*w. I would argue that, it shouldn't matter actually because the routine is independent of weight bit width. If you look qc4w, it uses qc8w routine. I am not sure if we will ever write qb8w but this is just an artifact from qc.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function is independent of quantization bit width so qs8_qb8w is not needed, maybe xnn_init_blockwise_scale_fp32_params?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good. Let me update.

@copybara-service copybara-service bot merged commit 084e96a into google:master May 16, 2024
3 checks passed
@digantdesai
Copy link
Contributor Author

Thanks @alankelly - let me push another PR to fix these name changes and other small things.

@digantdesai
Copy link
Contributor Author

#6434 is the follow up PR.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants