Some doubts about specific meaning of fbgemmConv parameters and the layouts of input and output data? #1181

Open
hxer7963 opened this issue Jun 28, 2022 · 0 comments


Hello.

I want to compute a convolution with uint8 activations and int8 weights to produce an int32 output. Quantization and dequantization happen outside the fbgemmConv call, before and after it respectively.
The layouts of my uint8 activations and int8 weights are NCHW and (OC, IC, KH, KW), respectively.
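
For reference, the computation described above can be written as a naive direct convolution. This is a minimal sketch for checking results, not FBGEMM code: `ConvShape` and `conv_u8s8_acc32_naive` are hypothetical names, and the sketch assumes stride 1, no padding, no dilation, and a single group.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical shape descriptor for this sketch (not an FBGEMM type).
struct ConvShape {
  int N, IC, IH, IW; // activation dims, NCHW
  int OC, KH, KW;    // weight dims, (OC, IC, KH, KW)
};

// Naive direct convolution with int32 accumulation.
// Assumes stride 1, no padding, no dilation, one group.
// Output is NCHW with OH = IH - KH + 1 and OW = IW - KW + 1.
std::vector<std::int32_t> conv_u8s8_acc32_naive(
    const ConvShape& s,
    const std::vector<std::uint8_t>& act,
    const std::vector<std::int8_t>& wgt) {
  const int OH = s.IH - s.KH + 1;
  const int OW = s.IW - s.KW + 1;
  assert(OH > 0 && OW > 0);
  assert(act.size() == static_cast<std::size_t>(s.N) * s.IC * s.IH * s.IW);
  assert(wgt.size() == static_cast<std::size_t>(s.OC) * s.IC * s.KH * s.KW);

  std::vector<std::int32_t> out(
      static_cast<std::size_t>(s.N) * s.OC * OH * OW, 0);
  for (int n = 0; n < s.N; ++n)
    for (int oc = 0; oc < s.OC; ++oc)
      for (int oh = 0; oh < OH; ++oh)
        for (int ow = 0; ow < OW; ++ow) {
          std::int32_t acc = 0;
          for (int ic = 0; ic < s.IC; ++ic)
            for (int kh = 0; kh < s.KH; ++kh)
              for (int kw = 0; kw < s.KW; ++kw) {
                const int a_idx =
                    ((n * s.IC + ic) * s.IH + (oh + kh)) * s.IW + (ow + kw);
                const int w_idx =
                    ((oc * s.IC + ic) * s.KH + kh) * s.KW + kw;
                // Widen both operands before multiplying.
                acc += static_cast<std::int32_t>(act[a_idx]) *
                       static_cast<std::int32_t>(wgt[w_idx]);
              }
          out[((n * s.OC + oc) * OH + oh) * OW + ow] = acc;
        }
  return out;
}
```

The raw accumulators produced this way still contain the zero-point terms, so any zero-point correction has to be applied afterwards as part of dequantization.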

After taking a closer look at the example code in bench/ConvUnifiedBenchmark.cc, I am still confused about the call flow, the input and output data layouts, and the call parameters.

The example code snippet in bench/ConvUnifiedBenchmark.cc is as follows:

PackWeightsForConv<SPATIAL_DIM> packedB(conv_p, Bint8.data());
...
for (int g = 0; g < conv_p.G; ++g) {
  row_offsets_u8acc32_ref(
      MDim,
      KDimPerGroup,
      KDim,
      Aint8_im2col.data() + g * KDimPerGroup,
      row_offsets.data());

  requantize_u8acc32_ref(
      MDim,
      NDim,
      conv_p.G * NDim,
      Cint32_ref.data() + g * NDim,
      Cint8_ref.data() + g * NDim,
      C_multiplier.data() + g * NDim / conv_p.OC,
      C_zero_point,
      Aint8_zero_point,
      Bint8_zero_point.data() + g * NDim / conv_p.OC,
      row_offsets.data(),
      col_offsets.data() + g * NDim,
      nullptr,
      conv_p.OC);
}
...
DoNothing<> doNothingObj{};
ReQuantizeOutput<false, QuantizationGranularity::TENSOR> outputProcObj(
    doNothingObj,
    C_multiplier.data(),
    C_zero_point,
    Aint8_zero_point,
    Bint8_zero_point.data(),
    nullptr, // row offsets
    col_offsets.data(),
    nullptr, // bias
    conv_p.OC,
    conv_p.G);
...
fbgemmConv(
  conv_p,
  Aint8.data(),
  packedB,
  Cint8_fb.data(),
  Cint32_fb.data(),
  outputProcObj,
  tid,
  num_threads);

Judging from the context of the code, I guess that the layout of Aint8 is NHWC, that the layout of Bint8 is KRSC, i.e., (OC, KH, KW, IC), and that Aint8_zero_point should be the value used for activation padding.
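
If that guess about the layouts is right (it is unconfirmed here), data stored as NCHW and (OC, IC, KH, KW) would need to be permuted before the call. A minimal sketch of that permutation follows; `channels_first_to_last` is a hypothetical helper written for this illustration, not an FBGEMM function.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Permutes a dense 4-D tensor from (D0, C, H, W) to (D0, H, W, C).
// With (N, C, H, W) input this yields NHWC.
// With (OC, IC, KH, KW) input this yields (OC, KH, KW, IC).
template <typename T>
std::vector<T> channels_first_to_last(
    const std::vector<T>& src, int d0, int c, int h, int w) {
  assert(src.size() == static_cast<std::size_t>(d0) * c * h * w);
  std::vector<T> dst(src.size());
  for (int i = 0; i < d0; ++i)
    for (int ch = 0; ch < c; ++ch)
      for (int y = 0; y < h; ++y)
        for (int x = 0; x < w; ++x) {
          const std::size_t src_idx =
              ((static_cast<std::size_t>(i) * c + ch) * h + y) * w + x;
          const std::size_t dst_idx =
              ((static_cast<std::size_t>(i) * h + y) * w + x) * c + ch;
          dst[dst_idx] = src[src_idx];
        }
  return dst;
}
```

The same helper covers both tensors: call it with (N, C, H, W) for the uint8 activations and with (OC, IC, KH, KW) for the int8 weights. Grouped convolutions may need a different weight arrangement, which this sketch does not address.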

I have some questions about the code snippet above:

  1. What do C_multiplier, C_zero_point, Bint8_zero_point, and col_offsets mean, what values should they take, and how should I set them in my quantized conv code when calling fbgemmConv?
  2. What exactly are the layouts of the input and output data? Are other layouts supported?
  3. What is the point of calling row_offsets_u8acc32_ref and requantize_u8acc32_ref when quantization and dequantization are done outside?
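
As background for questions 1 and 3, the standard zero-point arithmetic for a uint8 x int8 product with int32 accumulation can be sketched as below. This follows the usual identity sum((a - a_zp) * (b - b_zp)) = sum(a * b) - a_zp * colsum(B) - b_zp * rowsum(A) + K * a_zp * b_zp. All function names are hypothetical, and whether FBGEMM's col_offsets already folds in the K * a_zp * b_zp term is an assumption that has not been confirmed here.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Sum of each row of A (M x K, row-major, uint8).
std::vector<std::int32_t> row_sums(
    const std::vector<std::uint8_t>& A, int M, int K) {
  assert(A.size() == static_cast<std::size_t>(M) * K);
  std::vector<std::int32_t> sums(M, 0);
  for (int i = 0; i < M; ++i)
    for (int k = 0; k < K; ++k)
      sums[i] += A[static_cast<std::size_t>(i) * K + k];
  return sums;
}

// Sum of each column of B (K x N, row-major, int8).
std::vector<std::int32_t> col_sums(
    const std::vector<std::int8_t>& B, int K, int N) {
  assert(B.size() == static_cast<std::size_t>(K) * N);
  std::vector<std::int32_t> sums(N, 0);
  for (int k = 0; k < K; ++k)
    for (int j = 0; j < N; ++j)
      sums[j] += B[static_cast<std::size_t>(k) * N + j];
  return sums;
}

// Removes the zero-point contributions from one raw int32 accumulator.
inline std::int32_t correct_zero_points(
    std::int32_t raw, std::int32_t row_sum, std::int32_t col_sum,
    std::int32_t a_zp, std::int32_t b_zp, int K) {
  return raw - a_zp * col_sum - b_zp * row_sum + K * a_zp * b_zp;
}

// Rescales a corrected accumulator to uint8.
// multiplier is assumed to be a_scale * b_scale / c_scale.
inline std::uint8_t requantize(
    std::int32_t corrected, float multiplier, std::int32_t c_zp) {
  const long rounded =
      std::lrintf(static_cast<float>(corrected) * multiplier) + c_zp;
  return static_cast<std::uint8_t>(
      std::min<long>(255, std::max<long>(0, rounded)));
}
```

Under this reading, a caller that only wants int32 results and handles dequantization itself would need the correction step but not the final rescaling to uint8.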

Thanks!
