[QST] use FastLinearCombinationClamp to convert half accumulator to int8_t output? #1516

ken012git · 2024-04-30T03:55:45Z

Hello, can FastLinearCombinationClamp convert a half_t accumulator to int8_t output, or does it only support an int32_t accumulator with int8_t output? Thanks!

  using ElementInputA = cutlass::half_t; // <- data type of elements in input matrix A
  using ElementInputB = cutlass::half_t; // <- data type of elements in input matrix B
  using ElementAccumulator = cutlass::half_t; // <- data type of the accumulator
  using ElementOutput = int8_t;               // <- data type of elements in output matrix

  using LayoutInputA = cutlass::layout::RowMajor;
  using LayoutInputB = cutlass::layout::ColumnMajor;
  using LayoutOutput = cutlass::layout::RowMajor;

  using Gemm = cutlass::gemm::device::Gemm<
      ElementInputA, LayoutInputA,
      ElementInputB, LayoutInputB,
      ElementOutput, LayoutOutput,
      ElementAccumulator,
      cutlass::arch::OpClassTensorOp,          // <- use Tensor Cores
      cutlass::arch::Sm80,                     // <- target architecture
      cutlass::gemm::GemmShape<128, 128, 32>,  // <- threadblock tile
      cutlass::gemm::GemmShape<64, 64, 32>,    // <- warp tile
      cutlass::gemm::GemmShape<16, 8, 16>,     // <- MMA instruction shape
      cutlass::epilogue::thread::FastLinearCombinationClamp<  // <- epilogue functor
          ElementOutput, 128 / cutlass::sizeof_bits<ElementOutput>::value>
    >;

Using FastLinearCombinationClamp to convert a half_t accumulator to int8_t output fails to compile with the following error:

cutlass/include/cutlass/epilogue/threadblock/epilogue.h(261): error: no instance of overloaded function "cutlass::epilogue::thread::FastLinearCombinationClamp<ElementOutput_, Count, Scale, Round>::operator() [with ElementOutput_=int8_t, Count=16, Scale=cutlass::epilogue::thread::ScaleType::Default, Round=cutlass::FloatRoundStyle::round_to_nearest]" matches the argument list
            argument types are: (const cutlass::Array<cutlass::half_t, 16, false>, const cutlass::Array<int8_t, 16, false>)
            object type is: const cutlass::epilogue::thread::FastLinearCombinationClamp<int8_t, 16, cutlass::epilogue::thread::ScaleType::Default, cutlass::FloatRoundStyle::round_to_nearest>
          output_frag_ptr[i] = output_op(compute_frag_ptr[i], source_frag_ptr[i]);
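
The argument types in the error show that the epilogue functor is being called with an `Array<half_t, 16>` accumulator fragment and that no `operator()` accepts it. As far as I can tell from `linear_combination_clamp.h`, FastLinearCombinationClamp fixes its accumulator element type to `int` instead of taking it as a template parameter; please treat that as an assumption to check against your CUTLASS version. The sketch below reproduces this kind of mismatch in plain C++17 without CUTLASS. `Half`, `Fragment`, and `IntOnlyClampOp` are hypothetical stand-ins, not CUTLASS types, and the arithmetic is simplified to alpha = beta = 1.

```cpp
#include <algorithm>
#include <array>
#include <cstdint>
#include <limits>
#include <type_traits>

// Hypothetical stand-in for cutlass::half_t (not the real type).
struct Half { float value; };

// Hypothetical stand-in for cutlass::Array<T, N>.
template <typename T, int N>
struct Fragment { std::array<T, N> data; };

// Hypothetical epilogue functor whose accumulator element type is fixed to int,
// mirroring the assumption made about FastLinearCombinationClamp above.
template <typename ElementOutput, int Count>
struct IntOnlyClampOp {
  using FragmentAccumulator = Fragment<int, Count>;
  using FragmentOutput = Fragment<ElementOutput, Count>;

  // Computes clamp(acc + src) into the range of ElementOutput (alpha = beta = 1).
  FragmentOutput operator()(FragmentAccumulator const& acc,
                            FragmentOutput const& src) const {
    constexpr int lo = static_cast<int>(std::numeric_limits<ElementOutput>::lowest());
    constexpr int hi = static_cast<int>(std::numeric_limits<ElementOutput>::max());
    FragmentOutput out{};
    for (int i = 0; i < Count; ++i) {
      int v = acc.data[i] + static_cast<int>(src.data[i]);
      out.data[i] = static_cast<ElementOutput>(std::clamp(v, lo, hi));
    }
    return out;
  }
};

using Op = IntOnlyClampOp<std::int8_t, 16>;

// An int accumulator fragment matches the only available operator().
constexpr bool accepts_int_accumulators =
    std::is_invocable_v<Op const&, Fragment<int, 16> const&,
                        Fragment<std::int8_t, 16> const&>;

// A Half accumulator fragment does not: Fragment<Half, 16> has no conversion
// to Fragment<int, 16>, so overload resolution finds no match.
constexpr bool accepts_half_accumulators =
    std::is_invocable_v<Op const&, Fragment<Half, 16> const&,
                        Fragment<std::int8_t, 16> const&>;

static_assert(accepts_int_accumulators, "int accumulators are accepted");
static_assert(!accepts_half_accumulators, "Half accumulators are rejected");
```

If that reading of the header is right, an epilogue functor that takes the accumulator type as a template parameter, such as `cutlass::epilogue::thread::LinearCombinationClamp<ElementOutput, Count, ElementAccumulator, ElementCompute>`, may be the one to try. Whether it compiles and gives correct results for a half_t accumulator with int8_t output is not confirmed anywhere in this thread.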


github-actions · 2024-05-30T04:06:12Z

This issue has been labeled inactive-30d due to no recent activity in the past 30 days. Please close this issue if no further response or action is needed. Otherwise, please respond with a comment indicating any updates or changes to the original issue and/or confirm this issue still needs to be addressed. This issue will be labeled inactive-90d if there is no activity in the next 60 days.

ken012git added the "? - Needs Triage" and "question" labels Apr 30, 2024

github-actions bot added the inactive-30d label May 30, 2024
