Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[NNPA] Warning messages for legality check #2819

Open
wants to merge 34 commits into
base: main
Choose a base branch
from

Conversation

imaihal
Copy link
Collaborator

@imaihal imaihal commented May 8, 2024

Generate report on why the operations are not run on NNPA.
The report is displayed by using --opt-report=NNPAUnsupportedOps. The format is
==NNPA-UNSUPPORTEDOPS-REPORT==, <operation name>, <node name>, <reason>

  --opt-report=<value>                                                              - Provide information on a specific compiler optimization.
    =NoReport                                                                       -   No report. Default value.
    =Parallel                                                                       -   Provide report on how OMP Parallel is applied to ONNX ops.
    =Simd                                                                           -   Provide report on how SIMD is applied to ONNX ops.
    =NNPAUnsupportedOps  

Example of xlm-roberta-base-language-detection-onnx (issue #2813):

$ onnx-mlir --O3 --mtriple=s390x-ibm-loz --mcpu=z16 --maccel=NNPA model.onnx  --EmitZHighIR --opt-report=NNPAUnsupportedOps

==NNPA-UNSUPPORTEDOPS-REPORT==, onnx.Mul, Mul_28, Element type is not F16 or F32.
==NNPA-UNSUPPORTEDOPS-REPORT==, onnx.Add, Add_31, Element type is not F16 or F32.
==NNPA-UNSUPPORTEDOPS-REPORT==, onnx.Div, Div_122, Rank 0 is not supported. zAIU only supports rank in range of (0, 4].
==NNPA-UNSUPPORTEDOPS-REPORT==, onnx.Add, Add_125, Rank 0 is not supported. zAIU only supports rank in range of (0, 4].
==NNPA-UNSUPPORTEDOPS-REPORT==, onnx.Mul, Mul_128, Rank 0 is not supported. zAIU only supports rank in range of (0, 4].
==NNPA-UNSUPPORTEDOPS-REPORT==, onnx.Div, Div_216, Rank 0 is not supported. zAIU only supports rank in range of (0, 4].
==NNPA-UNSUPPORTEDOPS-REPORT==, onnx.Add, Add_219, Rank 0 is not supported. zAIU only supports rank in range of (0, 4].
==NNPA-UNSUPPORTEDOPS-REPORT==, onnx.Mul, Mul_222, Rank 0 is not supported. zAIU only supports rank in range of (0, 4].
==NNPA-UNSUPPORTEDOPS-REPORT==, onnx.Div, Div_310, Rank 0 is not supported. zAIU only supports rank in range of (0, 4].
==NNPA-UNSUPPORTEDOPS-REPORT==, onnx.Add, Add_313, Rank 0 is not supported. zAIU only supports rank in range of (0, 4].
==NNPA-UNSUPPORTEDOPS-REPORT==, onnx.Mul, Mul_316, Rank 0 is not supported. zAIU only supports rank in range of (0, 4].
==NNPA-UNSUPPORTEDOPS-REPORT==, onnx.Div, Div_404, Rank 0 is not supported. zAIU only supports rank in range of (0, 4].
==NNPA-UNSUPPORTEDOPS-REPORT==, onnx.Add, Add_407, Rank 0 is not supported. zAIU only supports rank in range of (0, 4].
==NNPA-UNSUPPORTEDOPS-REPORT==, onnx.Mul, Mul_410, Rank 0 is not supported. zAIU only supports rank in range of (0, 4].


:
:

Signed-off-by: Haruki Imai <imaihal@jp.ibm.com>
Signed-off-by: Haruki Imai <imaihal@jp.ibm.com>
Signed-off-by: Haruki Imai <imaihal@jp.ibm.com>
@imaihal imaihal requested review from cjvolzka and tungld May 8, 2024 15:21
Copy link
Collaborator

@tungld tungld left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! It's useful. Thanks for adding this!

void emitLegalityCheckMessage(Operation *op, StringRef message) {
LLVM_DEBUG(llvm::outs() << "[NNPA Legality Check] Warning: " << op->getLoc()
<< " runs on CPU. Reason: " << message << "\n");
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you print out the operation name also? Sometime the loc is just a number and it's not easy to know which operation is.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added operation name and node name. Thanks!

Signed-off-by: Haruki Imai <imaihal@jp.ibm.com>
Signed-off-by: Haruki Imai <imaihal@jp.ibm.com>
bool returnVal = (shapeA[1] == shapeB[0]);
if (!returnVal)
emitLegalityCheckMessage(
op.getOperation(), "Unstacked case, dim A 1 not equal dim B 0.");
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since we know the dimension size in this case, you can print out shapeA[1] and shapeB[0] as well, say Unstacked case, the 2nd dim of A (5) and the 1st dim of B (7) are not the same. if shapeA[1] =5 and shapeB[0] = 7`. Same for other static-shape cases.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. Thanks!

Signed-off-by: Haruki Imai <imaihal@jp.ibm.com>
Signed-off-by: Haruki Imai <imaihal@jp.ibm.com>
Signed-off-by: Haruki Imai <imaihal@jp.ibm.com>
} else {
std::string message = "Element type is not F16 or F32.";
emitLegalityCheckMessage(op, StringRef(message));
return false;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I recommend to change emitLegalityCheckMessage to return false, e.g bool emitLegalityCheckMessage(Operation *op, StringRef message), and then we can write:

return emitLegalityCheckMessage(op, StringRef(message));

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. Thanks. Also changed the function name.

return ((rank > 0) && (rank <= 4));
if ((rank > 0) && (rank <= 4)) {
return true;
} else {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since we have return in the if part, the else is not necessary. For example:

if ((rank > 0) && (rank <= 4))
     return true;

std::string message =
            "Rank " + std::to_string(rank) +
            " is not supported. zAIU only supports rank in range of (0, 4].";
return emitWarningMessageNNPAUnsupported(op, StringRef(message));

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. Thanks!

}
}
return true;
}

bool emitWarningMessageNNPAUnsupported(Operation *op, StringRef message) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we really need to use StringRef instead of just const std::string&? I don't see any operation on message and passing a pointer looks faster than constructing a StringRef.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated. Thanks!

Signed-off-by: Haruki Imai <imaihal@jp.ibm.com>
Signed-off-by: Haruki Imai <imaihal@jp.ibm.com>
Signed-off-by: Haruki Imai <imaihal@jp.ibm.com>
Signed-off-by: Haruki Imai <imaihal@jp.ibm.com>
return !emitWarningMessageNNPAUnsupported(
op.getOperation(), message); /* true */

return false;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This can be shorten in this way:

return (!sameBatchDims) || emitWarningMessageNNPAUnsupported(op.getOperation(), message);

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated. Thanks!

Signed-off-by: Haruki Imai <imaihal@jp.ibm.com>
Signed-off-by: Haruki Imai <imaihal@jp.ibm.com>
@imaihal
Copy link
Collaborator Author

imaihal commented May 23, 2024

@tungld I created an option for this reporting, not using --debug-only. This is the same approach as --opt-report option. Is this acceptable?

@tungld
Copy link
Collaborator

tungld commented May 23, 2024

@tungld I created an option for this reporting, not using --debug-only. This is the same approach as --opt-report option. Is this acceptable?

If you take the approach of --opt-report, I recommend to add an enum value into the existing --opt-report, say --opt-report=NNPA. You can see --profile-ir to know how to mix CPU and NNPA values in a single option.

Signed-off-by: Haruki Imai <imaihal@jp.ibm.com>
@imaihal
Copy link
Collaborator Author

imaihal commented May 24, 2024

If you take the approach of --opt-report, I recommend to add an enum value into the existing --opt-report,

I updated to use --opt-report adding new enum value (NNPAUnsupportedOps). Thanks!

@tungld
Copy link
Collaborator

tungld commented May 24, 2024

I updated to use --opt-report adding new enum value (NNPAUnsupportedOps). Thanks!

When it's ready for another round of review, please let me know. Thanks!

@imaihal imaihal requested a review from tungld May 24, 2024 07:43
@imaihal
Copy link
Collaborator Author

imaihal commented May 24, 2024

Yes. ready for review.

Copy link
Collaborator

@tungld tungld left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for updating the patch. It's much better. Just some minor comments, but the most important thing is updating the document. And, feel free to merge the patch then.

In your example of roberta, I see this

==NNPA-UNSUPPORTEDOPS-REPORT== onnx.Softmax, /roberta/encoder/layer.0/attention/self/Softmax_176, The rank of input tensor (4) needs to be 2 or 3.

It seemed it was not applied in this model. Do you know what was the reason? Softmax with rank 4 is potentially rewritten to rank 2 by RewriteONNXForZHigh pass. Perhaps axis was not the last dimension in this model ...

@@ -51,6 +51,10 @@

#define ACCEL_PROFILEIR_CL_ENUM(name) PROFILEIR_CL_ENUM_##name

#define ACCEL_OPTREPORT_ENUM(name) OPTREPORT_ENUM_##name

#define ACCEL_OPTREPORT_CL_ENUM(name) OPTREPORT_CL_ENUM_##name
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you update this document to include the new macro? Thanks!

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. Thanks!

bool emitWarningMessageNNPAUnsupported(
mlir::Operation *op, const std::string &message);

bool emitWarningMessageInCompatibility(mlir::Operation *op);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would you mind if we change these function names to make it consistent with functions for other --op-report options? We have onnxToKrnlSimdReport and onnxToKrnlParallelReport, so how about onnxToZHighUnsupportedReport? or feel free to choose another name.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. Thanks!

Copy link
Collaborator

@AlexandreEichenberger AlexandreEichenberger left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you add a , after the option identifier, namely transforming

==NNPA-UNSUPPORTEDOPS-REPORT== onnx.Mul, /roberta/embeddings/Mul, Element type is not F16 or F32.

to

==NNPA-UNSUPPORTEDOPS-REPORT==,  onnx.Mul, /roberta/embeddings/Mul, Element type is not F16 or F32.

so that is is like all of the other perf measurements; it let you do CSV imports more easily.

Great addition, thanks.

Copy link
Collaborator

@AlexandreEichenberger AlexandreEichenberger left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approving the PR, but please add the missing comma in the printout to be compatible with the other tools.

@imaihal
Copy link
Collaborator Author

imaihal commented Jun 5, 2024

@tungld

In your example of roberta, I see this

==NNPA-UNSUPPORTEDOPS-REPORT== onnx.Softmax, /roberta/encoder/layer.0/attention/self/Softmax_176, The rank of input tensor (4) needs to be 2 or 3.

It seemed it was not applied in this model. Do you know what was the reason? Softmax with rank 4 is potentially rewritten to rank 2 by RewriteONNXForZHigh pass. Perhaps axis was not the last dimension in this model ...

I confirmed all softmax ops are lowered to zhigh.softmax in ZHighIR.

I found this message was generated in DevicePlacementPass which also calls legality check. In the DevicePlacementPass, is ONNXToZHighPass called after RewriteONNXForZHighPass? If not and ONNXToZHighPass checks original softmax op, I think this message is generated.

Signed-off-by: Haruki Imai <imaihal@jp.ibm.com>
Signed-off-by: Haruki Imai <imaihal@jp.ibm.com>
@imaihal
Copy link
Collaborator Author

imaihal commented Jun 5, 2024

Could you add a , after the option identifier, namely transforming

==NNPA-UNSUPPORTEDOPS-REPORT== onnx.Mul, /roberta/embeddings/Mul, Element type is not F16 or F32.

to

==NNPA-UNSUPPORTEDOPS-REPORT==,  onnx.Mul, /roberta/embeddings/Mul, Element type is not F16 or F32.

Done. Thanks!

@tungld
Copy link
Collaborator

tungld commented Jun 5, 2024

I confirmed all softmax ops are lowered to zhigh.softmax in ZHighIR.

So the printed message was incorrect. I think you would fix that before merging.

is ONNXToZHighPass called after RewriteONNXForZHighPass?

Yes, it is in the code, but they are independent calls and used for analysis. The results of each call will be merged later. You can take a look at: https://github.com/onnx/onnx-mlir/blob/main/src/Accelerators/NNPA/Conversion/ONNXToZHigh/DevicePlacement.cpp#L196

Signed-off-by: Haruki Imai <imaihal@jp.ibm.com>
@imaihal
Copy link
Collaborator Author

imaihal commented Jun 5, 2024

@tungld

Thanks for the comment!. I updated the PR to enable the reporting only in ONNXToZHighPass and disable it in DevicePlacementPass. This removes reporting about Softmax, as described in updated description of this PR.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants