Prediction explanations: display Index ID for best/worst prediction explanations #1119
Comments
@gsheni what do you mean by "the Index of the explanations"? What is the "index" in this context? Are you asking for this information to a) appear visually (which I believe it already does), or b) appear in the new JSON response @freddyaboulton is adding, in which case why not just use the position in the list as the index?
Each explanation is a reference to an element (instance) in X. That instance should have an index, and I think that index should be in the report. I don't believe this information appears visually; I did not see it in the docs.
@gsheni is your example right? For a single explanation, there should only be one index value. Why do you have it different for each feature?
@gsheni ah, so you want to know, for each prediction explanation which was generated, what was the index in the features dataframe? If so, doesn't the caller always know that index? Because in order to call prediction explanations, you need to pass in some rows. And in order to pass in rows, you need to select which rows to pass in :)
You are right that we don't currently show the feature DF index value in the prediction explanations returned by evalml. I guess I hadn't considered that as adding value, since the caller has to know that info in order to call. If I'm misunderstanding, please let me know.
@kmax12 yes, you are right. Fixed the printout example. @dsherry Yes, I suppose the caller could get that information if they wanted to. It would require the caller to re-run the following (outside of `explain_predictions_best_worst`) for regression:

```python
y_pred = pipeline.predict(input_features)
errors = metric(y_true, y_pred)
sorted_scores = errors.sort_values()
best = sorted_scores.index[:num_to_explain]
worst = sorted_scores.index[-num_to_explain:]
```
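The workaround above can be run end to end with plain pandas. This is a minimal sketch, assuming a hypothetical per-row absolute error in place of the pipeline's `metric`, small made-up `y_true`/`y_pred` series with a user-supplied ID index, and `num_to_explain = 2`:

```python
import pandas as pd

num_to_explain = 2

# Hypothetical targets and predictions, indexed by a user-supplied ID column.
y_true = pd.Series([10.0, 12.0, 9.0, 15.0, 11.0], index=[101, 102, 103, 104, 105])
y_pred = pd.Series([10.5, 20.0, 9.1, 14.0, 2.0], index=y_true.index)

# Per-row absolute error stands in for the pipeline's metric.
errors = (y_true - y_pred).abs()

# Sorting ascending puts the best (lowest-error) rows first
# and the worst (highest-error) rows last.
sorted_scores = errors.sort_values()
best = sorted_scores.index[:num_to_explain]
worst = sorted_scores.index[-num_to_explain:]

print(list(best))   # → [103, 101]
print(list(worst))  # → [102, 105]
```

The point of the issue is that this bookkeeping already happens inside `explain_predictions_best_worst`, so the caller shouldn't have to duplicate it just to recover the row IDs.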
@gsheni Yeah, you're right. I agree that adding the index value to the output of
Thanks all for the clarification. Yep, agreed. I put this in the icebox because I don't think it's high priority. If anyone feels differently, let's talk.
Goal
To help users understand the best and worst predictions, the API should let them see the index of each explanation.
If the user provides an index column (perhaps through DataTables), that index ID could be displayed in the report output by `explain_predictions_best_worst`. If the user doesn't provide an index column, no index ID should be displayed in the report.
Proposal
Add the index ID to the prediction explanation if the user provides an index column in X.
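A minimal sketch of the proposed behavior: if X carries a user-supplied index, each explained row's report entry records it; if X has only the default `RangeIndex`, the entry omits it. The function name `attach_index_ids`, the `"position"`/`"index_id"` report fields, and the report structure are hypothetical, not evalml's actual output:

```python
import pandas as pd

def attach_index_ids(X, explanation_positions):
    """Return one report entry per explained row, carrying X's index
    value only when the user supplied one (non-default RangeIndex)."""
    has_user_index = not isinstance(X.index, pd.RangeIndex)
    entries = []
    for pos in explanation_positions:
        entry = {"position": pos}
        if has_user_index:
            entry["index_id"] = X.index[pos]
        entries.append(entry)
    return entries

# With a user-supplied index, the ID appears in each entry.
X = pd.DataFrame({"a": [1, 2, 3]}, index=["row_a", "row_b", "row_c"])
print(attach_index_ids(X, [2, 0]))
# → [{'position': 2, 'index_id': 'row_c'}, {'position': 0, 'index_id': 'row_a'}]

# With the default RangeIndex, no index ID is reported.
X_default = pd.DataFrame({"a": [1, 2, 3]})
print(attach_index_ids(X_default, [1]))
# → [{'position': 1}]
```

This keeps the report unchanged for callers who never set an index, while surfacing the ID for those who do.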
Note the Index ID: