
Bring your own Spark model #2080

Open
brandongreenwell-8451 opened this issue Sep 21, 2023 · 3 comments
@brandongreenwell-8451

Is your feature request related to a problem? Please describe.
I'm wondering if it would be possible to "bring your own" Spark model for use with the interpretability functions, like ICETransformer(). For instance, we have several tools that let us run inference on Spark data frames by calling a .predict() or .transform() method. Is it possible to wrap such a non-PySpark MLlib model so that we could still use this package for generating explanations and ICE plots in Spark?

Describe the solution you'd like
Suppose I have a custom model (e.g., some kind of scoring code that operates on Spark data frames by adding prediction columns). I'd like to wrap that model in a way that lets me pass it to ICETransformer(). E.g.,

pdp = ICETransformer(
    model=custom_model,  # custom_model.transform(spark_data_frame) 
    targetCol="prediction_col_name",
    kind="average",
    targetClasses=[1],
    categoricalFeatures=categorical_features,
    numericFeatures=numeric_features,
)

where custom_model has a .transform() method, similar to PySpark MLlib models, that returns the input data frame with additional prediction columns.
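To make the request concrete, here is a minimal sketch of the kind of wrapper I have in mind. The class name CustomModelAdapter and the score_method parameter are purely illustrative, not SynapseML API, and whether SynapseML's explainers would accept such an object is exactly the question of this issue:

```python
# Hypothetical sketch only: a thin duck-typed adapter that exposes the
# .transform(df) interface ICETransformer expects, delegating to whatever
# scoring method the wrapped model actually provides (e.g. .predict()).
class CustomModelAdapter:
    """Adapts any object with a DataFrame-in/DataFrame-out scoring
    method to a Transformer-style .transform() interface."""

    def __init__(self, model, score_method="predict"):
        self._model = model
        self._score_method = score_method

    def transform(self, df):
        # Delegate scoring; the wrapped method is expected to return the
        # input data frame with additional prediction columns appended.
        return getattr(self._model, self._score_method)(df)
```

Note this adapter lives entirely on the Python side; it would not by itself make the model visible to any JVM-side code.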

@github-actions

Hey @brandongreenwell-8451 👋!
Thank you so much for reporting the issue/feature request 🚨.
Someone from SynapseML Team will be looking to triage this issue soon.
We appreciate your patience.

@memoryz
Contributor

memoryz commented Oct 30, 2023

@brandongreenwell-8451 sorry for the delayed response - I just saw this question.

I assume you have a custom model that is not implemented as a Spark Transformer object. If that's the case, I'm afraid the current implementation of the explainers does not support such a scenario. The explainer logic is implemented entirely in Scala, and I don't see a way to bring a Python model to the JVM side for interpretation.

@brandongreenwell-8451
Author

brandongreenwell-8451 commented Oct 31, 2023

Hi @memoryz, thanks for the reply, and that makes sense. I was asking more about models that do operate on Spark data frames via a .predict() or .transform() method. For example, DataRobot and H2O both provide scoring code that can be used to make predictions on Spark data frames (e.g., in Python/PySpark). Is it possible to create a wrapper of some sort that would let us use it with some of SynapseML's RAI functions, like ICE curves?

Example code: https://datarobot.github.io/datarobot-predict/1.5/scoring_code_spark/
