Ensure we can inject GPU and CPU OOMs separately. Currently we inject a single OOM, and the first allocation that happens, whether from the host or the GPU, triggers it.
We would also like to add options so that the OOM is not always injected on the first allocation. An option to inject on the Nth allocation would be good. I believe there was talk about randomizing which allocation fails; I am not sure how that would work if a unit test depends on it, but I am adding it here for consideration.
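To make the two requests concrete, here is a minimal sketch of how separate CPU/GPU injection with an "Nth allocation" option could be tracked. Everything in it is hypothetical: `OomInjectionSketch`, `MemKind`, `inject`, and `shouldFail` are names made up for illustration and are not part of `SparkResourceAdaptor`. It also ignores the per-thread bookkeeping a real implementation would likely need.

```java
import java.util.EnumMap;
import java.util.Map;

/** Hypothetical sketch only; not the SparkResourceAdaptor API. */
final class OomInjectionSketch {
  /** Which allocator an injection applies to (assumed split: host vs. device). */
  public enum MemKind { CPU, GPU }

  /** A pending injection: let some allocations through, then fail some. */
  private static final class Injection {
    int skipCount; // matching allocations to let succeed first
    int failCount; // matching allocations still to fail after skipping

    Injection(int skipCount, int failCount) {
      this.skipCount = skipCount;
      this.failCount = failCount;
    }
  }

  // One independent pending injection per memory kind.
  private final Map<MemKind, Injection> pending = new EnumMap<>(MemKind.class);

  /**
   * Request an injected OOM for one memory kind only.
   * skipCount = N - 1 makes the Nth matching allocation fail.
   */
  public synchronized void inject(MemKind kind, int skipCount, int failCount) {
    if (skipCount < 0 || failCount <= 0) {
      throw new IllegalArgumentException("skipCount must be >= 0 and failCount > 0");
    }
    pending.put(kind, new Injection(skipCount, failCount));
  }

  /** Called once per allocation attempt; true means "fail this one with an OOM". */
  public synchronized boolean shouldFail(MemKind kind) {
    Injection inj = pending.get(kind);
    if (inj == null) {
      return false; // nothing injected for this kind
    }
    if (inj.skipCount > 0) {
      inj.skipCount--; // not the target allocation yet
      return false;
    }
    inj.failCount--;
    if (inj.failCount == 0) {
      pending.remove(kind); // injection fully consumed
    }
    return true;
  }
}
```

With this shape, `inject(MemKind.GPU, 2, 1)` would leave host allocations untouched and fail only the third GPU allocation. If randomization were added, deriving the skip count from a logged seed would be one way to keep unit tests reproducible; that is a suggestion for discussion, not something decided here.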
With the addition of #1543, and even before it, we have been thinking about improving the OOM injection mechanism:
spark-rapids-jni/src/main/java/com/nvidia/spark/rapids/jni/SparkResourceAdaptor.java
Lines 185 to 211 in 9c3c7a6
We would like to do two things: