How to apply EnvoyFilter to InferenceService? #3100
shauryagoel asked this question in Q&A
I use |
Here is my KServe service:

---
apiVersion: serving.kserve.io/v1beta1
kind: InferenceService
metadata:
  name: sklearn-iris
  namespace: kserve-test
  labels:
    app: sklearn-iris
spec:
  predictor:
    model:
      modelFormat:
        name: sklearn
      storageUri: gs://kfserving-examples/models/sklearn/1.0/model

Here is my rate limit EnvoyFilter:

---
apiVersion: networking.istio.io/v1alpha3
kind: EnvoyFilter
metadata:
  name: filter-iris-local-ratelimit-svc
  namespace: istio-system
  # namespace: kserve-test
spec:
  workloadSelector:
    labels:
      app: sklearn-iris
      # istio: ingressgateway
  configPatches:
    - applyTo: HTTP_FILTER
      match:
        context: SIDECAR_INBOUND
        listener:
          filterChain:
            filter:
              name: envoy.filters.network.http_connection_manager
      patch:
        operation: INSERT_BEFORE
        value:
          name: envoy.filters.http.local_ratelimit
          typed_config:
            '@type': type.googleapis.com/udpa.type.v1.TypedStruct
            type_url: type.googleapis.com/envoy.extensions.filters.http.local_ratelimit.v3.LocalRateLimit
            value:
              stat_prefix: http_local_rate_limiter
              # enable_x_ratelimit_headers: DRAFT_VERSION_03
              token_bucket:
                max_tokens: 2
                tokens_per_fill: 1
                fill_interval: 10s
              filter_enabled:
                runtime_key: local_rate_limit_enabled
                default_value:
                  numerator: 100
                  denominator: HUNDRED
              filter_enforced:
                runtime_key: local_rate_limit_enforced
                default_value:
                  numerator: 100
                  denominator: HUNDRED
              response_headers_to_add:
                - append: false
                  header:
                    key: x-local-rate-limit
                    value: 'true'

I am hitting the KServe service with continuous curl requests, but the service is not rate limited; I keep getting a 200 response code.
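
For reference, here is roughly what the token_bucket settings above (max_tokens: 2, tokens_per_fill: 1, fill_interval: 10s) should produce if the filter were actually applied. This is a simplified Python simulation of a token bucket, not Envoy's implementation; the TokenBucket class and the simulate helper are hypothetical names used only for illustration, and it assumes rejected requests get Envoy's default 429 status.

```python
from dataclasses import dataclass


@dataclass
class TokenBucket:
    """Simplified token bucket; an approximation, not Envoy's actual code."""
    max_tokens: int
    tokens_per_fill: int
    fill_interval: float  # seconds

    def __post_init__(self) -> None:
        # The bucket starts full, and refills are counted from time 0.
        self.tokens = self.max_tokens
        self.last_fill = 0.0

    def allow(self, now: float) -> bool:
        # Credit tokens for every whole fill interval that has elapsed.
        intervals = int((now - self.last_fill) // self.fill_interval)
        if intervals > 0:
            refill = intervals * self.tokens_per_fill
            self.tokens = min(self.max_tokens, self.tokens + refill)
            self.last_fill += intervals * self.fill_interval
        # Spend one token per request if any are left.
        if self.tokens > 0:
            self.tokens -= 1
            return True
        return False


def simulate(request_times):
    """Return the status code each request would get (200 or 429)."""
    bucket = TokenBucket(max_tokens=2, tokens_per_fill=1, fill_interval=10.0)
    return [200 if bucket.allow(t) else 429 for t in request_times]


if __name__ == "__main__":
    # One request per second for 25 seconds, like a curl loop.
    print(simulate(range(25)))
```

Under these assumptions, a continuous curl loop should see two 200 responses, then 429 responses with one more 200 roughly every 10 seconds. Getting only 200 responses suggests the filter is not being applied to the pod at all, rather than being misconfigured.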

I want to apply rate limiting, authentication, authorization, gzip compression/decompression of payloads, etc. to my KServe service. Some of these features are available in Istio via EnvoyFilter and some through other methods. However, I am not able to use these features. For example, rate limiting does not work on my InferenceService, but it works on a custom service that I deployed by creating a Deployment and a Service.

I believe this is because the sidecar container is running a modified version of the Envoy proxy. Is there any way to use EnvoyFilter with KServe? Or is there some other way to use these features in KServe?
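
One thing that may be worth ruling out is the workloadSelector: an EnvoyFilter only applies to pods whose labels include every label in the selector. The Python sketch below illustrates that matching rule only. The selector_matches helper is a hypothetical name, and the label sets are made up for illustration; the real labels on the InferenceService pods would need to be checked, for example with kubectl get pods -n kserve-test --show-labels.

```python
def selector_matches(selector_labels: dict, pod_labels: dict) -> bool:
    """True if every selector label is present on the pod with the same value."""
    return all(pod_labels.get(key) == value
               for key, value in selector_labels.items())


# Selector taken from the EnvoyFilter in this thread.
selector = {"app": "sklearn-iris"}

# Hypothetical pod label sets, for illustration only.
deployment_pod = {"app": "sklearn-iris", "version": "v1"}
other_pod = {"component": "predictor"}

if __name__ == "__main__":
    print(selector_matches(selector, deployment_pod))  # selector label present
    print(selector_matches(selector, other_pod))       # selector label missing
```

If the generated pods turn out not to carry app: sklearn-iris, the filter would be skipped silently, which would look the same as the behavior described here. This is only one possible explanation, not a confirmed cause.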