
PodMonitor vs ServiceMonitor what is the difference? #3119

Closed
mandreasik opened this issue Apr 3, 2020 · 29 comments

@mandreasik

Hi, I'm new to prometheus-operator and I don't see the difference between ServiceMonitor and PodMonitor.

Documentation describes them as:

ServiceMonitor, which declaratively specifies how groups of services should be monitored. The Operator automatically generates Prometheus scrape configuration based on the definition.

PodMonitor, which declaratively specifies how groups of pods should be monitored. The Operator automatically generates Prometheus scrape configuration based on the definition.

But this does not explain when to use which one.

I played with them a little bit. Both of them use selector rules to find pods.

So you can either define a PodMonitor that searches for pods with e.g. the label app: myAPP, or, if your app is behind a Service labeled e.g. app: myAPP, you can create a ServiceMonitor that will scrape metrics from all pods behind the Service with this label. Both of them will produce the same result in the Targets tab of the Prometheus UI.
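For illustration, a minimal sketch of the two variants described above (the label app: myAPP, the port name web, and the resource names are assumptions for the example):

```yaml
# PodMonitor: selects pods directly by their labels.
apiVersion: monitoring.coreos.com/v1
kind: PodMonitor
metadata:
  name: myapp
spec:
  selector:
    matchLabels:
      app: myAPP
  podMetricsEndpoints:
    - port: web   # name of the container port that exposes /metrics
---
# ServiceMonitor: selects a Service by its labels; the pods behind it are scraped.
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: myapp
spec:
  selector:
    matchLabels:
      app: myAPP
  endpoints:
    - port: web   # name of the Service port
```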

So I would really appreciate it if someone could explain to me when one should use a PodMonitor and when a ServiceMonitor.

@pgier
Contributor

pgier commented Apr 3, 2020

I think ServiceMonitor should be the default choice unless you have some reason to use a PodMonitor. For example, you may want to scrape a set of pods which all share a certain label that is not consistent between different services.

There is some helpful discussion in the original podmonitor issue #38, but I agree that the docs could be improved here.

@brancz
Contributor

brancz commented Apr 7, 2020

The way I see it, ServiceMonitors are the perfect fit if you already have a Service for your pods anyway. However, if it doesn't make sense for other reasons to have a Service for your component, then the PodMonitor is the right choice. It's essentially about avoiding (failure-prone) duplication of pod label selectors where possible.

@mandreasik
Author

@pgier @brancz Thank you for the explanation :) (now it's more clear to me when to use them)

I hope this issue will help other prometheus-operator newbies understand this matter better :)

@winningjai

@pgier @brancz,
I have a question about this.

1. Suppose I need to collect metrics from all the pods behind a particular service. Which should I prefer: PodMonitor or ServiceMonitor?

2. If I use a ServiceMonitor, will the Prometheus server scrape all the pods behind the service, or will it scrape just the service endpoint alone?

Thanks in advance for your time.

@brancz
Contributor

brancz commented May 20, 2020

It will scrape all pods behind the service, because the Service maintains an Endpoints object. That's where Prometheus discovers the targets from.
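Roughly speaking, the scrape configuration generated from a ServiceMonitor uses the endpoints role, along these lines (a simplified sketch; the real generated config contains much more relabeling, and the label and port values are assumptions):

```yaml
scrape_configs:
  - job_name: serviceMonitor/default/myapp/0   # illustrative job name
    kubernetes_sd_configs:
      - role: endpoints                        # one target per endpoint address, i.e. per pod
    relabel_configs:
      - action: keep
        source_labels: [__meta_kubernetes_service_label_app]
        regex: myAPP
      - action: keep
        source_labels: [__meta_kubernetes_endpoint_port_name]
        regex: web
```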

@yann-soubeyrand
Contributor

Hello,

I must say that I'm a little bit confused too: when talking about a ServiceMonitor, I would expect my Service to be monitored, not the associated Endpoints (otherwise I would have been looking for an EndpointsMonitor). This would be particularly useful when one has a highly available exporter and wants only one of the exporter's pods to be scraped each time, to avoid duplicate metrics and to avoid putting unnecessary load on the service that the exporter queries.

Without using the Prometheus Operator, one would use a kubernetes_sd_config with the service role in this case. Prometheus would then scrape the service IP, which would randomly hit one of the exporter's pods. That's precisely what I thought the ServiceMonitor did in the first place due to its naming (the name ServiceMonitor lets one think that it will be translated to a kubernetes_sd_config with the service role).
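For reference, a sketch of the plain Prometheus configuration being described here (the job name and the app label value are assumptions):

```yaml
scrape_configs:
  - job_name: exporter-via-service-ip   # assumed name
    kubernetes_sd_configs:
      - role: service                   # discovers Service ClusterIPs instead of pod IPs
    relabel_configs:
      - action: keep
        source_labels: [__meta_kubernetes_service_label_app]
        regex: my-exporter              # assumed label value; each scrape goes through the ClusterIP
```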

How can we achieve this with the current ServiceMonitor and PodMonitor? Is it too late to rename the current ServiceMonitor to EndpointsMonitor and create a “real” ServiceMonitor?

@paulfantom
Member

@yann-soubeyrand scraping metrics endpoints via a load balancer is an anti-pattern in Prometheus. The best practice is to scrape targets directly, and this is why a ServiceMonitor examines all Endpoints of a Service and scrapes them individually.

@yann-soubeyrand
Contributor

yann-soubeyrand commented Jul 9, 2020

@paulfantom so how do you achieve high availability for your exporters? Do you scrape all the replicas of your exporters and then get duplicated metrics and unnecessary load on your applications (Elasticsearch or PostgreSQL exporters can be quite demanding on the server they get the metrics from)?

@yann-soubeyrand
Contributor

Hi @paulfantom, that's a genuine question: I'd really like to know what the right approach is for scraping highly available exporters if scraping them through a load balancer is an anti-pattern.

@paulfantom
Member

A highly available exporter is an anti-pattern in itself, and that has the consequence described in #3119 (comment). The best practice is to have the exporter next to the instance that is being monitored.

However, if you need to do this, then additionalScrapeConfigs can be helpful.
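For anyone landing here, a rough sketch of how that is wired up on the Prometheus custom resource (the Secret name and key are placeholders):

```yaml
apiVersion: monitoring.coreos.com/v1
kind: Prometheus
metadata:
  name: k8s
spec:
  # ... the rest of the Prometheus spec ...
  additionalScrapeConfigs:
    name: additional-scrape-configs   # Secret in the same namespace as the Prometheus resource
    key: prometheus-additional.yaml   # key inside the Secret containing raw scrape_configs entries
```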

@yann-soubeyrand
Contributor

Thanks for your answer @paulfantom. In case one cannot have the exporter near the instance being monitored (like a PostgreSQL instance on AWS RDS), what's the best practice for deploying the exporter? Should one deploy a single instance of it and use a PodMonitor (and accept that there can be holes in the metrics when the exporter becomes unavailable, which can happen for various reasons, like a k8s node being drained in response to cluster scale-in)?

@paulfantom
Member

In such a case, you need to deploy at least 2 instances and deduplicate the data at the Prometheus level.

@jon-walton

Hi @paulfantom, I'm in the same situation. I have a Redis exporter that cannot be close to Redis itself (Azure managed Redis). What is the best practice for deduping data at the Prometheus level without Thanos or Cortex?

(sorry, I've searched but only came up with thanos/cortex as a solution. I'm hopeful for a short term solution while I plan out a cortex deployment)

@brancz
Contributor

brancz commented Jul 30, 2020

No matter whether you are using Cortex or Thanos, you will always want two Prometheus instances for high availability reasons. If you don't have control over the targets themselves, then I recommend still running 1 exporter per process of a system.

For query time deduplication, you can just use the Thanos querier and none of the other components, which would simplify your setup a lot if that's all you are looking for.
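As a rough sketch of that query-time deduplication (assuming both Prometheus replicas expose the Thanos Store API via sidecars and carry an external label named replica; the endpoint addresses are placeholders), the querier Deployment would carry arguments like:

```yaml
# Container args of a thanos-query Deployment (excerpt).
args:
  - query
  - --query.replica-label=replica           # external label that distinguishes the HA replicas
  - --store=prometheus-0.example.svc:10901  # assumed Thanos sidecar gRPC endpoints
  - --store=prometheus-1.example.svc:10901
```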

@yxyx520

yxyx520 commented Aug 4, 2020

> I hope this issue will help other prometheus-operator newbies understand this matter better :)

yep, it helps me a lot.

@gmintoco

gmintoco commented Nov 3, 2020

While this is an older issue, I would make the argument that exporters that Prometheus scrapes (like the SNMP exporter or the Blackbox exporter) are almost anti-patterns in themselves, in that the scraping occurs via a proxy. However, these are in widespread use because of their important functionality (actually the only functionality that makes Prometheus viable for us).

In these cases, it makes a lot of sense for these exporters to be highly available and load-balanced, as they are effectively proxies. While I agree that this load balancing is an anti-pattern, I would not consider such exporters a "service" (in the sense of monitoring other stuff via them; it makes sense to monitor the exporter itself, but that is already covered by the ServiceMonitor).

I think it would make sense to support these common use cases simply, without having to set up multiple Prometheus instances and deduplication (this also enables independent scaling of exporters and Prometheus, as duplicating Prometheus instances is quite demanding in a lot of environments), with something like a ServiceMonitor that actually scrapes the service IP.

Happy to hear feedback if that makes sense, but just my 2 cents :)

@coderanger
Contributor

> While this is an older issue, I would make the argument that exporters that Prometheus scrapes (like the SNMP exporter or the Blackbox exporter) are almost anti-patterns in themselves, in that the scraping occurs via a proxy. However, these are in widespread use because of their important functionality (actually the only functionality that makes Prometheus viable for us).
>
> In these cases, it makes a lot of sense for these exporters to be highly available and load-balanced, as they are effectively proxies. While I agree that this load balancing is an anti-pattern, I would not consider such exporters a "service" (in the sense of monitoring other stuff via them; it makes sense to monitor the exporter itself, but that is already covered by the ServiceMonitor).
>
> I think it would make sense to support these common use cases simply, without having to set up multiple Prometheus instances and deduplication (this also enables independent scaling of exporters and Prometheus, as duplicating Prometheus instances is quite demanding in a lot of environments), with something like a ServiceMonitor that actually scrapes the service IP.
>
> Happy to hear feedback if that makes sense, but just my 2 cents :)

I don't think that's related to the original question, and I think you misunderstand how ServiceMonitors work. They use the Service's Endpoints directly, not the kube-proxy load balancer itself.

@gmintoco

gmintoco commented Nov 3, 2020

> > While this is an older issue, I would make the argument that exporters that Prometheus scrapes (like the SNMP exporter or the Blackbox exporter) are almost anti-patterns in themselves, in that the scraping occurs via a proxy. However, these are in widespread use because of their important functionality (actually the only functionality that makes Prometheus viable for us).
> >
> > In these cases, it makes a lot of sense for these exporters to be highly available and load-balanced, as they are effectively proxies. While I agree that this load balancing is an anti-pattern, I would not consider such exporters a "service" (in the sense of monitoring other stuff via them; it makes sense to monitor the exporter itself, but that is already covered by the ServiceMonitor).
> >
> > I think it would make sense to support these common use cases simply, without having to set up multiple Prometheus instances and deduplication (this also enables independent scaling of exporters and Prometheus, as duplicating Prometheus instances is quite demanding in a lot of environments), with something like a ServiceMonitor that actually scrapes the service IP.
> >
> > Happy to hear feedback if that makes sense, but just my 2 cents :)
>
> I don't think that's related to the original question, and I think you misunderstand how ServiceMonitors work. They use the Service's Endpoints directly, not the kube-proxy load balancer itself.

I am talking about a ServiceMonitor-like resource that uses the service IP instead of the endpoint IPs, for the use case outlined by yann-soubeyrand and discussed with brancz. Given paulfantom's stance that this is an anti-pattern and therefore wouldn't make sense to support, I wanted to explain why I thought it might still be worth considering. I can see, however, that this might make more sense as a feature request rather than a comment here, but I thought it was relevant given the discussion above.

@v-pap

v-pap commented Nov 18, 2020

I think that one way to monitor a service without scraping data from all its pods is to use the URI of the service <service-name>.<namespace>.svc.cluster.local:<service-port> in an additional-scrape-configs secret instead of using a ServiceMonitor.
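A sketch of what such an entry in the additional-scrape-configs Secret could look like (the job name, service name, namespace, and port are placeholders):

```yaml
# Raw scrape_configs entries appended to the generated Prometheus configuration.
- job_name: postgres-exporter-via-service
  metrics_path: /metrics
  static_configs:
    - targets:
        - postgres-exporter.monitoring.svc.cluster.local:9187  # <service-name>.<namespace>.svc.cluster.local:<service-port>
```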

@lfrittelli

Although one can make a very reasonable argument that Prometheus should aim to scrape endpoints and not services, would you not say that it is at least confusing to have a CRD called "ServiceMonitor" that monitors endpoints, when the Prometheus configuration does support monitoring both endpoints and services independently? At least for educational purposes, I'd recommend renaming it, or perhaps at least providing further clarification of this fact in the ServiceMonitor documentation.

@michael1011101

> I think that one way to monitor a service without scraping data from all its pods is to use the URI of the service <service-name>.<namespace>.svc.cluster.local:<service-port> in an additional-scrape-configs secret instead of using a ServiceMonitor.

Hi @v-pap, I am confused about why Prometheus would need to monitor a service without scraping data. Why do we need to monitor a service without data scraping? :(

@joshuasimon-taulia

> I think that one way to monitor a service without scraping data from all its pods is to use the URI of the service <service-name>.<namespace>.svc.cluster.local:<service-port> in an additional-scrape-configs secret instead of using a ServiceMonitor.

Is it possible to abuse the Probe CRD to accomplish the same thing?
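For what it's worth, an untested sketch of that idea (all names are placeholders). A Probe is meant to point at a prober such as the blackbox exporter, so pointing prober.url at the exporter Service itself and overriding the default /probe path is very much a hack:

```yaml
apiVersion: monitoring.coreos.com/v1
kind: Probe
metadata:
  name: postgres-exporter-via-service
spec:
  prober:
    url: postgres-exporter.monitoring.svc.cluster.local:9187  # the Service itself, abused as the "prober"
    path: /metrics                                            # override the default /probe path
  targets:
    staticConfig:
      static:
        - postgres-exporter  # only used for the target/instance labels; the exporter ignores it
```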

@costimuraru

> The way I see it, ServiceMonitors are the perfect fit if you already have a Service for your pods anyway. However, if it doesn't make sense for other reasons to have a Service for your component, then the PodMonitor is the right choice. It's essentially about avoiding (failure-prone) duplication of pod label selectors where possible.

What happens when pods are registered in multiple Services, each with its own ServiceMonitor? The pods will be scraped multiple times, leading to duplicated metrics. In that scenario a PodMonitor might make more sense.

@brancz
Contributor

brancz commented Mar 12, 2021

This can potentially happen, yes. The recommendation is to narrow down the selection with labels when this happens.
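For example (a sketch; the monitoring: "true" label is an assumption), you could add a dedicated label to only one of the Services and match on it in the ServiceMonitor:

```yaml
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: myapp
spec:
  selector:
    matchLabels:
      app: myAPP
      monitoring: "true"   # set on only one of the Services that select these pods
  endpoints:
    - port: web
```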

@stale

stale bot commented Jun 14, 2021

This issue has been automatically marked as stale because it has not had any activity in the last 60 days. Thank you for your contributions.

@stale stale bot added the stale label Jun 14, 2021
@kong62

kong62 commented Jun 25, 2021

PodMonitor - no Service needed
ServiceMonitor - requires a Service

@stale stale bot removed the stale label Jun 25, 2021
@coderanger
Contributor

Going to close this out as I think this issue has outlived its usefulness.

@xianyuLuo

A PodMonitor doesn't need a Service ^--^

@kamazee

kamazee commented Jun 21, 2023

I have a case that hasn't been covered in the discussion, so I wonder what the best practice is for it.

There is a service which is scaled to hundreds of instances; there is a business metric that is exposed as a gauge and (under the hood) is stored in a database that all of the instances of the service can access, and the metric relates to the service as a whole, not to any specific instance. So I want Prometheus to understand that the scraper should make a single request to my service within a specified interval (there are different metrics that I plan to scrape hourly and others that I'd like to scrape daily). Although I read that scraping via a load balancer is bad practice (and I kind of accept that it is when we're talking about monitoring specific instances), are there any better options for my case?
