[Bug]:"Failed to init storage factory","error":"failed to create primary Elasticsearch client: health check timeout: no Elasticsearch node available" #443

Open
navin-rai opened this issue Feb 1, 2023 · 6 comments
Labels
bug Something isn't working

Comments

@navin-rai

What happened?

I am using AWS Elasticsearch and trying to use it as the storage backend for Jaeger. I have set the endpoints as per the documentation. I am using the latest version of the Jaeger Helm chart (apiVersion: v2, appVersion: 1.39.0). Below is the config I am using for Elasticsearch:

elasticsearch:
  scheme: https
  host: search-*********************.us-east-1.es.amazonaws.com
  port: 443
  user: elastic
  usePassword: true
  password: *********

Steps to reproduce

Add AWS ES endpoints
helm install jaeger

Expected behavior

Jaeger Collector and Jaeger Query should deploy properly on Kubernetes.

Relevant log output

2023/02/01 13:18:20 maxprocs: Leaving GOMAXPROCS=24: CPU quota undefined
{"level":"info","ts":1675257500.4474466,"caller":"flags/service.go:119","msg":"Mounting metrics handler on admin server","route":"/metrics"}
{"level":"info","ts":1675257500.447506,"caller":"flags/service.go:125","msg":"Mounting expvar handler on admin server","route":"/debug/vars"}
{"level":"info","ts":1675257500.4477003,"caller":"flags/admin.go:129","msg":"Mounting health check on admin server","route":"/"}
{"level":"info","ts":1675257500.4477556,"caller":"flags/admin.go:143","msg":"Starting admin HTTP server","http-addr":":14269"}
{"level":"info","ts":1675257500.447806,"caller":"flags/admin.go:121","msg":"Admin server started","http.host-port":"[::]:14269","health-status":"unavailable"}
{"level":"fatal","ts":1675257506.1148498,"caller":"./main.go:82","msg":"Failed to init storage factory","error":"failed to create primary Elasticsearch client: health check timeout: no Elasticsearch node available","stacktrace":"main.main.func1\n\t./main.go:82\ngithub.com/spf13/cobra.(*Command).execute\n\tgithub.com/spf13/cobra@v1.6.1/command.go:916\ngithub.com/spf13/cobra.(*Command).ExecuteC\n\tgithub.com/spf13/cobra@v1.6.1/command.go:1044\ngithub.com/spf13/cobra.(*Command).Execute\n\tgithub.com/spf13/cobra@v1.6.1/command.go:968\nmain.main\n\t./main.go:155\nruntime.main\n\truntime/proc.go:250"}

Screenshot

No response

Additional context

No response

Jaeger backend version

No response

SDK

No response

Pipeline

No response

Storage backend

AWS Elasticsearch

Operating system

No response

Deployment model

No response

Deployment configs

No response

@navin-rai navin-rai added the bug Something isn't working label Feb 1, 2023
@mehta-ankit
Member

@navin-rai is your issue similar to #441, by any chance?
I don't use OpenSearch, so I don't know what could go wrong when using it with Jaeger deployed via this Helm chart.

@navin-rai
Author

> @navin-rai is your issue similar to this one: #441 by any chance ? I don't use opensearch so I don't know what could go wrong with it when using it with jaeger deployed using this helm chart.

I tried the solution given in the PR, but it didn't work.

@klubi
Contributor

klubi commented Feb 1, 2023

@navin-rai did you enable fine-grained access control on the OpenSearch domain? If yes, then proper credentials must be provided; if not, then you can't pass a username or password as environment variables.
Another thing is AWS-level policies: can you confirm that pods running in your cluster are able to correctly resolve the OpenSearch address?
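Both questions above can be checked from inside the cluster before touching the chart. The following is only a minimal sketch using the Python standard library: the function names `resolve` and `probe` are made up for illustration, and the script assumes it runs in a pod (or on a node) with the same network path as the Jaeger collector. A `None` result points at DNS, network, or TLS problems; a 401 or 403 means the domain is reachable but the credentials or the access policy were rejected.

```python
import base64
import socket
import ssl
import sys
import urllib.error
import urllib.request


def resolve(host, port=443):
    """Return the sorted IP addresses the host resolves to, or [] on DNS failure."""
    try:
        infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    except socket.gaierror:
        return []
    return sorted({info[4][0] for info in infos})


def probe(url, user=None, password=None, timeout=5.0):
    """GET the given URL and return the HTTP status code, or None if unreachable."""
    request = urllib.request.Request(url)
    if user is not None:
        # HTTP basic auth, only meaningful when fine-grained access control is enabled.
        token = base64.b64encode(f"{user}:{password}".encode()).decode()
        request.add_header("Authorization", f"Basic {token}")
    try:
        context = ssl.create_default_context()
        with urllib.request.urlopen(request, timeout=timeout, context=context) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        return err.code  # reachable, but the server answered with an error status
    except (urllib.error.URLError, OSError):
        return None  # DNS, TCP, or TLS failure


if __name__ == "__main__":
    # Usage (hypothetical file name): python check_es.py <host> [user] [password]
    host = sys.argv[1]
    credentials = sys.argv[2:4] if len(sys.argv) >= 4 else (None, None)
    print("resolved addresses:", resolve(host))
    print("HTTP status:", probe(f"https://{host}:443/", *credentials))
```

Running it once with and once without credentials shows which of the two cases described above applies to the domain.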

@navin-rai
Author

@klubi, hi. Here is what I am trying to do: I have created an AWS ES domain, but my Jaeger instance is on-prem, not on AWS. The solution you provided gives me the manifest below for collector-deployment (the one for query-deployment is similar):

Source: jaeger/templates/collector-deploy.yaml

apiVersion: apps/v1
kind: Deployment
metadata:
  name: jaeger-collector
  labels:
    helm.sh/chart: jaeger-0.67.0
    app.kubernetes.io/name: jaeger
    app.kubernetes.io/instance: jaeger
    app.kubernetes.io/version: "1.39.0"
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/component: collector
spec:
  selector:
    matchLabels:
      app.kubernetes.io/name: jaeger
      app.kubernetes.io/instance: jaeger
      app.kubernetes.io/component: collector
  template:
    metadata:
      annotations:
        checksum/config-env: dba5166ad9db9ba648c1032ebbd34dcd0d085b50023b839ef5c68ca1db93a563
      labels:
        app.kubernetes.io/name: jaeger
        app.kubernetes.io/instance: jaeger
        app.kubernetes.io/component: collector
    spec:
      securityContext:
        {}
      serviceAccountName: jaeger-collector
      containers:
      - name: jaeger-collector
        securityContext:
          {}
        image: jaegertracing/jaeger-collector:1.39.0
        imagePullPolicy: IfNotPresent
        args:
        env:
          - name: SPAN_STORAGE_TYPE
            value: elasticsearch
          - name: ES_SERVER_URLS
            value: https://search-elastic-****-***********.us-east-1.es.amazonaws.com:443
          - name: ES_USERNAME
            value: elastic
          - name: ES_PASSWORD
            valueFrom:
              secretKeyRef:
                name: jaeger-elasticsearch
                key: password
          - name: ES_INDEX_PREFIX
            value: jaeger
        ports:
        - containerPort: 14250
          name: grpc
          protocol: TCP
        - containerPort: 14268
          name: http
          protocol: TCP
        - containerPort: 14269
          name: admin
          protocol: TCP
        readinessProbe:
          httpGet:
            path: /
            port: admin
          initialDelaySeconds: 20
        livenessProbe:
          httpGet:
            path: /
            port: admin
          initialDelaySeconds: 20
        resources:
          {}
        volumeMounts:
      dnsPolicy: ClusterFirst
      restartPolicy: Always
      volumes:

@navin-rai
Author

@klubi I am not sure. Is it possible to use an AWS secret key and access key instead?

@klubi
Contributor

klubi commented Feb 1, 2023

No, that's a completely different mechanism.
My PR has not been merged yet, so you can't use it.
What you can do to test your case is remove the lines below from the generated manifest:

- name: ES_USERNAME
  value: elastic
- name: ES_PASSWORD
  valueFrom:
    secretKeyRef:
      name: jaeger-elasticsearch
      key: password

Also, you'd have to add the following to your collector values:

cmdlineParams:
  es.tls.enabled: true
  es.tls.skip-host-verify: true
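For a quick test without editing the YAML by hand, the two changes suggested above could be applied to a JSON export of the Deployment. This is a hedged sketch, not part of the chart: `patch_deployment` is a hypothetical helper, and it assumes the chart turns each `cmdlineParams` entry into a `--key=value` container argument, which should be checked against the rendered manifest. It only makes sense when fine-grained access control is disabled on the domain.

```python
import copy
import json
import sys

# Env entries to drop and flags to add, taken from the suggestion above.
CREDENTIAL_ENV = {"ES_USERNAME", "ES_PASSWORD"}
TLS_ARGS = ["--es.tls.enabled=true", "--es.tls.skip-host-verify=true"]


def patch_deployment(manifest):
    """Return a copy of a Deployment dict without ES credentials and with TLS flags."""
    patched = copy.deepcopy(manifest)
    for container in patched["spec"]["template"]["spec"]["containers"]:
        env = container.get("env") or []
        container["env"] = [e for e in env if e.get("name") not in CREDENTIAL_ENV]
        # `args:` is empty (null) in the rendered manifest, so fall back to a list.
        args = container.get("args") or []
        container["args"] = args + [a for a in TLS_ARGS if a not in args]
    return patched


if __name__ == "__main__":
    # Reads a Deployment as JSON on stdin and writes the patched version to stdout.
    json.dump(patch_deployment(json.load(sys.stdin)), sys.stdout, indent=2)
```

One possible way to use it is to pipe the output of `kubectl get deployment jaeger-collector -o json` through the script and review the result before applying it. The next `helm upgrade` would overwrite such a manual patch, so it is only useful for confirming the diagnosis.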
