Splunk Operator: splunk_indexer : Remove existing HEC token results in failed indexer pod startup #1318

Open
gjanders opened this issue Apr 10, 2024 · 1 comment
Contributor

gjanders commented Apr 10, 2024

Please select the type of request

Bug

Tell us more

Describe the request

  • I'm upgrading from Splunk Operator 2.2.0 to 2.5.2 and also attempting to use Splunk 9.1.4.

Expected behavior

  • The indexer pods should start without an error.

Splunk setup on K8S

  • Multisite cluster with cluster manager

Reproduction/Testing steps

  • When the pods are upgraded, they throw an error. Rolling back the Splunk image to docker.io/splunk/splunk:9.0.3-a2 stops the issue from occurring, so I'm unsure whether this is an issue in the Docker image, the operator, or a combination of the two.

Additional context(optional)
In the operator I used:

image:
  repository: docker.io/splunk/splunk:9.1.4

splunkOperator:
  enabled: true
  clusterWideAccess: true

  # Specify volumes for Splunk Operator pod, append additional volumes to list
  # reference: https://kubernetes.io/docs/concepts/storage/volumes/
  volumes:
  - name: app-staging
    persistentVolumeClaim:
      claimName: splunk-operator-app-download

  # Specify volume mounts for the manager container, append additional volume mounts to list
  # reference: https://kubernetes.io/docs/tasks/configure-pod-container/configure-volume-storage/
  volumeMounts:
  - mountPath: /opt/splunk/appframework/
    name: app-staging

The logs show:

TASK [splunk_indexer : Remove existing HEC token] ******************************
fatal: [localhost]: FAILED! => {
    "changed": false,
    "elapsed": 0,
    "redirected": false,
    "status": -1,
    "url": "https://127.0.0.1:8089/services/data/inputs/http/splunk_hec_token",
    "warnings": [
        "Module did not set no_log for password"
    ]
}

MSG:

Status code was -1 and not [200, 404]: Request failed: <urlopen error [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: self signed certificate in certificate chain (_ssl.c:1091)>

PLAY RECAP *********************************************************************
localhost                  : ok=82   changed=8    unreachable=0    failed=1    skipped=61   rescued=0    ignored=0

I've tested adding SSL certificates to the deployment, without success so far.
The cluster manager pod doesn't seem to have this issue; only the indexer pods are affected.

Under defaults, I tested:

    config:
      env:
        verify: false

I also tried setting the SSL config via:

defaults:
  splunk:
    ssl:
      ca: /mnt/peers-splunk-ca/tls.crt
      cert: /mnt/peers-splunk-cert/tls.crt

This was also without any success.
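
For anyone trying to narrow this down, the following is a minimal diagnostic sketch (not part of splunk-ansible or the operator) that probes an HTTPS endpoint with and without certificate verification, using only the Python standard library. The helper names `build_context` and `probe` are hypothetical. It assumes Python is available inside the indexer pod and that the CA path from the config above is mounted there.

```python
import ssl
import urllib.error
import urllib.request
from typing import Optional


def build_context(verify: bool = True, ca_file: Optional[str] = None) -> ssl.SSLContext:
    """Build an SSL context that verifies the server chain, or skips verification."""
    if not verify:
        ctx = ssl.create_default_context()
        # check_hostname must be disabled before verify_mode can be CERT_NONE
        ctx.check_hostname = False
        ctx.verify_mode = ssl.CERT_NONE
        return ctx
    # With ca_file=None the system trust store is used; otherwise only ca_file is trusted
    return ssl.create_default_context(cafile=ca_file)


def probe(url: str, ctx: ssl.SSLContext, timeout: float = 5.0) -> str:
    """Return a short description of what happened when requesting url."""
    try:
        with urllib.request.urlopen(url, context=ctx, timeout=timeout) as resp:
            return f"HTTP {resp.status}"
    except urllib.error.HTTPError as exc:
        # An HTTP error status still means the TLS handshake completed
        return f"HTTP {exc.code}"
    except OSError as exc:
        # URLError (including wrapped certificate failures) is a subclass of OSError
        return f"failed: {exc}"
```

Calling `probe("https://127.0.0.1:8089/services/data/inputs/http/splunk_hec_token", build_context(ca_file="/mnt/peers-splunk-ca/tls.crt"))` and then the same URL with `build_context(verify=False)` should show whether the mounted CA validates the certificate served on the management port. Any HTTP status, even an authentication error, means the handshake succeeded; a `failed:` result mentioning `CERTIFICATE_VERIFY_FAILED` with verification enabled but not with it disabled would point at the trust chain rather than connectivity.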

@satellite-no
Contributor

We believe this is due to the verify flag in the underlying splunk-ansible configuration steps. The open PR https://github.com/splunk/splunk-ansible/pull/818/files is intended to address this issue.
