Investigate installing over OKD/CRC #342
Running CRC is pretty simple. Download the binary from the page above (it's pretty hefty, though). You'll probably want to set some configuration before starting a cluster, as it allocates few resources initially. I used the following (but I have 16 cores and 32GiB on my machine, so YMMV); this allocates 12 CPUs, 16GiB of memory, and 256GiB of disk space:

crc config set consent-telemetry no
crc config set cpus 12
crc config set disk-size 256
crc config set enable-cluster-monitoring true
crc config set kubeadmin-password kubeadmin
crc config set memory 16384
crc config set nameserver 1.1.1.1

Enabling the cluster monitoring stack requires at least 14GiB of allocated memory, but it provides Prometheus out of the box. Then simply run:

crc setup --check-only
crc setup
crc start

In the meantime, you can check the status to see when the cluster is ready:

watch -n1 -- crc status

Starting takes quite a while the first time (~15 minutes for me), but it should be faster afterwards.

Next you'll want to log in and get your kube context, using the OpenShift client:

oc login -u developer -p developer https://api.crc.testing:6443

This will create the kube context, which you can then use as usual. So you'd do:

oc new-project test

which creates a new project (OpenShift's equivalent of a namespace) to deploy into.

However, note that CRC doesn't come with any default PVC provisioner, so before we can deploy the Helm chart we'll need to create one. You can follow these instructions; I went with the local-path provisioner, which is where the local-path storage class used below comes from.
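For reference, here is roughly how that can be set up; the manifest URL is the upstream Rancher local-path-provisioner deployment, while the SCC grant is an assumption about what OpenShift may require for the provisioner's hostPath access, so treat this as a sketch rather than the exact steps used here:

# install the upstream local-path provisioner
oc apply -f https://raw.githubusercontent.com/rancher/local-path-provisioner/master/deploy/local-path-storage.yaml

# OpenShift may block the provisioner's hostPath usage; granting an SCC like this is one workaround (assumption, adjust to your security policy)
oc adm policy add-scc-to-user hostaccess -z local-path-provisioner-service-account -n local-path-storage

# mark the new class as default so PVCs without an explicit storageClassName bind to it
oc patch storageclass local-path -p '{"metadata": {"annotations": {"storageclass.kubernetes.io/is-default-class": "true"}}}'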
Zeebe
Elastic

As expected, the Elastic Helm chart already has support for OpenShift, so it just needs to be configured properly, as shown here. Primarily you need to set the right security context and disable the sysctl init container:

elasticsearch:
enabled: false
imageTag: 7.16.2
replicas: 1
minimumMasterNodes: 1
volumeClaimTemplate:
accessModes: [ "ReadWriteOnce" ]
storageClassName: "local-path"
resources:
requests:
storage: 50Gi
esJavaOpts: "-Xmx1g -Xms1g"
resources:
requests:
cpu: 1
memory: 1Gi
limits:
cpu: 1
memory: 2Gi
securityContext:
runAsUser: null
sysctlInitContainer:
enabled: false
podSecurityContext:
fsGroup: null
runAsUser: null

Curator

Curator does not work out of the box, for a similar reason: its config must be mounted with mode 0444 at the very least.

Tips

I found it very useful when debugging networking and other things to have a throwaway debug pod around:

apiVersion: v1
kind: Pod
metadata:
name: debug
spec:
containers:
- name: busybox
image: busybox:1.28
args:
- sleep
- "1000000" Which will start a long running ConclusionSo |
FYI, we added an option to configure the defaultMode of the configMap; see for Zeebe https://github.com/camunda/camunda-platform-helm/blob/main/charts/camunda-platform/values.yaml#L142 and for the Gateway https://github.com/camunda/camunda-platform-helm/blob/main/charts/camunda-platform/values.yaml#L300
Essentially, we would need to be able to configure the mode for all volume mounts (other than PVC claims, as those seem to be mounted properly already).
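As a quick sanity check, you can inspect the effective permissions of a mounted config from inside a running container; the pod name below is illustrative and the path is the default config directory of the Zeebe image, so adjust both to your deployment:

# list the mounted Zeebe config files and their modes
oc exec camunda-zeebe-0 -- ls -l /usr/local/zeebe/config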
Operate

🎉 Operate works out of the box with Elastic/Zeebe, without any special configuration. Though I wonder why Zeebe connects to...
Tasklist

🎉 Tasklist works out of the box with Elastic/Zeebe, without any special configuration.
Identity

Identity works out of the box, but both Keycloak and PostgreSQL must be configured properly:

identity:
enabled: true
keycloak:
containerSecurityContext:
runAsUser: null
podSecurityContext:
fsGroup: null
runAsUser: null
postgresql:
primary:
containerSecurityContext:
runAsUser: null
podSecurityContext:
fsGroup: null
runAsUser: null
readReplicas:
containerSecurityContext:
runAsUser: null
podSecurityContext:
fsGroup: null
runAsUser: null
metrics:
containerSecurityContext:
runAsUser: null
podSecurityContext:
fsGroup: null
runAsUser: null

By the way, setting values to null does not work with Helm 3.2.0+. This is a longstanding bug; you either need to use Helm 3.1.3 (which is what I'm using for now) or use a post-renderer. See helm/helm#9136 and helm/helm#5184.
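For illustration, here's a minimal sketch of what such a post-renderer could look like. Helm pipes all rendered manifests to the script on stdin and uses whatever it prints to stdout; the crude sed filter below, which simply drops runAsUser/fsGroup lines so OpenShift can assign its own IDs, is an assumption about what you would want to strip, not something shipped with the chart:

#!/bin/sh
# strip-uids.sh - hypothetical Helm post-renderer (make it executable with chmod +x)
# reads the rendered manifests on stdin, removes hard-coded runAsUser/fsGroup, writes the result to stdout
sed -e '/runAsUser:/d' -e '/fsGroup:/d'

You would then pass it to Helm with something like --post-renderer ./strip-uids.sh on install or upgrade.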
So, fun fact: I run out of memory if I try to run everything at once. Since OpenShift itself already takes up 9GiB, it doesn't leave much room for the rest of the stack, and if I cut memory down to the bare minimum I start getting OOM kills 😄 Can't wait to get a real cluster in the ☁️
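If you want to see where the memory is going, the usual top commands work here as well; this assumes the cluster monitoring stack (and therefore the metrics API) is enabled, as configured earlier:

# per-node CPU and memory usage
oc adm top nodes

# per-pod usage across all namespaces
oc adm top pods --all-namespaces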
Optimize

🎉 Runs fine out of the box, nothing special to configure.

Complete deployment

I couldn't test a complete deployment locally because the resource requirements are just too high; if I reduce them to a minimum, things start crashing 😄 Anyway, I deployed everything in the cloud with the following manifest (note: this is not the final one I would like to provide, but I'm adding it here for posterity):

global:
image:
tag: 8.0.4
pullPolicy: Always
identity:
auth:
enabled: true
zeebe:
# Image configuration to configure the zeebe image specifics
image:
# Image.repository defines which image repository to use
repository: camunda/zeebe
# ClusterSize defines the amount of brokers (=replicas), which are deployed via helm
clusterSize: "1"
# PartitionCount defines how many zeebe partitions are set up in the cluster
partitionCount: "1"
# ReplicationFactor defines how each partition is replicated, the value defines the number of nodes
replicationFactor: "1"
# CpuThreadCount defines how many threads can be used for the processing on each broker pod
cpuThreadCount: 4
# IoThreadCount defines how many threads can be used for the exporting on each broker pod
ioThreadCount: 4
# do not run as root
containerSecurityContext:
runAsUser: null
configMap:
defaultMode: 0555
# JavaOpts can be used to set java options for the zeebe brokers
javaOpts: >-
-XX:MaxRAMPercentage=25.0
-XX:+ExitOnOutOfMemoryError
-XX:+HeapDumpOnOutOfMemoryError
-XX:HeapDumpPath=/usr/local/zeebe/data
-XX:ErrorFile=/usr/local/zeebe/data/zeebe_error%p.log
-Xlog:gc*:file=/usr/local/zeebe/data/gc.log:time:filecount=7,filesize=8M
# Environment variables
env:
# Enable JSON logging for google cloud stackdriver
- name: ZEEBE_LOG_APPENDER
value: Stackdriver
- name: ZEEBE_LOG_STACKDRIVER_SERVICENAME
value: zeebe
- name: ZEEBE_LOG_STACKDRIVER_SERVICEVERSION
valueFrom:
fieldRef:
fieldPath: metadata.namespace
- name: ZEEBE_BROKER_EXECUTION_METRICS_EXPORTER_ENABLED
value: "true"
- name: ATOMIX_LOG_LEVEL
value: INFO
- name: ZEEBE_LOG_LEVEL
value: DEBUG
- name: ZEEBE_BROKER_DATA_DISKUSAGECOMMANDWATERMARK
value: "0.8"
- name: ZEEBE_BROKER_DATA_DISKUSAGEREPLICATIONWATERMARK
value: "0.9"
- name: ZEEBE_BROKER_EXPERIMENTAL_CONSISTENCYCHECKS_ENABLEPRECONDITIONS
value: "true"
- name: ZEEBE_BROKER_EXPERIMENTAL_CONSISTENCYCHECKS_ENABLEFOREIGNKEYCHECKS
value: "true"
# Resources configuration to set request and limit configuration for the container https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/#requests-and-limits
resources:
limits:
cpu: 2
memory: 2Gi
requests:
cpu: 1
memory: 1Gi
# PvcAccessModes can be used to configure the persistent volume claim access mode https://kubernetes.io/docs/concepts/storage/persistent-volumes/#access-modes
pvcAccessMode: ["ReadWriteOnce"]
# PvcSize defines the persistent volume claim size, which is used by each broker pod https://kubernetes.io/docs/concepts/storage/persistent-volumes/#persistentvolumeclaims
pvcSize: 12Gi
# PvcStorageClassName can be used to set the storage class name which should be used by the persistent volume claim. It is recommended to use a storage class, which is backed with a SSD.
pvcStorageClassName: gp2
zeebe-gateway:
# Replicas defines how many standalone gateways are deployed
replicas: 1
# Image configuration to configure the zeebe-gateway image specifics
image:
# Image.repository defines which image repository to use
repository: camunda/zeebe
# LogLevel defines the log level which is used by the gateway
logLevel: debug
# Env can be used to set extra environment variables in each gateway container
env:
- name: ZEEBE_LOG_APPENDER
value: Stackdriver
- name: ZEEBE_LOG_STACKDRIVER_SERVICENAME
value: zeebe
- name: ZEEBE_LOG_STACKDRIVER_SERVICEVERSION
valueFrom:
fieldRef:
fieldPath: metadata.namespace
- name: ATOMIX_LOG_LEVEL
value: INFO
- name: ZEEBE_LOG_LEVEL
value: DEBUG
- name: ZEEBE_GATEWAY_MONITORING_ENABLED
value: "true"
- name: ZEEBE_GATEWAY_THREADS_MANAGEMENTTHREADS
value: "1"
# Resources configuration to set request and limit configuration for the container https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/#requests-and-limits
resources:
limits:
cpu: 1
memory: 1Gi
requests:
cpu: 1
memory: 512Mi
# do not run as root
containerSecurityContext:
runAsUser: null
# RetentionPolicy configuration to configure the elasticsearch index retention policies
retentionPolicy:
# RetentionPolicy.enabled if true, elasticsearch curator cronjob and configuration will be deployed.
enabled: true
# RetentionPolicy.schedule defines how often/when the curator should run
schedule: "*/15 * * * *"
# RetentionPolicy.zeebeIndexTTL defines after how many days a zeebe index can be deleted
zeebeIndexTTL: 1
# RetentionPolicy.zeebeIndexMaxSize can be set to configure the maximum allowed zeebe index size in gigabytes.
# After reaching that size, curator will delete that corresponding index on the next run.
# To benefit from that configuration the schedule needs to be configured small enough, like every 15 minutes.
zeebeIndexMaxSize: 10
# RetentionPolicy.operateIndexTTL defines after how many days an operate index can be deleted
operateIndexTTL: 30
# RetentionPolicy.tasklistIndexTTL defines after how many days a tasklist index can be deleted
tasklistIndexTTL: 30
operate:
enabled: true
image:
repository: camunda/operate
tag: 8.0.4
podSecurityContext:
runAsUser: null
resources:
limits:
memory: 2Gi
cpu: 1
requests:
memory: 1Gi
cpu: 1
tasklist:
enabled: true
image:
repository: camunda/tasklist
tag: 8.0.4
podSecurityContext:
runAsUser: null
resources:
limits:
memory: 2Gi
cpu: 1
requests:
cpu: 1
memory: 1Gi
identity:
enabled: true
resources:
limits:
cpu: 1
memory: 2Gi
requests:
cpu: 1
memory: 1Gi
keycloak:
resources:
limits:
cpu: 1
memory: 2Gi
requests:
cpu: 1
memory: 1Gi
containerSecurityContext:
runAsUser: null
podSecurityContext:
fsGroup: null
runAsUser: null
postgresql:
storageClass: "gp2"
primary:
resources:
requests:
cpu: 1
memory: 512Mi
limits:
cpu: 1
memory: 1Gi
persistence:
size: "2Gi"
containerSecurityContext:
runAsUser: null
podSecurityContext:
fsGroup: null
runAsUser: null
readReplicas:
persistence:
size: "2Gi"
containerSecurityContext:
runAsUser: null
podSecurityContext:
fsGroup: null
runAsUser: null
metrics:
enabled: false
# ELASTIC
elasticsearch:
enabled: true
imageTag: 7.16.2
replicas: 1
minimumMasterNodes: 1
volumeClaimTemplate:
accessModes: [ "ReadWriteOnce" ]
storageClassName: "gp2"
resources:
requests:
storage: 32Gi
esJavaOpts: "-Xmx1g -Xms1g"
resources:
requests:
cpu: 1
memory: 2Gi
limits:
cpu: 1
memory: 3Gi
securityContext:
runAsUser: null
sysctlInitContainer:
enabled: false
podSecurityContext:
fsGroup: null
runAsUser: null
optimize:
enabled: true
image:
repository: camunda/optimize
tag: 3.8.0
partitionsCount: 1
resources:
requests:
cpu: 1
memory: 1Gi
limits:
cpu: 2
memory: 2Gi
# PrometheusServiceMonitor configuration for the prometheus service monitor
prometheusServiceMonitor:
# PrometheusServiceMonitor.enabled if true then a service monitor will be deployed, which allows an installed prometheus controller to scrape metrics from the broker pods
enabled: false

Also, you need Helm [3.0.0 - 3.1.3] to deploy this due to the bugs mentioned above. I'll add an example of how to use a post-renderer to work around this bug if you need to be on Helm 3.2.x or greater.
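For completeness, this is roughly how the manifest above would be deployed; the release name and values file name are placeholders, and the repository URL is Camunda's public Helm chart repository:

# add the Camunda Helm repository and install the chart with the values above
helm repo add camunda https://helm.camunda.io
helm repo update
# with Helm 3.2+, append --post-renderer ./strip-uids.sh (see above) to work around the null-override bug
helm install camunda camunda/camunda-platform -f openshift-values.yaml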
With that said, I think we can close this.
The goal of this issue is to collect in one place the issues that arise from installing the chart as-is in an OKD/CRC cluster. OKD is the upstream, community version of OpenShift - think of it as what Fedora is to RHEL. CRC is their slimmed-down, single-node installation tool which lets you run OpenShift on your local development machine - think of it as what kind is to Kubernetes.
You can find the CRC binaries here: https://www.okd.io/crc/ (no need for a RedHat account in this case, no matter what their documentation states).
I'll document the results of installing the chart here, which is the first step to understanding what we need to do.
- zeebe and elastic (see #342 (comment))
- zeebe-gateway chart (see #342 (comment))
- curator (see #342 (comment))
- operate (see #342 (comment))
- tasklist (see #342 (comment))
- identity and keycloak and postgresql (see #342 (comment))
- optimize (see #342 (comment))