ShinyProxy Monitoring

Summary

This repository provides all resources required for setting up comprehensive monitoring of ShinyProxy on Kubernetes. The setup uses Loki (together with promtail) for collecting logs of ShinyProxy, the ShinyProxy Operator and any app running in ShinyProxy. Prometheus is used for gathering metrics of ShinyProxy and the apps (i.e. the resources used by the apps). The setup also includes Grafana, together with six dashboards for visualizing all logs and metrics.

The retention of both Loki and Prometheus is set to 90 days.

Overview of dashboards

ShinyProxy Usage

Datasource: Prometheus
Goal: provide insight into the current usage and performance of ShinyProxy.
Provided statistics:
- App Startup Time
- Fine-grained timings
  - Container schedule time (available on Kubernetes): the time needed for the container to be assigned to a node. Includes the time to launch a new node if the cluster needs to scale up.
  - Image pull time (available on Docker and Kubernetes)
  - Container initialization time (available on Docker, Docker Swarm and Kubernetes): any time needed to start the app that is not covered by the other metrics
  - Application startup time (available on Docker, Docker Swarm and Kubernetes): the time between the start of the application process and the moment the app is reachable by ShinyProxy. For example, for a Shiny app, this is the time the R process takes to start up and become reachable on its web server.
- App Usage Time
- Number of running apps (per app name)
- Number of logged-in users
- Number of active users
- Number of auth failures
- Number of app start failures
- Number of app crashes

Note: the last three panels of this dashboard can sometimes show a value that is too high, e.g. the app crashes panel could report two crashes while in reality only a single app crashed. This is caused by a limitation in Prometheus.

ShinyProxy Aggregated Usage

Datasource: Prometheus
Goal: provide insight into the long-term usage and performance of ShinyProxy.
Provided statistics:
- Number of times an app has been started (per app)
- The total time an app has been used (per app)
- The average time an app is used (per app)

ShinyProxy Logs

Datasource: Loki
Goal: show the logs of the ShinyProxy server
Provided statistics:
- Number of warnings
- Number of errors

Note: This requires ShinyProxy to log using the JSON format.

ShinyProxy Operator Logs

Datasource: Loki
Goal: show the logs of the ShinyProxy Operator
Provided statistics:
- Number of warnings
- Number of errors

Note: Promtail is configured to recognize when Java outputs a stack trace and to collect it as a single log message. This could be improved and optimized by adding an option to the ShinyProxy Operator to log in JSON format.

ShinyProxy App Logs

Datasource: Loki
Goal: show the logs of any app started by ShinyProxy.
Provided statistics:
- The time when the app was created. Combined with the filters, this should make it easier to find the correct container (id).

Note: this dashboard also works when apps are run in different namespaces than the namespace of the ShinyProxy server. As an example, the Dash application in ShinyProxy 1 runs in a different namespace.

Note: this dashboard also shows parts of the ShinyProxy log that are relevant for this app.

ShinyProxy App Resources

Datasource: Prometheus
Goal: show the resources (CPU, Memory, Network) used by any app started by ShinyProxy.
Provided statistics:
- Current CPU usage
- CPU Limit and Requests
- CPU Throttling
- Current memory usage
- Memory Limit and Requests
- Total network traffic (both transmitted and received)

Note: this dashboard also works when apps are run in different namespaces than the namespace of the ShinyProxy server. As an example, the Dash application in ShinyProxy 1 runs in a different namespace.

How it works

Loki + promtail

Loki and Promtail together collect the logs for all log dashboards. The upstream Loki Helm chart is used and needs no tweaks to work with ShinyProxy (although parallel processing was disabled). In contrast, the Promtail configuration must be changed to work well with ShinyProxy; see the overlays/promtail/configs/promtail.yaml file. The most important changes are:

- In the kubernetes-pods scrape_config, an extra section was added so that this job does not process any logs of ShinyProxy or the ShinyProxy Operator. These logs are processed by the shinyproxy-and-operator-pods job instead.
```
- action: drop
  regex: '(shinyproxy)|(shinyproxy-operator)'
  source_labels:
    - app
```
- The kubernetes-pods scrape_config was changed to process and store some ShinyProxy-specific annotations. These annotations are added automatically to any app started by ShinyProxy. They must be stored in Loki so that the logs of an app can be retrieved by specifying the name of the app.
- The shinyproxy-and-operator-pods scrape_config was created to process the logs of the ShinyProxy server and the ShinyProxy Operator. The pipeline of this scrape_config ensures that multiline stack traces are correctly processed as a single log line. Finally, the job also stores some labels relevant for ShinyProxy servers.
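
As a rough illustration of these last two changes, a Promtail scrape_config can copy a pod annotation into a Loki label with a relabel rule, and can merge stack traces with a multiline stage. The fragment below is a sketch, not the contents of promtail.yaml: the annotation name openanalytics.eu/sp-spec-id, the target label sp_spec_id and the firstline regex are assumptions chosen for the example.

```yaml
scrape_configs:
  - job_name: kubernetes-pods
    kubernetes_sd_configs:
      - role: pod
    relabel_configs:
      # Copy a pod annotation into a Loki label so that app logs can be
      # selected by app name. Annotation and label names are assumed.
      - action: replace
        source_labels:
          - __meta_kubernetes_pod_annotation_openanalytics_eu_sp_spec_id
        target_label: sp_spec_id
  - job_name: shinyproxy-and-operator-pods
    kubernetes_sd_configs:
      - role: pod
    pipeline_stages:
      # Parse the container runtime log format first.
      - cri: {}
      # Lines that do not start with a date are appended to the previous
      # entry, so a Java stack trace stays a single log message.
      # The regex is an assumed timestamp prefix.
      - multiline:
          firstline: '^\d{4}-\d{2}-\d{2}'
          max_wait_time: 3s
```

The real configuration in overlays/promtail/configs/promtail.yaml may use different label names and a different first-line pattern.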

Prometheus

The Prometheus setup is based on the kube-prometheus stack.

Grafana

The following changes are made to the configuration of Grafana:

- overlays/monitoring/configs/dashboards contains the Grafana dashboards.
- overlays/monitoring/configs/datasources.yaml configures Prometheus and Loki as datasources for Grafana.
- overlays/monitoring/configs/grafana-dashboardSources.json creates a folder for the ShinyProxy dashboards.
- overlays/monitoring/patches/grafana.deployment.yaml adapts the Grafana deployment to use the ShinyProxy dashboards. It also uses the latest version of the 8 series of Grafana and adds a PVC for storage.
- overlays/monitoring/resources/grafana-storage.pvc.yaml creates a PVC for Grafana.
- All dashboards are displayed in the timezone of the browser, show the data for the last 30 minutes and refresh every 10 seconds. To enforce this consistency, a script is provided to normalize the dashboards.
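
For orientation, a Grafana datasource provisioning file that registers Prometheus and Loki generally looks like the sketch below. It is not a copy of overlays/monitoring/configs/datasources.yaml: the datasource names and the in-cluster service URLs are assumptions and depend on the namespaces and service names used in your cluster.

```yaml
apiVersion: 1
datasources:
  # Metrics datasource used by the usage and resource dashboards.
  - name: Prometheus
    type: prometheus
    access: proxy
    # Assumed service URL; adjust to your Prometheus service.
    url: http://prometheus-k8s.monitoring.svc:9090
  # Logs datasource used by the log dashboards.
  - name: Loki
    type: loki
    access: proxy
    # Assumed service URL; adjust to your Loki service.
    url: http://loki.loki.svc:3100
```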

Kube state metrics

Kube state metrics is an "Add-on agent to generate and expose cluster-level metrics." Information exported by this add-on is used in the ShinyProxy App Resources dashboard. The configuration (overlays/monitoring/patches/kubeStateMetrics-deployment.yaml) is changed to index additional annotations as labels into the kube_pod_annotations metric. This is then used by the dashboard to populate the variables (or filters) so that users can easily find their pod based on the name of the app etc.
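
A minimal sketch of such a patch is shown below, assuming the --metric-annotations-allowlist flag of kube-state-metrics is used for this. The annotation names are assumptions for illustration; the actual list is in overlays/monitoring/patches/kubeStateMetrics-deployment.yaml.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: kube-state-metrics
  namespace: monitoring
spec:
  template:
    spec:
      containers:
        - name: kube-state-metrics
          args:
            # Expose the listed pod annotations as labels on the
            # kube_pod_annotations metric. Annotation names are assumed.
            # Any other arguments the deployment needs must be listed too.
            - --metric-annotations-allowlist=pods=[openanalytics.eu/sp-spec-id,openanalytics.eu/sp-user-id]
```

With a setting like this, each allowed annotation shows up as a label on kube_pod_annotations, which a dashboard variable can then query.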

Prometheus

The changes to the Prometheus config are:

- overlays/monitoring/patches/prometheus-k8s.clusterrole.yaml gives Prometheus additional permissions to view ServiceMonitor, Pod and Service resources at cluster level.
- overlays/monitoring/resources/shinyproxy.servicemonitor.yaml sets up a ServiceMonitor so that Prometheus collects the metrics of ShinyProxy. Note: this only collects the metrics of ShinyProxy servers running in the shinyproxy-operator namespace.
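
A ServiceMonitor of this kind could look roughly like the sketch below. The selector label, the port name and the metrics path are assumptions and must match the Service that exposes the ShinyProxy metrics endpoint; only the restriction to the shinyproxy-operator namespace comes from the description above.

```yaml
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: shinyproxy
  namespace: monitoring
spec:
  # Only look for Services in the shinyproxy-operator namespace.
  namespaceSelector:
    matchNames:
      - shinyproxy-operator
  # Assumed label on the ShinyProxy Service.
  selector:
    matchLabels:
      app: shinyproxy
  endpoints:
    # Assumed port name and path of the metrics endpoint.
    - port: actuator
      path: /actuator/prometheus
      interval: 30s
```

To monitor ShinyProxy servers in other namespaces, the namespaceSelector would need to list those namespaces as well.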

Getting started

This section demonstrates how to set up this stack in minikube.

Start minikube

```
minikube start --kubernetes-version='v1.22.11' --addons=metrics-server
```

Set up ingress (the ingress set up by minikube is too old and causes issues with its validating webhook):

```
helm upgrade --install ingress-nginx ingress-nginx --repo https://kubernetes.github.io/ingress-nginx --namespace ingress-nginx --create-namespace --set controller.hostPort.enabled=true
```

Configure web access to the cluster. First get the IP of minikube using:
```
minikube ip
```
Next, add the following entries to /etc/hosts, replacing MINIKUBE_IP with the output of the previous command:
```
MINIKUBE_IP       grafana.shinyproxy-demo.local
MINIKUBE_IP       operator-demo.local
```

Set up Loki

```
cd overlays/loki
kustomize build | kubectl apply --server-side -f -
cd ../..
```

Note: re-run the command if it fails because it cannot find some CRDs.

Set up Promtail

```
cd overlays/promtail
kustomize build | kubectl apply --server-side -f -
cd ../..
```

Set up Prometheus and Grafana
```
cd overlays/monitoring
kustomize build | kubectl apply --server-side -f - 
cd ../..
```
Note: re-run the command if it fails because it cannot find some CRDs.

Set up the demo ShinyProxy Operator deployment:

```
cd overlays/shinyproxy
kustomize build | kubectl apply --server-side -f -
cd ../shinyproxy1-app
kustomize build | kubectl apply --server-side -f -
```

Note: re-run the commands if they fail because some CRDs cannot be found.

You can now log in to ShinyProxy at http://operator-demo.local/shinyproxy1 and http://operator-demo.local/shinyproxy2 as the users jack and jeff (the password for both is password). You can log in to Grafana at http://grafana.shinyproxy-demo.local, using admin as both the username and the password.

Upgrade to 3.1.0

This repository uses the same version numbers as ShinyProxy. Always use the same version of ShinyProxy and this repository.

In release 3.1.0 of this repository, all components were upgraded. To keep your existing logs and metrics, take the following step when upgrading:

- Edit line 50 of overlays/loki/configs/config.yaml: change the date to the day after you upgrade Loki. E.g. if you upgrade on 2024-03-25 (25 March 2024), change the date to 2024-03-26. If you do not modify this line, you will no longer be able to access logs from before the upgrade. See the Loki docs for more information.
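
For context, Loki selects a storage schema per time period through the from dates in its schema_config, which is why the date of the new period must lie after the upgrade. The fragment below is an illustrative sketch only: the store, schema and earlier from values are assumptions and are not taken from overlays/loki/configs/config.yaml.

```yaml
schema_config:
  configs:
    # Existing period: logs written before the upgrade stay readable
    # with the schema they were written in (values assumed).
    - from: 2020-10-24
      store: boltdb-shipper
      object_store: filesystem
      schema: v11
      index:
        prefix: index_
        period: 24h
    # New period: starts the day after the upgrade (here an upgrade
    # on 2024-03-25), so no already-written data falls into it.
    - from: 2024-03-26
      store: tsdb
      object_store: filesystem
      schema: v13
      index:
        prefix: index_
        period: 24h
```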
