A curated list of Site Reliability and Production Engineering resources.
Updated Dec 3, 2023
A simple, zero-dependency, pure JS/HTML status page based on GitHub Pages and Actions.
C++ implementation of Raft core logic as a replication library
A curated list of Site Reliability and Production Engineering Tools
CORTX HA (High Availability) is responsible for keeping the CORTX solution available when any hardware component or software service fails. It handles the failover/failback control flow for affected services and stabilizes them across the CORTX cluster.
A tool to check the availability or syntax of a domain, IP, or URL.
Notes on Site Reliability Engineering. Leave a 🌟 if you found this useful!
☕️ Grab a slick name for your new project
Monitor your website's availability, HTTP status code (current and history), certificate, redirects, and more with Grafana and the Prometheus blackbox exporter.
Hermes: a fault-tolerant replication protocol, implemented over RDMA, guaranteeing linearizability and achieving low latency and high throughput.
Automatic repair for unhealthy Kubernetes nodes
Case studies on backend architecture, performance, security, high availability, high scalability, data sharding, and more
Kubernetes Operator to manage node maintenance through NodeMaintenance custom resources
Calculate how much downtime should be permitted in your Service Level Agreement or Objective
Availability management backend and API for Sharetribe marketplaces
Website Availability Monitor: add your website to our dashboard and get 24x7 monitoring of its availability (and a badge!)
[ARCHIVED] Please report to https://github.com/funilrys/PyFunceble.