A curated list of Site Reliability and Production Engineering resources.
Updated Dec 3, 2023
Site reliability engineering (SRE) is a set of principles and practices that incorporates aspects of software engineering and applies them to infrastructure and operations problems. The main goals are to create scalable and highly reliable software systems. Site reliability engineering is closely related to DevOps, a set of practices that combine software development and IT operations, and SRE has also been described as a specific implementation of DevOps.
An easy-to-use and powerful chaos engineering experiment toolkit, open-sourced by Alibaba.
A Chaos Engineering Platform for Kubernetes.
A curated collection of publicly available resources on how technology and tech-savvy organizations around the world practice Site Reliability Engineering (SRE)
Litmus helps SREs and developers practice chaos engineering in a cloud-native way. Chaos experiments are published at the ChaosHub (https://hub.litmuschaos.io). Community notes are at https://hackmd.io/a4Zu_sH4TZGeih-xCimi3Q
A curated list of Chaos Engineering resources.
Web UI for Jaeger
A collection of postmortem templates
Chaos testing, network emulation, and stress testing tool for containers
This repository includes resources that are more than sufficient to prepare for a Google interview if you are applying for a software engineer or site reliability engineer position.
A curated list of Site Reliability and Production Engineering Tools
Curated list of good SRE interview questions.
Google Site Reliability Engineering book converted to audio
A chaos engineering platform for supporting the complete fault drill lifecycle.
A role-playing game for incident management training
Devopness - Painless essential DevOps for everyone
This repository helps performance testers and engineers who want to dive into the DevOps and SRE world.
OpenShift Guide. Learn about the Red Hat OpenShift Container Platform, Data Science, CodeReady Containers, Podman, Buildah, and Kubernetes.
What to Read to Learn More About DevOps