SRE
Site reliability engineering (SRE) is a set of principles and practices that incorporates aspects of software engineering and applies them to infrastructure and operations problems. The main goals are to create scalable and highly reliable software systems. Site reliability engineering is closely related to DevOps, a set of practices that combine software development and IT operations, and SRE has also been described as a specific implementation of DevOps.
Here are 654 public repositories matching this topic...
Terraform provider for Nobl9
Updated Jun 2, 2024 (Go)
Create, share, and run runbooks from your terminal.
Updated Jun 2, 2024 (Go)
The Open Source DevOps Assistant - solve problems twice as fast with an AI teammate
Updated Jun 2, 2024 (Python)
Terraform Pull Request Automation
Updated Jun 2, 2024 (Go)
An active monitoring software to detect failures before your customers do.
Updated Jun 2, 2024 (Go)
Kaytu's AI platform boosts cloud efficiency by analyzing historical usage and delivering intelligent recommendations—such as optimizing instance sizes—that maintain reliability. Pay for what you need, without compromising your apps.
Updated Jun 2, 2024 (Go)
A Prometheus exporter for pg-promise
Updated Jun 1, 2024 (TypeScript)
A Prometheus exporter exposing metrics for KafkaJS
Updated Jun 1, 2024 (TypeScript)
A Prometheus exporter for node-postgres
Updated Jun 1, 2024 (TypeScript)
DevOps Tutorials
Updated Jun 1, 2024 (HCL)
Website, courses documentation, blog and YouTube video tracker.
Updated Jun 2, 2024 (HTML)
Enable Self-Service Operations: Give specific users access to your existing tools, services, and scripts
Updated May 31, 2024 (Groovy)
An MVP of a platform for delivering reliable applications on Google Cloud
Updated May 31, 2024 (HCL)
Curated Self Study Guide for Computer Science, DevOps, SRE & SysAdmin
Updated May 31, 2024 (HTML)
Cloud-ops automation runbooks that are ready to use. Build your own automations using the hundreds of drag and drop actions included in the repository. Built on Jupyter Notebooks, our automation platform jumpstarts your SRE RunBook creation. 😎 published by the unSkript community.
Updated May 31, 2024 (Jupyter Notebook)
Terraform Quickstart Templates
Updated May 31, 2024 (HCL)
StackStorm (aka "IFTTT for Ops") is event-driven automation for auto-remediation, incident responses, troubleshooting, deployments, and more for DevOps and SREs. Includes rules engine, workflow, 160 integration packs with 6000+ actions (see https://exchange.stackstorm.org) and ChatOps. Installer at https://docs.stackstorm.com/install/index.html
Updated Jun 1, 2024 (Python)
114 followers