Curated Self Study Guide for Computer Science, DevOps, SRE & SysAdmin
Updated Jun 2, 2024 - HTML
Site reliability engineering (SRE) is a set of principles and practices that incorporates aspects of software engineering and applies them to infrastructure and operations problems. Its main goal is to create scalable and highly reliable software systems. Site reliability engineering is closely related to DevOps, a set of practices that combines software development and IT operations, and SRE has also been described as a specific implementation of DevOps.
Terraform provider for Nobl9
Terraform Pull Request Automation
Create, share, and run runbooks from your terminal.
The Open Source DevOps Assistant - solve problems twice as fast with an AI teammate
An active monitoring software to detect failures before your customers do.
Kaytu's AI platform boosts cloud efficiency by analyzing historical usage and delivering intelligent recommendations, such as optimizing instance sizes, that maintain reliability. Pay for what you need, without compromising your apps.
A Prometheus exporter for pg-promise
A Prometheus exporter exposing metrics for KafkaJS
A Prometheus exporter for node-postgres
DevOps Tutorials
Website, courses documentation, blog and YouTube video tracker.
Enable Self-Service Operations: Give specific users access to your existing tools, services, and scripts
An MVP of a platform for delivering reliable applications on Google Cloud
Cloud-ops automation runbooks that are ready to use. Build your own automations using the hundreds of drag-and-drop actions included in the repository. Built on Jupyter Notebooks, our automation platform jumpstarts your SRE RunBook creation. 😎 Published by the unSkript community.
Terraform Quickstart Templates