Here are some sample SRE dashboards in Grafana.
http://demo.robustperception.io:9090/consoles/index.html
https://www.atlassian.com/br/incident-management/devops/sre
https://landing.google.com/sre/workbook/chapters/slo-engineering-case-studies/#the-valet-dashboard
What does the ideal SRE dashboard look like? Make sure it has these KPIs:
SLO violation duration graph, response time (99th percentile) and load for your critical API calls
Error rate
Database response time
End-user response time (99th percentile)
Requests per minute
Availability
Session duration
https://www.appdynamics.com/blog/product/software-reliability-metrics/
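To make a few of these KPIs concrete, here is a minimal sketch of how the 99th-percentile response time, error rate, availability, and requests per minute could be derived from raw request records. The Request record, the summarize helper, and the choice to count only HTTP 5xx responses as errors are assumptions made for illustration; in practice your monitoring stack would normally compute these for you.

```python
import math
from dataclasses import dataclass


@dataclass
class Request:
    """Hypothetical request record; field names are illustrative only."""
    timestamp: float    # seconds since the start of the window
    duration_ms: float  # response time in milliseconds
    status: int         # HTTP status code


def percentile(values, pct):
    """Nearest-rank percentile: the smallest sample with at least
    pct percent of all samples at or below it."""
    if not values:
        raise ValueError("no samples")
    ordered = sorted(values)
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]


def summarize(requests, window_seconds):
    """Compute a few dashboard KPIs over one observation window.

    Assumption: a request counts as an error when its status is 5xx,
    and availability is the fraction of requests that are not errors.
    """
    if not requests:
        raise ValueError("no requests in window")
    total = len(requests)
    errors = sum(1 for r in requests if r.status >= 500)
    return {
        "p99_ms": percentile([r.duration_ms for r in requests], 99),
        "error_rate": errors / total,
        "availability": 1 - errors / total,
        "requests_per_minute": total / (window_seconds / 60),
    }


if __name__ == "__main__":
    # 100 synthetic requests over a 2-minute window, two of them failing.
    sample = [
        Request(timestamp=i, duration_ms=float(i + 1),
                status=500 if i < 2 else 200)
        for i in range(100)
    ]
    for name, value in summarize(sample, window_seconds=120).items():
        print(f"{name}: {value}")
```

The nearest-rank method is only one of several percentile definitions; tools that interpolate between samples or estimate from histogram buckets may report slightly different values for the same data.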
Thank you, and until next time! 😉👍
Published on Jun 09, 2020 by Vinicius Moll