Level 9 / Project 06 - Reliability Scorecard¶

Learn Your Way¶

Read	Build	Watch	Test	Review	Visualize	Try
—	This project	—	—	Flashcards	—	—

Focus¶

Weighted multi-criteria scoring across reliability dimensions
Score normalization to a 0-100 scale for different metric types
Letter grading (A/B/C/D/F) for non-technical stakeholder communication
Trend analysis comparing current vs previous scoring periods
Strategy pattern for dimension-specific scoring functions

Why this project exists¶

Reliability is multi-dimensional: uptime, mean time to recovery (MTTR), change failure rate, deployment frequency, and incident response all contribute. A service with 99.99% uptime but 8-hour MTTR is not truly reliable. This project builds a weighted scorecard that evaluates reliability across these dimensions, normalizes different units to a comparable scale, assigns letter grades, and generates improvement recommendations — the same framework SRE teams use to compare service reliability across an organization.

Run (copy/paste)¶

cd <repo-root>/projects/level-9/06-reliability-scorecard
python project.py --demo
pytest -q

Expected terminal output¶

{
  "service": "checkout-api",
  "overall_score": 78.5,
  "grade": "B",
  "dimensions": [...],
  "recommendations": [...]
}
7 passed

Expected artifacts¶

Console JSON output with reliability scorecard
Passing tests
Updated notes.md

Alter it (required)¶

Add a trend field to DimensionScore that compares current vs previous period scores.
Add a normalizer for boolean metrics (e.g. "has runbook" -> 100 or 0).
Add a --compare flag that shows side-by-side scores for two services.

Break it (required)¶

Set all dimension weights to 0 — does the weighted score calculation handle it?
Pass a raw value outside the normalizer's expected range (e.g. negative uptime) — what score results?
Create a scorecard with zero dimensions — does grading still work?

Fix it (required)¶

Validate that total weight is > 0 before computing the weighted average.
Clamp normalized values to the 0-100 range.
Return a default "unscored" grade when there are no dimensions.

Explain it (teach-back)¶

What is a reliability scorecard and how do SRE teams use them?
How does weighted scoring prioritize some dimensions over others?
Why is score normalization needed — what problem does it solve?
How do letter grades (A/B/C/D/F) help communicate reliability status to non-technical stakeholders?

Mastery check¶

You can move on when you can: - explain weighted scoring and why different reliability dimensions have different weights, - add a new dimension (e.g. "deployment frequency") with its own normalizer, - describe how normalization converts different units to a comparable 0-100 scale, - design a reliability scorecard for a real service with appropriate dimensions and weights.

← Prev	Home	Next →