Solution: Elite Track / Staff Engineer Capstone
STOP -- Have you attempted this project yourself first?
Learning happens in the struggle, not in reading answers. Spend at least 20 minutes trying before reading this solution. Check the README for requirements and the Walkthrough for guided hints.
Complete solution
"""Staff Engineer Capstone.
This project is part of the elite extension track.
It intentionally emphasizes explicit, testable engineering decisions.
"""

# WHY a staff engineer capstone? -- Staff+ engineering is about system-level
# thinking: making decisions that affect multiple teams, balancing technical
# debt against feature velocity, and building platforms that scale organizational
# output. This capstone integrates all prior elite track concepts into a
# single system design exercise.

from __future__ import annotations

import argparse
import json
from datetime import datetime, timezone
from pathlib import Path
from typing import Any


def parse_args() -> argparse.Namespace:
    """Parse CLI inputs for deterministic project execution."""
    parser = argparse.ArgumentParser(description="Staff Engineer Capstone")
    parser.add_argument("--input", required=True, help="Path to input text data")
    parser.add_argument("--output", required=True, help="Path to output JSON summary")
    parser.add_argument("--run-id", default="manual-run", help="Optional run identifier")
    return parser.parse_args()


def load_lines(input_path: Path) -> list[str]:
    """Load normalized input lines and reject empty datasets safely."""
    if not input_path.exists():
        raise FileNotFoundError(f"input file not found: {input_path}")
    # Strip whitespace and drop blank lines so downstream parsing sees clean rows.
    lines = [
        line.strip()
        for line in input_path.read_text(encoding="utf-8").splitlines()
        if line.strip()
    ]
    if not lines:
        raise ValueError("input file contains no usable lines")
    return lines


def classify_line(line: str) -> dict[str, Any]:
    """Transform one CSV-like line into structured fields with validation."""
    parts = [piece.strip() for piece in line.split(",")]
    if len(parts) != 3:
        raise ValueError(f"invalid line format (expected 3 comma fields): {line}")
    name, score_raw, severity = parts
    try:
        score = int(score_raw)
    except ValueError as exc:
        # Re-raise with the offending line so the bad row is easy to locate.
        raise ValueError(f"invalid score (expected an integer): {line}") from exc
    return {
        "name": name,
        "score": score,
        "severity": severity,
        # WHY is_high_risk for system design? -- In a staff engineer context,
        # "warn" and "critical" map to architectural risks: high coupling,
        # single points of failure, scalability bottlenecks. Low scores
        # indicate areas where system-level decisions need to be made.
        "is_high_risk": severity in {"warn", "critical"} or score < 5,
    }


def build_summary(records: list[dict[str, Any]], project_title: str, run_id: str) -> dict[str, Any]:
    """Build the summary payload for testing and teach-back review.

    Every field is deterministic for a given input except generated_utc,
    which records the wall-clock time of the run.
    """
    if not records:
        # Guard against division by zero when called outside main().
        raise ValueError("cannot summarize an empty record list")
    high_risk_count = sum(1 for record in records if record["is_high_risk"])
    avg_score = round(sum(record["score"] for record in records) / len(records), 2)
    return {
        "project_title": project_title,
        "run_id": run_id,
        "generated_utc": datetime.now(timezone.utc).isoformat(),
        "record_count": len(records),
        "high_risk_count": high_risk_count,
        "average_score": avg_score,
        "records": records,
    }


def write_summary(output_path: Path, payload: dict[str, Any]) -> None:
    """Write JSON output with parent directory creation for first-time runs."""
    output_path.parent.mkdir(parents=True, exist_ok=True)
    output_path.write_text(json.dumps(payload, indent=2), encoding="utf-8")


def main() -> int:
    """Execute end-to-end project run."""
    args = parse_args()
    input_path = Path(args.input)
    output_path = Path(args.output)
    lines = load_lines(input_path)
    records = [classify_line(line) for line in lines]
    payload = build_summary(records, "Staff Engineer Capstone", args.run_id)
    write_summary(output_path, payload)
    print(f"output_summary.json written to {output_path}")
    return 0


if __name__ == "__main__":
    raise SystemExit(main())
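
To make the input format and the risk rule concrete, here is a minimal sketch that restates the rule from `classify_line` in a compact, standalone form and applies it to three made-up lines. The sample names, scores, and the helper name `classify` are illustrative assumptions, not data or code from the project.

```python
from typing import Any

# Hypothetical sample input: one "name,score,severity" record per line.
SAMPLE_LINES = [
    "auth-service,8,ok",
    "billing-queue,3,ok",
    "search-index,9,critical",
]


def classify(line: str) -> dict[str, Any]:
    """Compact restatement of the rule used by classify_line."""
    name, score_raw, severity = [piece.strip() for piece in line.split(",")]
    score = int(score_raw)
    return {
        "name": name,
        "score": score,
        "severity": severity,
        # High risk if the severity is elevated OR the score is below 5.
        "is_high_risk": severity in {"warn", "critical"} or score < 5,
    }


records = [classify(line) for line in SAMPLE_LINES]
high_risk_names = [r["name"] for r in records if r["is_high_risk"]]
average_score = round(sum(r["score"] for r in records) / len(records), 2)

print(high_risk_names)  # ['billing-queue', 'search-index']
print(average_score)    # 6.67
```

Note that `billing-queue` is flagged purely because of its low score, while `search-index` is flagged purely because of its severity. Either condition alone is enough.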
Design decisions
| Decision | Why | Alternative considered |
|---|---|---|
| Integrative capstone that touches all prior domains | Staff engineers must synthesize across security, reliability, performance, and operations | Deep-dive into a single domain -- does not exercise the cross-cutting system-level thinking |
| Deterministic pipeline as the foundation | Every concept from the elite track (algorithms, concurrency, caching, auth, profiling, events, SLOs, compliance, OSS) can be expressed through this pipeline | Free-form design exercise -- harder to evaluate and compare across learners |
| Risk scoring across domains | System-level thinking requires seeing risk holistically; a single high-risk item in security affects the entire system | Per-domain scoring -- more detailed but misses cross-domain interactions |
| JSON evidence for decision documentation | Staff engineers must document and defend their decisions; structured output enables review and critique | Presentation slides -- less rigorous, harder to version-control |
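
The "deterministic pipeline" decision has one wrinkle: the summary's `generated_utc` field comes from the wall clock, so two runs over the same input differ in that field. One possible way to keep tests reproducible is to inject the timestamp. The sketch below assumes you are free to add an optional `now` parameter; both that parameter and the function name `build_summary_at` are hypothetical and not part of the solution.

```python
import json
from datetime import datetime, timezone
from typing import Any, Optional


def build_summary_at(
    records: list[dict[str, Any]],
    run_id: str,
    now: Optional[datetime] = None,  # hypothetical parameter for tests
) -> dict[str, Any]:
    """Variant of build_summary that accepts a fixed timestamp."""
    stamp = now if now is not None else datetime.now(timezone.utc)
    return {
        "run_id": run_id,
        "generated_utc": stamp.isoformat(),
        "record_count": len(records),
        "high_risk_count": sum(1 for r in records if r["is_high_risk"]),
        "average_score": round(sum(r["score"] for r in records) / len(records), 2),
        "records": records,
    }


# With a fixed clock, two runs serialize to identical JSON.
fixed = datetime(2024, 1, 1, tzinfo=timezone.utc)
records = [{"name": "api", "score": 4, "severity": "ok", "is_high_risk": True}]
first = json.dumps(build_summary_at(records, "test-run", now=fixed), sort_keys=True)
second = json.dumps(build_summary_at(records, "test-run", now=fixed), sort_keys=True)
print(first == second)  # True
```

An alternative with no signature change is for tests to compare every field except `generated_utc`.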
Alternative approaches
Approach B: Architecture Decision Record (ADR) generator
from dataclasses import dataclass
from datetime import date


@dataclass
class ADR:
    """Architecture Decision Record -- documents a key technical decision."""

    number: int
    title: str
    status: str  # "proposed", "accepted", "deprecated", "superseded"
    context: str  # What is the issue that motivated this decision?
    decision: str  # What is the change that is being proposed?
    consequences: list[str]  # What becomes easier or harder because of this decision?
    date: str = ""

    def __post_init__(self):
        # Inside a method, `date` resolves to the imported datetime.date class,
        # not to the `date` field declared above.
        if not self.date:
            self.date = date.today().isoformat()

    def to_markdown(self) -> str:
        """Render the record as a Markdown document."""
        consequences = "\n".join(f"- {c}" for c in self.consequences)
        return f"""# ADR-{self.number:04d}: {self.title}

**Status:** {self.status}
**Date:** {self.date}

## Context
{self.context}

## Decision
{self.decision}

## Consequences
{consequences}
"""
Trade-off: Writing ADRs is a core staff engineer skill -- documenting decisions with context, alternatives, and consequences ensures institutional knowledge survives team changes. However, it is a documentation exercise rather than a systems engineering exercise. The pipeline capstone tests both the technical foundation (data processing, validation, output) and the ability to reason about system-level trade-offs through the "Explain it" section.
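
If you do experiment with the ADR approach, rendered records need stable file names so they can be version-controlled. The sketch below shows one possible naming scheme: a zero-padded number (matching the `ADR-{number:04d}` heading format) followed by a slug of the title. The helper `adr_filename` and the naming convention itself are assumptions for illustration, not part of the capstone.

```python
import re


def adr_filename(number: int, title: str) -> str:
    """Build a stable file name from an ADR number and title."""
    # Lowercase the title and collapse every run of non-alphanumerics to "-".
    slug = re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")
    return f"{number:04d}-{slug}.md"


print(adr_filename(7, "Adopt JSON summaries for capstone evidence"))
# 0007-adopt-json-summaries-for-capstone-evidence.md
```

Zero-padding keeps the files sorted in decision order in a directory listing.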
Common pitfalls
| Scenario | What happens | Prevention |
|---|---|---|
| Optimizing for a single quality attribute (e.g., performance) | Other attributes degrade (reliability, security, maintainability); the system becomes fragile | Use architecture fitness functions to track multiple attributes simultaneously |
| Making decisions without documenting context | When the original team leaves, the next team does not understand why decisions were made and reverses them | Write ADRs for every significant technical decision; store them in the repository |
| Scope creep in system design | Trying to solve every problem at once leads to an unshippable monolith | Define clear boundaries and milestones; ship incrementally and validate at each stage |
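
The first row recommends architecture fitness functions. As a rough sketch of the idea, a fitness function is an automated check that compares a measured quality attribute against a threshold, and running several together shows when improving one attribute degrades another. The metric names, thresholds, and the `FitnessCheck` type below are all hypothetical, chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class FitnessCheck:
    """One measurable quality attribute with a pass/fail rule."""

    attribute: str
    passes: Callable[[float], bool]


# Hypothetical metrics and thresholds; real values depend on your system.
CHECKS = {
    "p95_latency_ms": FitnessCheck("performance", lambda v: v <= 250),
    "error_rate_pct": FitnessCheck("reliability", lambda v: v <= 1.0),
    "open_critical_vulns": FitnessCheck("security", lambda v: v == 0),
}


def failing_attributes(measurements: dict[str, float]) -> list[str]:
    """Return attributes whose checks fail or have no measurement."""
    failed = []
    for metric, check in CHECKS.items():
        value = measurements.get(metric)
        # A missing measurement counts as a failure rather than a silent pass.
        if value is None or not check.passes(value):
            failed.append(check.attribute)
    return failed


# A change that improves latency but introduces a critical vulnerability.
result = failing_attributes(
    {"p95_latency_ms": 120, "error_rate_pct": 0.4, "open_critical_vulns": 1}
)
print(result)  # ['security']
```

Treating a missing measurement as a failure is a deliberate choice in this sketch: an attribute nobody measures is exactly the one that tends to degrade unnoticed.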