Level 4 / Project 11 - Audit Log Enhancer

Estimated time: 65 minutes

Focus

Rich event logging for traceability.

Why this project exists

This project gives you Level 4 practice with audit log enrichment in a realistic operations context. The goal is to run the baseline, alter its behavior, break one assumption, recover safely, and explain the fix.

Run (copy/paste)

Replace <repo-root> with the path to the folder containing this repository's README.md.

cd <repo-root>/projects/level-4/11-audit-log-enhancer
python project.py --input data/sample_input.jsonl --output data/enriched_logs.jsonl
pytest -q

Expected terminal output

{
  "total_entries": 6,
  "enriched": 6,
  "severity_counts": {"LOW": 3, "HIGH": 1, "MEDIUM": 1, ...}
}
7 passed

Expected artifacts

data/enriched_logs.jsonl — enriched log entries (JSON lines)
Passing tests
Updated notes.md

Checkpoint: Baseline code runs and all tests pass. Commit your work before continuing.

Alter it (required) — Extension

Add a --severity-filter flag to only output entries at or above a given severity.
Add a duration_category field: "fast" (<100 ms), "normal" (100 ms to <1000 ms), "slow" (>=1000 ms).
Re-run the script and tests, then add a parametrized test for duration categories.
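
A minimal sketch of both extensions, assuming each entry is a dict with hypothetical `duration_ms` and `severity` keys and that severities rank LOW < MEDIUM < HIGH. The field names and the full set of severity levels in project.py may differ, so treat the names below as placeholders.

```python
# Sketch only: key names and severity order are assumptions, not taken from project.py.
# Add any further severity levels the project defines to this mapping.
SEVERITY_RANK = {"LOW": 0, "MEDIUM": 1, "HIGH": 2}


def duration_category(duration_ms: float) -> str:
    """Bucket a duration in milliseconds into fast / normal / slow."""
    if duration_ms < 100:
        return "fast"
    if duration_ms < 1000:
        return "normal"
    return "slow"


def passes_severity_filter(entry: dict, minimum: str) -> bool:
    """Return True if the entry's severity is at or above `minimum`."""
    # Unknown or missing severities rank below everything and are filtered out.
    rank = SEVERITY_RANK.get(entry.get("severity"), -1)
    return rank >= SEVERITY_RANK[minimum]


# Boundary cases worth covering in a parametrized test.
DURATION_CASES = [
    (0, "fast"),
    (99, "fast"),
    (100, "normal"),
    (999, "normal"),
    (1000, "slow"),
]
```

The `DURATION_CASES` pairs can be fed to `pytest.mark.parametrize("duration_ms, expected", DURATION_CASES)`. The `--severity-filter` flag itself could be registered with `argparse` using `choices=list(SEVERITY_RANK)` so that invalid levels are rejected at the command line.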

Break it (required) — Core

Add a log entry with malformed JSON and confirm it is skipped gracefully.
Add entries with no session_id and observe how correlation IDs are assigned.
Use timestamps in different timezone formats and see whether the duration calculation handles them.
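
One way to set up these three experiments is to generate a throwaway input file rather than editing data/sample_input.jsonl by hand. The sketch below assumes hypothetical `timestamp` and `event` keys alongside `session_id`; adjust them to match the sample data.

```python
import json
import tempfile
from pathlib import Path

# Hypothetical entries: only `session_id` is named in this project; the
# `timestamp` and `event` keys are assumptions for illustration.
entries = [
    {"timestamp": "2024-01-01T10:00:00Z", "session_id": "s1", "event": "login"},
    {"timestamp": "2024-01-01T12:00:05+02:00", "session_id": "s1", "event": "view"},
    # Timezone-naive timestamp and no session_id.
    {"timestamp": "2024-01-01 10:00:09", "event": "no_session"},
]


def write_broken_input(path: Path) -> int:
    """Write the entries plus one malformed line; return the number of lines written."""
    lines = [json.dumps(entry) for entry in entries]
    # Insert a truncated JSON object as the second line.
    lines.insert(1, '{"timestamp": "2024-01-01T10:00:01Z", "event": ')
    path.write_text("\n".join(lines) + "\n", encoding="utf-8")
    return len(lines)


broken_path = Path(tempfile.mkdtemp()) / "broken_input.jsonl"
line_count = write_broken_input(broken_path)
print(f"Wrote {line_count} lines to {broken_path}")
```

Passing the printed path to `python project.py --input <path> --output data/enriched_logs.jsonl` exercises all three cases in one run.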

Fix it (required) — Core

Handle timezone-naive timestamps by assuming UTC.
Add a count of skipped malformed lines to the summary.
Re-run until all tests pass.
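
The two fixes could look roughly like the sketch below. The helper names `parse_timestamp` and `load_entries` are hypothetical and are not taken from project.py, and the timestamps are assumed to be ISO 8601 strings.

```python
import json
from datetime import datetime, timezone


def parse_timestamp(value: str) -> datetime:
    """Parse an ISO 8601 timestamp, treating timezone-naive values as UTC."""
    # fromisoformat() only accepts a trailing "Z" from Python 3.11 onward,
    # so normalize it to an explicit offset first.
    if value.endswith("Z"):
        value = value[:-1] + "+00:00"
    parsed = datetime.fromisoformat(value)
    if parsed.tzinfo is None:
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed


def load_entries(lines):
    """Return (entries, skipped), where skipped counts malformed JSON lines."""
    entries, skipped = [], 0
    for line in lines:
        if not line.strip():
            continue  # blank lines are ignored, not counted as malformed
        try:
            entries.append(json.loads(line))
        except json.JSONDecodeError:
            skipped += 1
    return entries, skipped
```

Once every timestamp is timezone-aware, subtracting two of them gives a correct duration even when the inputs use different offsets. The `skipped` count can then be added to the summary under a key of your choosing.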

Checkpoint: All modifications done, tests still pass. Good time to review your changes.

Explain it (teach-back)

What is a correlation ID and why is it useful for debugging?
Why does enrich_entry make a shallow copy instead of modifying the original dict?
What is the JSON Lines format and why is it preferred over a JSON array for logs?
How would you handle enrichment of millions of log entries efficiently?
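
As a starting point for thinking about these questions, the sketch below shows an enrichment step that copies each entry before adding fields, and a generator that processes one JSON line at a time. This `enrich_entry` is a simplified stand-in, not the project's implementation, and the `correlation_id` fallback scheme is an assumption.

```python
import json
from typing import Iterable, Iterator


def enrich_entry(entry: dict, index: int) -> dict:
    """Return an enriched copy; the caller's dict is left untouched."""
    enriched = dict(entry)  # shallow copy: new top-level dict, shared nested values
    # Assumed scheme: reuse session_id when present, else fall back to the line index.
    enriched["correlation_id"] = entry.get("session_id", f"line-{index}")
    return enriched


def enrich_stream(lines: Iterable[str]) -> Iterator[str]:
    """Yield enriched JSON lines one at a time without loading the whole input."""
    for index, line in enumerate(lines):
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip malformed (or blank) lines
        if not isinstance(entry, dict):
            continue  # skip valid JSON that is not an object
        yield json.dumps(enrich_entry(entry, index))
```

Because an open file is itself an iterable of lines, passing it to `enrich_stream` keeps only one entry in memory at a time. A single JSON array would normally need to be parsed in full before the first entry could be processed.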

Mastery check

You can move on when you can:

- run the baseline without docs,
- explain one core function line by line,
- break and recover in one session,
- keep tests passing after your change.
