Level 4 / Project 11 - Audit Log Enhancer¶
Learn Your Way¶
| Read | Build | Watch | Test | Review | Visualize | Try |
|---|---|---|---|---|---|---|
| — | This project | — | — | Flashcards | — | Browser |
Estimated time: 65 minutes
Focus¶
- rich event logging for traceability
Why this project exists¶
This project gives you level-appropriate practice in a realistic operations context. Goal: run the baseline, alter behavior, break one assumption, recover safely, and explain the fix.
Run (copy/paste)¶
Use <repo-root> as the folder containing this repository's README.md.
cd <repo-root>/projects/level-4/11-audit-log-enhancer
python project.py --input data/sample_input.jsonl --output data/enriched_logs.jsonl
pytest -q
Expected terminal output¶
{
"total_entries": 6,
"enriched": 6,
"severity_counts": {"LOW": 3, "HIGH": 1, "MEDIUM": 1, ...}
}
7 passed
Expected artifacts¶
- data/enriched_logs.jsonl: enriched log entries (JSON Lines)
- Passing tests
- Updated notes.md
Checkpoint: Baseline code runs and all tests pass. Commit your work before continuing.
Alter it (required) — Extension¶
- Add a --severity-filter flag to only output entries at or above a given severity.
- Add a duration_category field: "fast" (<100ms), "normal" (<1000ms), "slow" (>=1000ms).
- Re-run the script and tests, then add a parametrized test for duration categories.
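One possible shape for the two extensions is sketched below. It is not the project's actual code: the helper names, the `severity` field name, and the severity ranking (including the `CRITICAL` level) are assumptions made for illustration. Only the three duration thresholds come from the task description.

```python
# Assumed ranking; the real project may define different or additional levels.
SEVERITY_ORDER = {"LOW": 0, "MEDIUM": 1, "HIGH": 2, "CRITICAL": 3}


def duration_category(duration_ms: float) -> str:
    """Bucket a duration in milliseconds using the thresholds from the task."""
    if duration_ms < 100:
        return "fast"
    if duration_ms < 1000:
        return "normal"
    return "slow"


def meets_severity(entry: dict, minimum: str) -> bool:
    """Return True when the entry's severity is at or above `minimum`.

    Entries with a missing or unknown severity are treated as not matching.
    """
    level = SEVERITY_ORDER.get(str(entry.get("severity", "")).upper())
    return level is not None and level >= SEVERITY_ORDER[minimum.upper()]
```

The boundary values (99, 100, 999, 1000) are natural cases to feed into `pytest.mark.parametrize` for the duration test.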
Break it (required) — Core¶
- Add a log entry with malformed JSON and confirm it is skipped gracefully.
- Add entries with no session_id and observe how correlation IDs are assigned.
- Use timestamps in different timezone formats and see if duration calculation handles them.
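To see what the timezone experiment might expose, the sketch below uses a deliberately naive duration helper. The function names are hypothetical and the project's own calculation may differ. It shows that two offset-aware timestamps subtract correctly across offsets, while mixing a timezone-naive value with an aware one raises a `TypeError`.

```python
from datetime import datetime


def naive_duration_ms(start: str, end: str) -> float:
    """Duration between two ISO 8601 strings, with no timezone handling."""
    delta = datetime.fromisoformat(end) - datetime.fromisoformat(start)
    return delta.total_seconds() * 1000


def probe(start: str, end: str):
    """Return the duration, or a short description of the failure."""
    try:
        return naive_duration_ms(start, end)
    except (TypeError, ValueError) as exc:
        return f"{type(exc).__name__}: {exc}"


# Both values carry an offset, so the subtraction works across timezones.
print(probe("2024-01-01T10:00:00+00:00", "2024-01-01T12:00:01+02:00"))
# One value has no offset, so Python refuses to subtract them.
print(probe("2024-01-01T10:00:00", "2024-01-01T10:00:01+00:00"))
```

Note that `datetime.fromisoformat` only accepts a trailing `Z` from Python 3.11 onward, so that format is another one worth probing on older interpreters.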
Fix it (required) — Core¶
- Handle timezone-naive timestamps by assuming UTC.
- Add a count of skipped malformed lines to the summary.
- Re-run until all tests pass.
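A minimal sketch of both fixes follows, assuming timestamps are ISO 8601 strings and input arrives as an iterable of lines. `parse_timestamp` and `load_entries` are hypothetical helper names, not functions from the project.

```python
import json
from datetime import datetime, timezone
from typing import Iterable


def parse_timestamp(value: str) -> datetime:
    """Parse an ISO 8601 string; treat timezone-naive values as UTC."""
    parsed = datetime.fromisoformat(value)
    if parsed.tzinfo is None:
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed


def load_entries(lines: Iterable[str]) -> tuple[list[dict], int]:
    """Return (valid entries, number of malformed lines skipped)."""
    entries: list[dict] = []
    skipped = 0
    for line in lines:
        text = line.strip()
        if not text:
            continue  # blank lines are ignored rather than counted
        try:
            record = json.loads(text)
        except json.JSONDecodeError:
            skipped += 1
            continue
        if isinstance(record, dict):
            entries.append(record)
        else:
            skipped += 1  # valid JSON, but not a log object
    return entries, skipped
```

The returned count can then be added to the summary dict under whatever key you choose. Because every parsed timestamp ends up offset-aware, subtracting any two of them no longer raises a `TypeError`.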
Checkpoint: All modifications done, tests still pass. Good time to review your changes.
Explain it (teach-back)¶
- What is a correlation ID and why is it useful for debugging?
- Why does enrich_entry make a shallow copy instead of modifying the original dict?
- What is the JSON Lines format and why is it preferred over a JSON array for logs?
- How would you handle enrichment of millions of log entries efficiently?
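As a starting point for the last three questions, here is a hedged sketch of copy-then-enrich combined with line-by-line streaming. `enrich_copy` and `enrich_stream` are illustrative names, and the rule of reusing session_id as the correlation ID (with a generated fallback) is an assumption, not necessarily how the project's enrich_entry behaves.

```python
import json
import uuid
from typing import Iterable, Iterator


def enrich_copy(entry: dict) -> dict:
    """Return an enriched shallow copy; the caller's dict is left untouched."""
    enriched = dict(entry)  # new top-level dict, nested values are still shared
    # Assumed rule: reuse session_id when present, otherwise generate an ID.
    enriched["correlation_id"] = entry.get("session_id") or f"gen-{uuid.uuid4().hex[:8]}"
    return enriched


def enrich_stream(lines: Iterable[str]) -> Iterator[str]:
    """Lazily enrich JSON Lines input, yielding one output line at a time."""
    for line in lines:
        if line.strip():
            yield json.dumps(enrich_copy(json.loads(line)))
```

Because JSON Lines stores one object per line, a generator like this can be fed an open file handle and written out as it goes, so memory use stays roughly constant regardless of file size. A JSON array would normally have to be parsed in full before the first entry could be processed.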
Mastery check¶
You can move on when you can:
- run baseline without docs,
- explain one core function line-by-line,
- break and recover in one session,
- keep tests passing after your change.