Level 4 / Project 12 - Checkpoint Recovery Tool
Learn Your Way
| Read | Build | Watch | Test | Review | Visualize | Try |
|---|---|---|---|---|---|---|
| — | This project | — | — | Flashcards | — | Browser |
Estimated time: 70 minutes
Focus
- resume from checkpoints safely
Why this project exists
This project gives you level-appropriate practice in a realistic operations context. Goal: run the baseline, alter behavior, break one assumption, recover safely, and explain the fix.
Run (copy/paste)
Use `<repo-root>` as the folder containing this repository's `README.md`.
cd <repo-root>/projects/level-4/12-checkpoint-recovery-tool
python project.py --input data/sample_input.txt --output data/processed_output.json --checkpoint data/.checkpoint.json --batch-size 3
pytest -q
Expected terminal output
Expected artifacts
- `data/processed_output.json`: processed results
- `data/.checkpoint.json`: checkpoint file (cleared on success)
- Passing tests
- Updated `notes.md`
Checkpoint: Baseline code runs and all tests pass. Commit your work before continuing.
Alter it (required) - Extension
- Add a `--simulate-crash` flag that stops after N items to test recovery.
- Add a progress bar (percentage) logged at each checkpoint.
- Re-run the script and tests, then verify that crash simulation creates a valid checkpoint.
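One possible shape for the crash flag and the progress log, as a minimal sketch. The names `run_batches`, `SimulatedCrash`, and `progress_percent`, the `simulate_crash` parameter, the `{"processed": n}` checkpoint layout, and the use of `str.upper` as a stand-in for `process_item` are all assumptions made for illustration; the real `project.py` may be organized differently.

```python
import json
import logging
from pathlib import Path

logger = logging.getLogger("checkpoint_demo")


class SimulatedCrash(RuntimeError):
    """Raised on purpose to imitate a mid-run failure (hypothetical helper)."""


def progress_percent(done: int, total: int) -> float:
    """Return completion as a percentage, guarding against an empty input."""
    return 100.0 if total == 0 else round(100.0 * done / total, 1)


def run_batches(items, checkpoint_path, batch_size=3, simulate_crash=None):
    """Process items in batches, checkpointing after each completed batch.

    simulate_crash: stop after this many items (assumed semantics for the
    --simulate-crash flag; the exercise leaves the details to you).
    """
    results = []
    for start in range(0, len(items), batch_size):
        for item in items[start:start + batch_size]:
            if simulate_crash is not None and len(results) >= simulate_crash:
                raise SimulatedCrash(f"stopped after {len(results)} items")
            results.append(item.upper())  # stand-in for process_item
        # Plain write for brevity; an atomic temp-file write is safer.
        # Assumed checkpoint layout: only the count of processed items.
        Path(checkpoint_path).write_text(json.dumps({"processed": len(results)}))
        logger.info("checkpoint: %.1f%% complete",
                    progress_percent(len(results), len(items)))
    return results
```

With seven items, a batch size of 3, and `simulate_crash=4`, this sketch raises partway through the second batch, so the checkpoint on disk still records the three items from the last completed batch. That is the state a recovery run would resume from.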
Break it (required) - Core
- Corrupt the checkpoint file (write invalid JSON) and run; observe the "starting fresh" behavior.
- Modify `process_item` to raise an exception on a specific item; verify the checkpoint has progress up to the failure point.
- Set `--batch-size 0` and observe what happens.
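The "starting fresh" behavior on a corrupt checkpoint can be pictured with a sketch like the one below. The function name `load_checkpoint`, the `{"processed": n}` layout, and the log messages are assumptions, not the project's actual code; compare them with what `project.py` really does when you run the experiment.

```python
import json
import logging
from pathlib import Path

logger = logging.getLogger("checkpoint_demo")

EMPTY_STATE = {"processed": 0}  # assumed checkpoint layout


def load_checkpoint(path):
    """Return saved state, or a fresh state if the file is missing or invalid."""
    p = Path(path)
    if not p.exists():
        return dict(EMPTY_STATE)
    try:
        state = json.loads(p.read_text(encoding="utf-8"))
    except (json.JSONDecodeError, UnicodeDecodeError) as exc:
        logger.warning("corrupt checkpoint %s (%s); starting fresh", p, exc)
        return dict(EMPTY_STATE)
    # Valid JSON can still have the wrong shape, e.g. a list or a string.
    if not isinstance(state, dict) or not isinstance(state.get("processed"), int):
        logger.warning("unexpected checkpoint shape in %s; starting fresh", p)
        return dict(EMPTY_STATE)
    return state
```

Falling back to a fresh state trades repeated work for safety: reprocessing from the beginning is slower, but it avoids resuming from a position the tool cannot trust.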
Fix it (required) - Core
- Validate that `batch_size` is positive in `parse_args`.
- Add error handling in `process_item` so one bad item does not crash the whole batch.
- Re-run until all tests pass.
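Both fixes can be sketched as follows. This is one approach under stated assumptions: `positive_int` and `process_batch` are hypothetical helpers, the `process_item` body is a stand-in, and the error handling wraps the call to `process_item` rather than living inside it. Either placement can satisfy the exercise.

```python
import argparse


def positive_int(text):
    """argparse type: accept only integers greater than zero."""
    try:
        value = int(text)
    except ValueError:
        raise argparse.ArgumentTypeError(f"{text!r} is not an integer") from None
    if value <= 0:
        raise argparse.ArgumentTypeError(f"batch size must be positive, got {value}")
    return value


def parse_args(argv=None):
    """Parse only the option relevant here; the real parser has more flags."""
    parser = argparse.ArgumentParser(description="checkpoint demo (sketch)")
    parser.add_argument("--batch-size", type=positive_int, default=3)
    return parser.parse_args(argv)


def process_item(item):
    """Stand-in transformation; raises ValueError on blank input."""
    if not item.strip():
        raise ValueError("blank item")
    return item.strip().upper()


def process_batch(batch):
    """Process each item, collecting failures instead of aborting the batch."""
    results, errors = [], []
    for item in batch:
        try:
            results.append(process_item(item))
        except Exception as exc:  # broad on purpose: isolate any bad item
            errors.append({"item": item, "error": str(exc)})
    return results, errors
```

With this sketch, `--batch-size 0` makes argparse print a usage error and exit with status 2 before any processing starts, and a bad item ends up in the `errors` list while the rest of the batch completes.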
Checkpoint: All modifications done, tests still pass. Good time to review your changes.
Explain it (teach-back)
- Why does `save_checkpoint` write to a `.tmp` file first, then rename?
- What would happen if the process crashed DURING a checkpoint write?
- Why is the checkpoint cleared after successful completion?
- How would you extend this to support parallel processing with checkpoints?
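To ground the first three questions, here is a minimal sketch of the write-then-rename pattern. The function bodies, the `clear_checkpoint` name, and the `fsync` call are assumptions for illustration; the project's own `save_checkpoint` may differ in its details.

```python
import json
import os
from pathlib import Path


def save_checkpoint(path, state):
    """Write state to a sibling .tmp file, then swap it into place."""
    target = Path(path)
    tmp = target.with_name(target.name + ".tmp")
    with open(tmp, "w", encoding="utf-8") as fh:
        json.dump(state, fh)
        fh.flush()
        os.fsync(fh.fileno())  # push bytes to disk before the rename
    # On POSIX, os.replace is atomic when both paths share a filesystem, so a
    # reader sees either the old checkpoint or the new one, not a partial file.
    os.replace(tmp, target)


def clear_checkpoint(path):
    """Remove the checkpoint after a successful run; a missing file is fine."""
    Path(path).unlink(missing_ok=True)  # missing_ok needs Python 3.8+
```

If the process dies while writing, only the `.tmp` file is incomplete and the previous checkpoint is untouched. Clearing the file on success means a later run does not mistake a finished job for an interrupted one.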
Mastery check
You can move on when you can:
- run baseline without docs,
- explain one core function line-by-line,
- break and recover in one session,
- keep tests passing after your change.
Related Concepts