Level 1 / Project 15 - Level 1 Mini Automation¶
Home: README
Learn Your Way¶
| Read | Build | Watch | Test | Review | Visualize | Try |
|---|---|---|---|---|---|---|
| Concept | This project | Walkthrough | Quiz | Flashcards | — | Browser |
Estimated time: 40 minutes
Focus¶
- multi-step beginner automation flow
Why this project exists¶
Build a multi-step data pipeline that parses, validates, transforms, filters, and summarises records. This Level 1 capstone combines everything you have learned into a realistic ETL (Extract-Transform-Load) workflow.
Run (copy/paste)¶
Use <repo-root> as the folder containing this repository's README.md.
cd <repo-root>/projects/level-1/15-level1-mini-automation
python project.py --input data/sample_input.txt
pytest -q
Expected terminal output¶
=== Automation Pipeline ===
Step 1: Parsed 3 records
Step 2: Validated 3 records (0 errors)
Step 3: Transformed values
Step 4: Filtered to 2 active records
Step 5: Summary -- total value: $350.00
Output written to data/output.json
8 passed
Expected artifacts¶
data/output.json- Passing tests
- Updated
notes.md
Checkpoint: Baseline code runs and all tests pass. Commit your work before continuing.
Alter it (required) — Extension¶
- Add a Step 6:
step_export_csv()that writes active records to a CSV file. - Add a
--verboseflag that prints the result of each pipeline step as it executes. - Re-run script and tests.
Break it (required) — Core¶
- Add a line with only 2 pipe-separated values -- does
step_parse_records()skip it or crash? - Add a record with a non-numeric value field -- does
step_transform()use the default 0.0? - Use a file where all records have status "failed" -- does
step_summarise()handle an empty active list?
Fix it (required) — Core¶
- Ensure
step_parse_records()skips lines with fewer than 3 pipe-separated fields. - Handle the all-filtered-out case in
step_summarise()by returning zero counts. - Add a test for the non-numeric-value fallback.
Checkpoint: All modifications done, tests still pass. Good time to review your changes.
Explain it (teach-back)¶
- Why is the pipeline split into 5 separate
step_*functions instead of one big function? - What is the "pipeline pattern" and why does each step take input and return output?
- Why does
run_pipeline()track counts at each step (total_lines, parsed, active)? - Where would multi-step pipelines appear in real software (ETL systems, CI/CD, data processing)?
Mastery check¶
You can move on when you can: - run baseline without docs, - explain one core function line-by-line, - break and recover in one session, - keep tests passing after your change.
Related Concepts¶
| ← Prev | Home | Next → |
|---|---|---|