Module 02 / Project 04 -- File Processor CLI

Learn Your Way

Read	Build	Watch	Test	Review	Visualize	Try
—	This project	—	—	Flashcards	—	—

Focus

Building a practical CLI that does real work on files
Using rich.progress for progress bars
File globbing with pathlib.Path.glob()
Writing structured output to a report file

Why this project exists

The previous projects taught you Click's API in isolation. This project puts it all together into something you could actually use: a tool that scans a directory of text files, computes statistics for each one (word count, longest line, average line length), and shows a progress bar while it works. It also introduces Rich, a widely used library for styled terminal output in Python.
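The three statistics named above can be computed with the standard library alone. The sketch below is one possible approach, not necessarily how project.py does it: the function name `text_stats` and the dictionary keys are made up for illustration, and it assumes that words are whitespace-separated and that the average is taken over all lines, blank ones included.

```python
def text_stats(text: str) -> dict:
    """Return word count, longest line length, and average line length."""
    lines = text.splitlines()
    line_lengths = [len(line) for line in lines]
    return {
        # Assumption: a "word" is any run of non-whitespace characters.
        "words": len(text.split()),
        "longest_line": max(line_lengths, default=0),
        # Guard against empty input so we never divide by zero.
        "avg_line_length": sum(line_lengths) / len(lines) if lines else 0.0,
    }


print(text_stats("hello world\nfoo\n"))
# {'words': 3, 'longest_line': 11, 'avg_line_length': 7.0}
```

Keeping the calculation in a function that takes a string, rather than a path, makes it easy to test without touching the file system.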

Run

cd projects/modules/02-cli-tools/04-file-processor-cli

# Process the sample data files
python project.py --directory data

# Process and save a report
python project.py --directory data --output report.txt

# Process only .txt files (the default)
python project.py --directory data --pattern "*.txt"

Expected output

$ python project.py --directory data
Processing files...
 ━━━━━━━━━━━━━━━━━━━━ 100% 3/3

Results:
────────────────────────────────
sample1.txt
  Words: 52   Longest line: 68 chars   Avg line length: 38.2 chars

sample2.txt
  Words: 41   Longest line: 55 chars   Avg line length: 32.7 chars

sample3.txt
  Words: 67   Longest line: 72 chars   Avg line length: 41.5 chars

────────────────────────────────
Total files: 3
Total words: 160

(Exact numbers depend on the sample files.)
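A layout like the one above can be produced by building the report as a single string, which can then be printed or written to the path given by --output. This is a minimal sketch under stated assumptions: each result is a dict with the hypothetical keys `name`, `words`, `longest_line`, and `avg_line_length`, and `format_report` is an illustrative name rather than a function from project.py.

```python
import tempfile
from pathlib import Path

RULE = "─" * 32  # horizontal rule similar to the one in the sample output


def format_report(results: list[dict]) -> str:
    """Build the report text from a list of per-file result dicts."""
    lines = ["Results:", RULE]
    for r in results:
        lines.append(r["name"])
        lines.append(
            f"  Words: {r['words']}   "
            f"Longest line: {r['longest_line']} chars   "
            f"Avg line length: {r['avg_line_length']:.1f} chars"
        )
        lines.append("")  # blank line between files
    lines.append(RULE)
    lines.append(f"Total files: {len(results)}")
    lines.append(f"Total words: {sum(r['words'] for r in results)}")
    return "\n".join(lines) + "\n"


results = [
    {"name": "sample1.txt", "words": 52, "longest_line": 68, "avg_line_length": 38.2},
    {"name": "sample2.txt", "words": 41, "longest_line": 55, "avg_line_length": 32.7},
]
report = format_report(results)
print(report, end="")

# Writing the same text to a file; a temporary directory keeps the demo tidy.
with tempfile.TemporaryDirectory() as tmp:
    out_path = Path(tmp) / "report.txt"
    out_path.write_text(report, encoding="utf-8")
    saved = out_path.read_text(encoding="utf-8")
```

Because the terminal output and the file share one string, the two cannot drift apart.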

Alter it

Add a --sort option that sorts results by word count (ascending or descending).
Add a --min-words filter that only includes files with at least N words in the report.
Add a --verbose flag that also prints the first line of each file as a preview.
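The first two changes come down to a filter and a sort over the collected results before they are printed. A minimal sketch, assuming results are dicts with a hypothetical `words` key; `select_results` and its parameters are illustrative names, and wiring them to Click options is left as part of the exercise.

```python
def select_results(results, min_words=0, descending=False):
    """Drop files below min_words, then order the rest by word count."""
    kept = [r for r in results if r["words"] >= min_words]
    return sorted(kept, key=lambda r: r["words"], reverse=descending)


results = [
    {"name": "sample1.txt", "words": 52},
    {"name": "sample2.txt", "words": 41},
    {"name": "sample3.txt", "words": 67},
]
for r in select_results(results, min_words=50, descending=True):
    print(r["name"], r["words"])
# sample3.txt 67
# sample1.txt 52
```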

Break it

Point --directory at a folder that does not exist. What happens?
Put a binary file (like a .png) in the data folder and run the processor. What error do you get?
Remove the rich import and try to run the script. How does the error message help you diagnose the issue?

Fix it

Add a check for missing directories that prints a helpful error and exits cleanly instead of crashing with a traceback.
Wrap the file-reading code in a try/except that skips binary or unreadable files with a warning.
Reinstall rich (pip install rich) and confirm the progress bar reappears.
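The first two fixes could look roughly like the sketch below. It is one possible shape, not the project's actual code: `read_text_files` is a hypothetical helper, and it assumes files are expected to be UTF-8 text, so anything that fails to decode is treated as binary and skipped.

```python
import sys
import tempfile
from pathlib import Path


def read_text_files(directory, pattern="*.txt"):
    """Return {filename: text} for readable files, skipping the rest."""
    root = Path(directory)
    if not root.is_dir():
        # Fail early with a clear message instead of a traceback later.
        raise SystemExit(f"Error: directory not found: {root}")

    texts = {}
    for path in sorted(root.glob(pattern)):
        try:
            texts[path.name] = path.read_text(encoding="utf-8")
        except (UnicodeDecodeError, OSError) as exc:
            # Binary data usually fails UTF-8 decoding;
            # permission problems and similar raise OSError.
            print(f"Warning: skipping {path.name}: {exc}", file=sys.stderr)
    return texts


# Demo: one valid text file and one file holding PNG-like bytes.
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "good.txt").write_text("hello\n", encoding="utf-8")
    Path(tmp, "bad.txt").write_bytes(b"\x89PNG\r\n\x1a\n\xff\xfe")
    print(read_text_files(tmp))
# {'good.txt': 'hello\n'}
```

Raising `SystemExit` with a message prints it to stderr and exits with a non-zero status, which is usually what a CLI wants for a bad argument.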

Explain it

What does pathlib.Path.glob("*.txt") return and how is it different from os.listdir()?
How does rich.progress.track() know when to update the progress bar?
Why is it better to pass --directory as an option than to hardcode a path?
What would you change to make this tool handle thousands of files efficiently?
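For the first question, it can help to look at both calls side by side. The sketch below builds a throwaway directory (the filenames are made up) and prints what each call gives back, so you can compare the results before writing your own explanation.

```python
import os
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    for name in ("a.txt", "b.txt", "notes.md"):
        Path(tmp, name).write_text("example\n", encoding="utf-8")

    matches = Path(tmp).glob("*.txt")
    # glob() hands back a lazy iterator rather than a list.
    print(isinstance(matches, list))  # False

    # Each item from glob() is a Path object that matched the pattern.
    glob_names = sorted(p.name for p in matches)
    # os.listdir() returns every entry as a plain string, unfiltered.
    listed = sorted(os.listdir(tmp))

    print(glob_names)  # ['a.txt', 'b.txt']
    print(listed)      # ['a.txt', 'b.txt', 'notes.md']
```

The lazy iterator is also worth keeping in mind for the last question, since it lets a loop start work before the whole directory has been collected into memory.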

Mastery check

You can move on when you can:

- use Path.glob() to find files matching a pattern
- add a Rich progress bar to any loop
- write a CLI that reads from a directory and writes a report
- handle missing or unreadable files without crashing

Next

Continue to 05 - Typer Migration.