Generators and Iterators — Part 2: Generators


A generator is a function that produces values one at a time using yield instead of return. Generators are the easy way to create iterators — they handle the protocol automatically and let you write lazy, memory-efficient code.

Generators — the easy way to make iterators

Writing a class with __iter__ and __next__ is tedious. A generator function does the same thing with much less code:

def count_up(n):
    """Yield numbers from 1 to n."""
    i = 1
    while i <= n:
        yield i     # Pause here, produce a value
        i += 1      # Resume here on next call

# Usage:
for num in count_up(5):
    print(num)      # 1, 2, 3, 4, 5

When Python hits yield, the function pauses and gives back the value. Next time you ask for a value, it resumes right where it left off.

Calling a generator function does not run its body — it returns a generator object:

gen = count_up(3)       # Nothing runs yet
print(type(gen))        # <class 'generator'>
print(next(gen))        # 1 — now it runs until the first yield
print(next(gen))        # 2
print(next(gen))        # 3
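Once the function body finishes, the generator raises StopIteration — that is how a for loop knows when to stop. A quick sketch:

```python
def count_up(n):
    """Yield numbers from 1 to n."""
    i = 1
    while i <= n:
        yield i
        i += 1

gen = count_up(2)
print(next(gen))    # 1
print(next(gen))    # 2
try:
    next(gen)       # The function body has finished
except StopIteration:
    print("exhausted")
```

A for loop catches StopIteration for you, which is why you rarely see it in everyday code.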

Generator expressions — one-liners

Just like list comprehensions create lists, generator expressions create generators:

# List comprehension — builds the entire list in memory
squares_list = [x**2 for x in range(1_000_000)]    # The list alone is ~8 MB, plus the int objects

# Generator expression — computes one value at a time
squares_gen = (x**2 for x in range(1_000_000))      # Uses ~100 bytes

The only syntax difference is () instead of []. Use generator expressions when you only need to iterate once and do not need to index or re-use the data.

# Common pattern: pass directly to a function
total = sum(x**2 for x in range(100))    # No extra brackets needed
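This pairs especially well with short-circuiting builtins like any(), which stop consuming the generator at the first match, so later values are never computed. A small sketch:

```python
# any() consumes the generator lazily and stops at the first match
words = ["apple", "banana", "cherry"]
has_long = any(len(w) > 5 for w in words)
print(has_long)    # True: "banana" is the first word longer than 5 characters
```

Here len("cherry") is never evaluated, because any() returns as soon as "banana" matches.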

Chaining generators (pipelines)

Generators compose beautifully into processing pipelines:

def read_lines(path):
    with open(path) as f:
        for line in f:
            yield line.strip()

def filter_errors(lines):
    for line in lines:
        if "ERROR" in line:
            yield line

def extract_timestamp(lines):
    for line in lines:
        yield line.split(" ")[0]    # First word is the timestamp

# Chain them together — no intermediate lists:
lines = read_lines("server.log")
errors = filter_errors(lines)
timestamps = extract_timestamp(errors)

for ts in timestamps:
    print(ts)

Each generator pulls one value at a time from the previous one. The entire file is never loaded into memory.
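To try the pipeline without a real server.log, here is a self-contained sketch with an in-memory list standing in for the file (the sample log lines are made up):

```python
def filter_errors(lines):
    for line in lines:
        if "ERROR" in line:
            yield line

def extract_timestamp(lines):
    for line in lines:
        yield line.split(" ")[0]    # First word is the timestamp

# In-memory stand-in for the log file
log = [
    "12:00:01 INFO  startup complete",
    "12:00:05 ERROR disk full",
    "12:00:09 ERROR connection lost",
]

timestamps = extract_timestamp(filter_errors(log))
print(list(timestamps))    # ['12:00:05', '12:00:09']
```

The stages work the same whether they are fed a list, a file object, or another generator — each one only requires something iterable.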

yield from — delegating to another generator

When one generator needs to yield all values from another, use yield from:

def count_up(n):
    for i in range(1, n + 1):
        yield i

def count_down(n):
    for i in range(n, 0, -1):
        yield i

def up_and_down(n):
    yield from count_up(n)       # Yields 1, 2, ..., n
    yield from count_down(n)     # Then n, n-1, ..., 1


list(up_and_down(3))    # [1, 2, 3, 3, 2, 1]

Without yield from, you would need a loop: for x in count_up(n): yield x.
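yield from also shines in recursive generators. A sketch of a hypothetical flatten helper that walks arbitrarily nested lists:

```python
def flatten(items):
    """Yield leaf values from arbitrarily nested lists."""
    for item in items:
        if isinstance(item, list):
            yield from flatten(item)    # Delegate to the recursive call
        else:
            yield item

print(list(flatten([1, [2, [3, 4]], 5])))    # [1, 2, 3, 4, 5]
```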

send() and throw() — advanced two-way communication

Generators can receive values, not just produce them:

def running_average():
    total = 0
    count = 0
    average = None
    while True:
        value = yield average    # Send out the average, then receive the next value
        total += value
        count += 1
        average = total / count


avg = running_average()
next(avg)              # Prime the generator (advance to first yield)
print(avg.send(10))    # 10.0
print(avg.send(20))    # 15.0
print(avg.send(30))    # 20.0

This is an advanced pattern. You will not need it often, but it powers frameworks like asyncio under the hood.
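throw() is the counterpart to send(): it raises an exception inside the generator at the point where it is paused. A minimal sketch (resilient_counter is a hypothetical example that resets itself when a ValueError is thrown in):

```python
def resilient_counter():
    n = 0
    while True:
        try:
            yield n
            n += 1
        except ValueError:
            n = 0    # Reset when a ValueError is thrown into the generator

gen = resilient_counter()
print(next(gen))              # 0
print(next(gen))              # 1
print(gen.throw(ValueError))  # The generator catches it, resets, and yields 0
```

If the generator does not catch the exception, it propagates back out to the caller of throw().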

Common Mistakes

Forgetting that generators are lazy:

# This does NOT print anything yet:
gen = (print(x) for x in range(5))

# Only prints when you consume it:
list(gen)    # NOW it prints 0, 1, 2, 3, 4
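A related mistake is iterating a generator twice — once exhausted, it silently produces nothing:

```python
gen = (x * 2 for x in range(3))
print(sum(gen))    # 6
print(sum(gen))    # 0, because the generator is already exhausted
```

If you need to iterate the data more than once, build a list instead.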

