Collections Deep Dive, Part 1: defaultdict, Counter, OrderedDict
Python's collections module provides specialized container types that go beyond the built-in list, dict, and set. This part covers the three dict-like types: Counter, defaultdict, and OrderedDict.
Why This Matters
Every program needs to store and organize data. The built-in types handle most cases, but they have gaps. Need to count how often each word appears? Counter. Need a dict that automatically handles missing keys? defaultdict. Learning these tools saves you from writing (and debugging) boilerplate code.
Counter: count things
Counter is a dict subclass that maps each hashable item to the number of times it appears:
from collections import Counter
# Count letters in a string:
letter_counts = Counter("mississippi")
print(letter_counts)
# Counter({'i': 4, 's': 4, 'p': 2, 'm': 1})
# Count words in a list:
words = ["apple", "banana", "apple", "cherry", "banana", "apple"]
word_counts = Counter(words)
print(word_counts)
# Counter({'apple': 3, 'banana': 2, 'cherry': 1})
# Most common items:
word_counts.most_common(2)
# [('apple', 3), ('banana', 2)]
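Two behaviors beyond the basics above are worth a quick look: reading a missing key from a Counter returns 0 without storing it, and update() adds to existing counts rather than replacing them. A minimal sketch, with illustrative variable names that are not from the original text:

```python
from collections import Counter

inventory = Counter(["apple", "banana", "apple"])

# A missing key reads as 0 and is not stored in the Counter.
missing = inventory["cherry"]
still_absent = "cherry" not in inventory

# update() adds counts from another iterable (or mapping).
inventory.update(["banana", "cherry"])

# elements() repeats each key as many times as its count.
expanded = sorted(inventory.elements())

print(missing, still_absent)  # 0 True
print(inventory["banana"])    # 2
print(expanded)               # ['apple', 'apple', 'banana', 'banana', 'cherry']
```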
Counter supports multiset arithmetic: addition, subtraction, intersection (&), and union (|):
a = Counter("aabbcc")
b = Counter("aabbd")
a + b # Counter({'a': 4, 'b': 4, 'c': 2, 'd': 1})
a - b # Counter({'c': 2}) — only positive counts
a & b # Counter({'a': 2, 'b': 2}) — minimum of each
a | b # Counter({'a': 2, 'b': 2, 'c': 2, 'd': 1}) — maximum of each
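The "only positive counts" note matters in practice: the operators drop zero and negative results, while the in-place subtract() method keeps them. The sketch below contrasts the two, using made-up stock figures:

```python
from collections import Counter

stock = Counter(apples=3, pears=1)
sold = Counter(apples=1, pears=4)

# The - operator drops zero and negative results.
remaining = stock - sold

# subtract() modifies the Counter in place and keeps negative counts.
ledger = Counter(stock)
ledger.subtract(sold)

# Unary + returns a copy with non-positive counts removed.
cleaned = +ledger

print(remaining)        # Counter({'apples': 2})
print(ledger["pears"])  # -3
print(cleaned)          # Counter({'apples': 2})
```

If negative balances carry meaning (for example, oversold items), subtract() is probably the better fit. Otherwise the operators give a cleaner result.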
defaultdict: dicts with automatic defaults
Indexing a missing key in a defaultdict (d[key]) does not raise KeyError. Instead, it calls the default factory, stores the result under that key, and returns it:
from collections import defaultdict
# Group items by category:
animals = [("cat", "Felix"), ("dog", "Rex"), ("cat", "Whiskers"), ("dog", "Buddy")]
groups = defaultdict(list) # Missing keys get an empty list
for category, name in animals:
    groups[category].append(name)
print(groups)
# defaultdict(<class 'list'>, {'cat': ['Felix', 'Whiskers'], 'dog': ['Rex', 'Buddy']})
Compare with regular dict:
# Without defaultdict (verbose):
groups = {}
for category, name in animals:
    if category not in groups:
        groups[category] = []
    groups[category].append(name)
# With defaultdict (clean):
groups = defaultdict(list)
for category, name in animals:
    groups[category].append(name)
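For a one-off grouping, the built-in dict.setdefault() method is a possible middle ground that needs no import. A sketch using the same animals data as above:

```python
animals = [("cat", "Felix"), ("dog", "Rex"), ("cat", "Whiskers"), ("dog", "Buddy")]

groups = {}
for category, name in animals:
    # setdefault returns the existing list, or stores and returns the new one.
    groups.setdefault(category, []).append(name)

print(groups)
# {'cat': ['Felix', 'Whiskers'], 'dog': ['Rex', 'Buddy']}
```

One trade-off: setdefault builds the empty list on every call, even when the key already exists, whereas defaultdict only calls its factory for missing keys.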
Common default factories:
defaultdict(list) # Missing keys → empty list []
defaultdict(int) # Missing keys → 0
defaultdict(set) # Missing keys → empty set set()
defaultdict(str) # Missing keys → empty string ""
defaultdict(dict) # Missing keys → empty dict {}
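The factory does not have to be a built-in type: any callable that takes no arguments should work. The sketch below uses a lambda for a custom default and nests one defaultdict inside another. The names and dates are illustrative only:

```python
from collections import defaultdict

# Any zero-argument callable works as a factory.
status = defaultdict(lambda: "unknown")
status["alice"] = "active"

# Nested counting: each missing outer key gets its own defaultdict(int).
visits = defaultdict(lambda: defaultdict(int))
visits["2024-01-01"]["home"] += 1
visits["2024-01-01"]["home"] += 1
visits["2024-01-02"]["about"] += 1

print(status["bob"])                 # unknown
print(visits["2024-01-01"]["home"])  # 2
```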
Counting with defaultdict(int):
word_count = defaultdict(int)
for word in "the cat sat on the mat".split():
    word_count[word] += 1
# dict(word_count) == {'the': 2, 'cat': 1, 'sat': 1, 'on': 1, 'mat': 1}
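For plain counting, Counter gives the same totals in one call and adds helpers such as most_common(). A short side-by-side comparison, as a sketch:

```python
from collections import Counter, defaultdict

text = "the cat sat on the mat"

# Manual counting with defaultdict(int).
word_count = defaultdict(int)
for word in text.split():
    word_count[word] += 1

# The same counts from Counter.
counter = Counter(text.split())

same = dict(word_count) == dict(counter)
top = counter.most_common(1)

print(same)  # True
print(top)   # [('the', 2)]
```

defaultdict(int) is still a reasonable choice when the loop does more than count, for example summing amounts per key.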
OrderedDict: dict that remembers insertion order
Since Python 3.7, regular dicts maintain insertion order. So when is OrderedDict still useful?
from collections import OrderedDict
# OrderedDict considers order in equality checks:
d1 = OrderedDict([("a", 1), ("b", 2)])
d2 = OrderedDict([("b", 2), ("a", 1)])
d1 == d2 # False — different order!
# Regular dicts do not:
{"a": 1, "b": 2} == {"b": 2, "a": 1} # True
# OrderedDict has move_to_end:
od = OrderedDict([("a", 1), ("b", 2), ("c", 3)])
od.move_to_end("a") # a moves to end: OrderedDict([('b', 2), ('c', 3), ('a', 1)])
od.move_to_end("c", last=False) # c moves to start: OrderedDict([('c', 3), ('b', 2), ('a', 1)])
Use OrderedDict when order matters for equality comparison or when you need move_to_end(). Otherwise, use a regular dict.
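One common use of move_to_end() is a small least-recently-used cache. The sketch below is a hypothetical helper class, not part of the standard library. Besides move_to_end(), it relies on popitem(last=False), an OrderedDict method not covered above that removes the oldest entry:

```python
from collections import OrderedDict


class LRUCache:
    """Hypothetical helper for illustration; not a standard-library class."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._items = OrderedDict()

    def get(self, key, default=None):
        if key not in self._items:
            return default
        self._items.move_to_end(key)  # mark as most recently used
        return self._items[key]

    def put(self, key, value):
        self._items[key] = value
        self._items.move_to_end(key)
        if len(self._items) > self.capacity:
            self._items.popitem(last=False)  # evict the oldest entry

    def keys(self):
        return list(self._items)


cache = LRUCache(capacity=2)
cache.put("a", 1)
cache.put("b", 2)
cache.get("a")       # "a" is now the most recently used
cache.put("c", 3)    # over capacity, so "b" is evicted
print(cache.keys())  # ['a', 'c']
```

For caching function results, functools.lru_cache is usually the simpler option. This class only illustrates the reordering methods.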
Common Mistakes
Forgetting that defaultdict creates entries on access:
d = defaultdict(list)
if d["missing_key"]:  # This CREATES the key with an empty list!
    pass
# Use "key in d" to check without creating:
if "missing_key" in d:
    pass
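The side effect is easy to confirm by watching the length of the dict. In this sketch, .get() and the in operator leave the dict untouched, and only indexing inserts the key:

```python
from collections import defaultdict

d = defaultdict(list)

# .get() and "in" never call the default factory.
value = d.get("missing_key")
present = "missing_key" in d
size_before = len(d)

# Indexing does call it, and the new key stays in the dict.
_ = d["missing_key"]
size_after = len(d)

print(value, present, size_before, size_after)  # None False 0 1
```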
Using Counter with non-hashable items:
# Lists are not hashable:
Counter([[1, 2], [3, 4]]) # TypeError!
# Convert to tuples first:
Counter([(1, 2), (3, 4)]) # OK
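When the data arrives as nested lists, the conversion can happen inside a generator expression. The sketch below also shows frozenset as one option for sets, which are unhashable for the same reason. The sample data is made up:

```python
from collections import Counter

rows = [[1, 2], [3, 4], [1, 2]]

# Tuples are hashable as long as their elements are.
pair_counts = Counter(tuple(row) for row in rows)

# For order-insensitive items, frozenset is one option.
tag_sets = [{"a", "b"}, {"b", "a"}, {"c"}]
set_counts = Counter(frozenset(tags) for tags in tag_sets)

print(pair_counts[(1, 2)])                # 2
print(set_counts[frozenset({"a", "b"})])  # 2
```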