Data Structures — Lists, Tuples, Dicts, Sets & Comprehensions
How Python Stores Collections
Every time you store a group of values, you make a performance and semantic decision. Python's four built-in containers — list, tuple, dict, and set — each have distinct internal implementations, time complexities, and intended purposes.
Choosing the wrong container does not just slow your code — it obscures your intent and leads to bugs. This lesson gives you a complete mental model of each.
Lists — Dynamic Arrays of References
A list is a mutable, ordered, dynamically resized array of object references. CPython stores an array of pointers to PyObject structs. This means:
- Index access → O(1) (direct pointer arithmetic)
- Append → O(1) amortised (over-allocates to avoid frequent resizing)
- Insert at front → O(n) (all pointers shift)
- Search (
in) → O(n) (linear scan)
Creating and Indexing
Every Mutation Method — Explained and Demonstrated
Sorting — Deep Dive with key=
Python uses Timsort — O(n log n) worst case, O(n) on already-sorted data, and stable (equal elements keep their original relative order).
List as Stack and Queue — When to Use Each
Shallow vs Deep Copy
Tuples — Immutable, Hashable Sequences
A tuple is an immutable, ordered sequence. Because it cannot change, Python stores it more compactly than a list and it can be used as a dictionary key or set element.
Named Tuples — Self-Documenting Records
Dictionaries — Hash Tables with Ordering
A dict is an ordered (Python 3.7+), mutable hash table. Keys are hashed to find a bucket — O(1) average for get/set/delete.
Key requirements: must be hashable — immutable types work (int, str, float, tuple of hashables, frozenset). Lists and dicts cannot be keys.
All Dictionary Methods
Dictionary Comprehensions
Sets — O(1) Membership Testing
A set is a mutable, unordered collection of unique hashable objects — a hash table with values but no keys. The primary reason to use a set over a list is O(1) membership testing.
Set Mathematics — Real Analytics
Comprehensions — Every Type Explained
When to Use Each Data Structure
| Need | Best choice | Why |
|------|-------------|-----|
| Ordered sequence, random access | list | O(1) index |
| Queue (FIFO) | deque | O(1) popleft |
| Stack (LIFO) | list | O(1) pop |
| Key-value store | dict | O(1) lookup |
| Membership testing (repeated) | set | O(1) in |
| Immutable record / dict key | tuple | hashable |
| Frequency counting | Counter | built-in arithmetic |
| Grouping without "if key in" | defaultdict | auto-initialise |
Project: Contact Management System
Exercises
Exercise 1 — Anagram Detector
Exercise 2 — Two-Sum (O(n) with dict)
Exercise 3 — LRU Cache
Exercise 4 — BFS Shortest Path (Graph as dict of sets)
Exercise 5 — Word Frequency Pipeline
Exercise 6 — Social Network with Set Operations
Exercise 7 — Sliding Window Maximum
Exercise 8 — Sparse Matrix as dict of dicts
Key Takeaways
- Lists are O(1) at the tail and by index — use
dequefor O(1) operations at both ends - Tuples signal immutability and are hashable — use them as dict keys and to represent fixed records
- Dict lookups, inserts, and deletes are O(1) average — always prefer a dict over a list for key-based access
- Sets give O(1) membership testing — use them whenever you repeatedly ask "is X in collection?"
defaultdicteliminates "if key not in dict" boilerplate for grouping, counting, and graph adjacencyCountersupports frequency counting, arithmetic on counts, andmost_common(n)- Comprehensions are faster than for-loops for building collections; they express intent clearly
- Generator expressions produce values lazily — prefer them when you need to iterate only once
- The
key=parameter insorted()makes multi-key sorting trivial with no auxiliary data structures - Choose your container by asking: "What operation will I do most often?" — then pick the O(1) one