Problem
Backend engineers make decisions every day that implicitly involve complexity analysis: "Should I fetch all users and filter in code, or add a WHERE clause?" "Should I sort in the application or the database?" Without formal complexity analysis, these decisions are guesses.
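The first question can be sketched both ways (the table name, schema, and `status` filter here are hypothetical, and an in-memory sqlite3 database is used so the sketch is self-contained):

```python
import sqlite3

# Hypothetical schema: a users table with a status column.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, status TEXT)")
conn.executemany("INSERT INTO users (status) VALUES (?)",
                 [("active" if i % 2 == 0 else "banned",) for i in range(1000)])

# Option 1: fetch all users, filter in application code.
# Transfers every row to the application, then does an O(N) scan in Python.
all_rows = conn.execute("SELECT id, status FROM users").fetchall()
active_app = [row for row in all_rows if row[1] == "active"]

# Option 2: push the filter into the database with a WHERE clause.
# The database does the filtering (O(log N) per lookup with an index on status).
active_db = conn.execute(
    "SELECT id, status FROM users WHERE status = ?", ("active",)
).fetchall()

assert active_app == active_db  # same result, very different cost at scale
```

At N = 1,000 both options feel identical; the rest of this section is about why they stop being identical.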
Why It Matters
O(N) filtering on 100 rows: 100 comparisons ≈ 0.1ms
O(N) filtering on 10M rows: 10,000,000 comparisons ≈ 10,000ms (10 seconds)
O(log N) index lookup on 10M rows: ~23 comparisons ≈ 0.023ms
The difference between O(N) and O(log N) is the difference between "works" and "doesn't work" at scale.
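The gap is easy to demonstrate in comparison counts (a sketch using Python's bisect as a stand-in for a B-tree probe):

```python
import bisect
import math

n = 10_000_000
data = range(n)          # sorted sequence, stands in for an indexed column
target = n - 1           # worst case for a linear scan

# O(N): a linear scan touches every element in the worst case.
linear_comparisons = next(i for i, x in enumerate(data) if x == target) + 1

# O(log N): binary search halves the remaining range each step.
index = bisect.bisect_left(data, target)      # works on any sorted sequence
log_comparisons = math.ceil(math.log2(n))     # upper bound on steps

print(linear_comparisons)   # 10000000
print(log_comparisons)      # 24
```

Ten million comparisons versus about two dozen, on the same data.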
Mental Model
Big-O describes how the number of operations grows as input size N grows, ignoring constant factors.
O(1): 1 operation regardless of N → hash map lookup
O(log N): work doubles when N is squared → binary search, B-tree
O(N): work scales linearly with N → linear scan
O(N log N): N × log N → merge sort, index build
O(N²): work quadruples when N doubles → nested loops, naive JOIN
O(2^N): doubles with each added element → brute-force combinatorics
Backend-Specific Examples
Operation                                Complexity    Why
─────────────────────────────────────────────────────────────────
Hash map (dict) lookup                   O(1)          Direct memory address
N+1 query loop                           O(N) queries  One query per row
LIMIT/OFFSET pagination (page 1000)      O(OFFSET)     DB scans skipped rows
Cursor-based pagination                  O(1)          WHERE id > cursor
Unindexed search on table of N rows      O(N)          Full table scan
Indexed search on table of N rows        O(log N)      B-tree traversal
JOIN on N×M rows without index           O(N×M)        Nested loop join
JOIN on N×M rows with index              O(N log M)    Index lookup per left row
Sorting N items                          O(N log N)    Best possible comparison sort
Building an inverted index on N words    O(N)          One pass
Redis LRANGE on list of N items          O(S+N)        S=start offset, N=returned
Redis KEYS pattern match on N keys       O(N)          Full keyspace scan
Complexity Analysis
Space Complexity
Space complexity measures memory usage as a function of input:
def n_plus_one(users):             # O(1) extra space (one batch in memory at a time)
    for user in users:
        posts = db.query(user.id)  # fetched, used, discarded
        display(posts)

def load_all_in_memory(users):     # O(N×M) space: stores all posts in memory
    return {user.id: db.query(user.id) for user in users}

Amortized Complexity
Some operations are occasionally expensive but cheap on average. Python's list.append():
# list.append() is O(1) amortized
# When the list is full, it doubles in size: an O(N) copy
# But doublings happen at sizes 1, 2, 4, 8, ... so total copies for N appends: O(N)
# Amortized per-operation: O(N)/N = O(1)

Redis LPUSH is O(1). Building an N-element sorted set with ZADD N times is O(N log N).
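The resize behavior is observable by watching sys.getsizeof jump (a sketch; CPython actually over-allocates by roughly 1/8 rather than a strict doubling, but the amortized argument is identical):

```python
import sys

lst = []
resizes = 0
last_size = sys.getsizeof(lst)

for i in range(100_000):
    lst.append(i)
    size = sys.getsizeof(lst)
    if size != last_size:      # capacity grew: this append paid the O(N) copy
        resizes += 1
        last_size = size

# Resizes grow only logarithmically with N, so the rare expensive copies
# average out and each append costs O(1) amortized.
print(resizes)
```

Run it: out of 100,000 appends, only a tiny fraction trigger a copy.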
When to Worry About Complexity
N < 1,000: Algorithm choice rarely matters. Optimize for readability.
N = 10,000: O(N²) becomes painful (100M operations). Audit hot loops.
N = 100,000: O(N log N) is fine (1.7M operations). O(N²) is broken.
N = 10M: O(N) requires care (10M operations). O(log N) preferred.
N = 1B: O(N) is architecture-level work. Only O(log N) or O(1) is safe.
Benchmark
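A quick harness to check these orders of magnitude on your own machine (a sketch; absolute timings vary by hardware, and the O(N²) case is deliberately omitted):

```python
import bisect
import random
import timeit

n = 1_000_000
data = list(range(n))      # sorted list: stands in for an indexed column
lookup = set(data)         # hash table: stands in for a dict/cache
shuffled = data[:]
random.shuffle(shuffled)   # unsorted input for the sort benchmark

# O(1): hash lookup
t_hash = timeit.timeit(lambda: n - 1 in lookup, number=1_000) / 1_000

# O(log N): binary search (a stand-in for a B-tree probe)
t_tree = timeit.timeit(lambda: bisect.bisect_left(data, n - 1), number=1_000) / 1_000

# O(N): linear scan, worst case (target at the end)
t_scan = timeit.timeit(lambda: data.index(n - 1), number=5) / 5

# O(N log N): comparison sort
t_sort = timeit.timeit(lambda: sorted(shuffled), number=3) / 3

for name, t in (("O(1)", t_hash), ("O(log N)", t_tree),
                ("O(N)", t_scan), ("O(N log N)", t_sort)):
    print(f"{name:>10}: {t * 1000:.4f} ms")
```

The ordering of the results is stable across machines even when the absolute numbers are not.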
Operation on N=1,000,000 items:
O(1) hash lookup: 0.0001ms (1 operation)
O(log N) B-tree lookup: 0.002ms (20 operations)
O(N) linear scan: 100ms (1M operations @ 100ns each)
O(N log N) sort: 2,000ms (20M operations)
O(N²) naive distinct: 100,000,000ms ≈ 28 hours (1T operations: never do this)
Failure Modes
Accidental O(N²) in loops:
# O(N²): rebuilds the banned-ID list on every iteration
for user in users:                          # O(N)
    if user.id in [u.id for u in banned]:   # O(N) per iteration
        block(user)

# Fix: convert to O(1) lookup first
banned_ids = {u.id for u in banned}         # O(N) once
for user in users:                          # O(N)
    if user.id in banned_ids:               # O(1) per iteration
        block(user)

Key Takeaways
- O(1) = constant time. O(log N) = index scan. O(N) = full scan. O(N²) = always fix this.
- N+1 queries are O(N) queries: the algorithm is wrong, not just slow.
- Always analyze the bottleneck operation, not just the total function.
- Amortized complexity matters: Python list append is O(1) amortized, not O(N).
- For N > 10,000, algorithm choice dominates hardware choice.