
How Much Faster Are Python List Comprehensions Than For Loops — and When Does It Not Matter?

Author: bhnw · Released on 2026-04-25 10:26

List Comprehensions Are "Faster" — But Faster How, When, and When Does It Not Matter at All?

Search online and you'll find plenty of articles claiming list comprehensions are faster than for loops. Most of them run a quick timeit, show that comprehensions are 2x faster, say "use comprehensions," and call it a day.

But if you've actually written Python, you've probably noticed: switching to a list comprehension sometimes makes no measurable difference. That's not a mistake. There's a real reason for it.


The Conclusion First, Then the Why

  • Simple operations (multiply by 2, add 1, type conversion): list comprehensions are 20%-50% faster, sometimes approaching 2x
  • Complex operations (function calls, regex, I/O, database queries): basically no difference, maybe a few milliseconds
  • Very large data with memory pressure: list comprehensions are actually the problem — generator expressions are the right tool

Why Are List Comprehensions Faster for Simple Operations?

Look at the bytecode with the dis module and it becomes obvious:

import dis

def for_loop():
    result = []
    for i in range(10):
        result.append(i * 2)
    return result

def list_comp():
    return [i * 2 for i in range(10)]

dis.dis(for_loop)
dis.dis(list_comp)

In the for loop bytecode, every single iteration does:

  1. LOAD_FAST: load the result variable
  2. LOAD_ATTR (or LOAD_METHOD, depending on the Python version): look up the .append attribute — this is the expensive part
  3. CALL_FUNCTION (CALL in Python 3.11+): call append

The list comprehension instead uses a dedicated bytecode instruction, LIST_APPEND, which bypasses the attribute lookup and the call machinery and appends directly at the C level. Run 100,000 iterations, and you've saved 100,000 attribute lookups. That's where the gap comes from.
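You can see that the attribute lookup really is the cost by hoisting .append out of the loop, a classic middle-ground trick. A quick sketch (the timings are illustrative, not guaranteed):

```python
import timeit

data = list(range(1_000_000))

def loop_plain(data):
    result = []
    for x in data:
        result.append(x * 2)   # attribute lookup on every iteration
    return result

def loop_cached(data):
    result = []
    append = result.append     # look up .append once, outside the loop
    for x in data:
        append(x * 2)          # plain local-name lookup: cheaper
    return result

t_plain = timeit.timeit(lambda: loop_plain(data), number=20)
t_cached = timeit.timeit(lambda: loop_cached(data), number=20)
print(f"plain:  {t_plain:.3f}s")
print(f"cached: {t_cached:.3f}s")
```

The cached version typically recovers a good part of the gap to the comprehension, which is evidence that the lookup, not the loop itself, is what you're paying for.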

Here's a benchmark to see it:

import timeit

def for_loop(data):
    result = []
    for x in data:
        result.append(x * 2)
    return result

def list_comp(data):
    return [x * 2 for x in data]

data = list(range(1_000_000))

t1 = timeit.timeit(lambda: for_loop(data), number=50)
t2 = timeit.timeit(lambda: list_comp(data), number=50)

print(f"for loop:  {t1:.3f}s")
print(f"list comp: {t2:.3f}s")
print(f"speedup: {t1/t2:.2f}x")
# Typical output:
# for loop:  2.341s
# list comp: 1.487s
# speedup: 1.57x

Why Does It Not Matter for Complex Operations?

Simple: when the operation itself takes far longer than the loop overhead, the time saved on attribute lookups is noise.

import math
import timeit

def heavy_for(data):
    result = []
    for x in data:
        result.append(math.sqrt(x) ** 2.5 + math.log(x + 1))
    return result

def heavy_comp(data):
    return [math.sqrt(x) ** 2.5 + math.log(x + 1) for x in data]

data = list(range(1, 100_001))

t1 = timeit.timeit(lambda: heavy_for(data), number=20)
t2 = timeit.timeit(lambda: heavy_comp(data), number=20)

print(f"for loop:  {t1:.3f}s")
print(f"list comp: {t2:.3f}s")
# The gap is typically under 5% — effectively irrelevant

If the loop body involves network requests, database queries, or file I/O, the difference is even more irrelevant. Network latency runs in the tens of milliseconds. The overhead of .append doesn't register.
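To make that concrete, here is a sketch using time.sleep as a stand-in for a slow call (fake_request is a made-up name for illustration, not a real API):

```python
import time
import timeit

def fake_request(x):
    time.sleep(0.001)          # stand-in for ~1 ms of network latency
    return x * 2

items = list(range(100))

def with_loop():
    result = []
    for x in items:
        result.append(fake_request(x))
    return result

t_comp = timeit.timeit(lambda: [fake_request(x) for x in items], number=1)
t_for = timeit.timeit(with_loop, number=1)

# Both land around 0.1s: the sleep dominates, the loop overhead is invisible
print(f"comprehension: {t_comp:.3f}s")
print(f"for loop:      {t_for:.3f}s")
```

With 1 ms per call, each run spends roughly 100 ms sleeping; the microseconds saved on .append lookups are lost in the noise.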


Memory: The Hidden Cost of List Comprehensions

List comprehensions are eager — they build the entire result in memory the moment they run.

# This line immediately consumes hundreds of MB
big_list = [x * 2 for x in range(10_000_000)]

# A generator expression is lazy — almost no memory
big_gen = (x * 2 for x in range(10_000_000))

import sys
print(sys.getsizeof(big_list))  # ~80 MB — and that's just the pointer array, not the int objects
print(sys.getsizeof(big_gen))   # ~200 bytes
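Note that sys.getsizeof only measures the list object itself (its array of pointers), not the integers it references. For a fuller picture of what actually gets allocated, tracemalloc from the standard library works; a sketch:

```python
import tracemalloc

# Measure peak allocation while building the full list
tracemalloc.start()
big_list = [x * 2 for x in range(1_000_000)]
_, peak_list = tracemalloc.get_traced_memory()
tracemalloc.stop()

# Same measurement for the generator expression: nothing is materialized
tracemalloc.start()
big_gen = (x * 2 for x in range(1_000_000))
_, peak_gen = tracemalloc.get_traced_memory()
tracemalloc.stop()

print(f"list peak: {peak_list / 1024 / 1024:.1f} MB")
print(f"gen peak:  {peak_gen / 1024:.1f} KB")
```

The exact megabyte figures depend on the Python version and platform, but the gap between the two is always several orders of magnitude.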

If you only need to iterate over the result once — summing, finding the max, passing to another function — you don't need a list at all. Use a generator expression:

# No need to build a full list
total = sum(x * 2 for x in range(10_000_000))       # memory-friendly
maximum = max(x ** 2 for x in range(10_000_000))    # memory-friendly

# Instead of:
total = sum([x * 2 for x in range(10_000_000)])     # builds the full list first, wasteful

When Not to Use List Comprehensions

When the logic gets complex. Once a comprehension has more than two conditions or nested loops, readability collapses. A for loop is better here:

# Hard to read at a glance
result = [x for sublist in matrix for x in sublist if x > 0 if x % 2 == 0]

# A for loop at least allows comments
result = []
for sublist in matrix:
    for x in sublist:
        if x > 0 and x % 2 == 0:
            result.append(x)

When you need to break early. List comprehensions can't break. If you want the first matching element, use next() or a for loop:

data = range(1_000_000)

# Wrong: iterates all one million elements just to grab the first result
first = [x for x in data if x > 100][0]

# Right: stops at the first match
first = next(x for x in data if x > 100)
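One caveat with next(): if no element matches, it raises StopIteration. Passing a default as the second argument avoids that:

```python
data = range(1_000_000)

# next() with a default never raises StopIteration
first = next((x for x in data if x > 100), None)          # first match: 101
missing = next((x for x in data if x > 2_000_000), None)  # no match: None

print(first, missing)
```

Without the default, the no-match case would raise StopIteration, which is especially nasty if it happens inside another generator.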

When you need exception handling. You can't put try-except inside a comprehension:

def safe_int(s):
    try:
        return int(s)
    except ValueError:
        return None

data = ["1", "abc", "3", "xyz"]
# This works — exception handling is wrapped in a function
result = [safe_int(x) for x in data]

# But you can't write try-except directly inside the comprehension itself
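A common follow-up: if you'd rather drop the failed conversions than keep None placeholders, combine the helper with a filtering comprehension. A sketch, reusing the safe_int helper from above:

```python
def safe_int(s):
    try:
        return int(s)
    except ValueError:
        return None

data = ["1", "abc", "3", "xyz"]

# Inner generator converts; outer comprehension drops the failures.
# Each safe_int call runs only once because the inner part is a generator.
cleaned = [v for v in (safe_int(x) for x in data) if v is not None]
print(cleaned)  # [1, 3]
```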

Summary

Situation                                       | What to use
------------------------------------------------|----------------------------------
Simple transformation, result goes into a list  | List comprehension
Only need to iterate once (sum, max, etc.)      | Generator expression
Very large data, memory is a concern            | Generator expression
Complex operation (function calls, I/O)         | Either — prioritize readability
Complex logic that needs comments               | for loop
Need to break or continue                       | for loop
Need try-except                                 | for loop, or wrap in a function

To run these benchmarks yourself and see the numbers, paste the code into the Python Online Runner. Python 3.12 environment, timeit and dis both included.
