English Python Python, set() Hidetoshi Honda 2025年3月24日 0 コメント

Why Is Python’s set So Fast? Understanding the Difference from list and the Power of Hash Tables

Contents

1 Introduction

Introduction

When working with Python, checking whether a certain element exists in a collection is a common operation. For example:

if value in my_list:
    # do something

At a glance, this code seems simple and efficient. However, when working with large datasets, it can become very slow.

Surprisingly, just by switching from a list to a set, the execution time can drop from hundreds of seconds to just a few milliseconds.

In this article, we’ll explore why set can be so much faster, digging into the underlying data structures and illustrating the difference with a fun example from Dragon Ball.

List vs Set: The Basics

Feature	`list`	`set`
Data structure	Array	Hash table
Order maintained	Yes	No
Duplicates allowed	Yes	No
Lookup speed (avg)	O(n)	O(1)

Why is `list` slow?

Python’s list is essentially an array. When you write something like value in my_list, Python has to scan each element one by one until it finds a match — or reaches the end.

For example:

a = [1, 3, 5, 7, 9]
print(5 in a)  # → Python checks 1 → 3 → 5 (found!)

This linear search is fine for small lists. But as the number of elements increases (e.g. 100,000+), performance drops significantly.

Why is `set` fast?

Python’s set, on the other hand, is implemented using a hash table. This means each element is converted internally into a hash value, which acts like a “map” to its location in memory.

Here’s a simplified view:

"apple" → hash function → 42
Store "apple" in bucket #42
On lookup: "apple" → 42 → direct access!

Because it skips the full scan and jumps straight to the relevant location (on average), value in my_set is blazingly fast.

Experiment: Goku vs Bulma – A Lookup Showdown

To demonstrate the difference, let’s imagine a world where Goku searches for Dragon Balls manually (like a list), while Bulma uses the high-tech Dragon Radar (like a set).

import time
from random import randint

print("Starting Dragon Ball search experiment!\n")

# --- World Dragon Balls (Data Setup) ---
world_dragonballs = [randint(1, 10_000_000) for _ in range(100_000)]
radar_signals = [randint(1, 10_000_000) for _ in range(100_000)]

# --- Goku's Manual Search (list) ---
print('Goku: "Alright! I’ll search every single one!" (list)')

start = time.time()
found_list = sum(1 for signal in radar_signals if signal in world_dragonballs)
elapsed_list = time.time() - start

print(f"[List Search] Found: {found_list}, Time: {elapsed_list:.3f}s")

# --- Bulma's Radar (set) ---
print('\nBulma: "Goku, just use the Dragon Radar!" (set)')

dragon_radar = set(world_dragonballs)

start = time.time()
found_set = sum(1 for signal in radar_signals if signal in dragon_radar)
elapsed_set = time.time() - start

print(f"[Set Search] Found: {found_set}, Time: {elapsed_set:.3f}s")

# --- Result ---
print("\nResult:")
if elapsed_list > elapsed_set:
    print(f"Bulma’s Dragon Radar was {round(elapsed_list / elapsed_set)}x faster than Goku’s manual search!")
else:
    print("Unbelievable... Goku’s raw effort actually won this time?")

Sample Output:

Starting Dragon Ball search experiment!

Goku: "Alright! I’ll search every single one!" (list)
[List Search] Found: 987, Time: 83.714s

Bulma: "Goku, just use the Dragon Radar!" (set)
[Set Search] Found: 987, Time: 0.009s

Result:
Bulma’s Dragon Radar was 9155x faster than Goku’s manual search!

Summary

In Python, set is a powerful choice when:

You need to check existence quickly
You don’t care about the order of elements
You want to remove duplicates

Use list when:

You need to maintain order
You want to store duplicates
You’re doing operations that depend on element positions

Conclusion

Python’s set is more than just a container that rejects duplicates — it’s a high-performance tool thanks to its underlying hash table.

If you ever find yourself writing slow in checks over large lists, give set a try. You might just save thousands of milliseconds — or in Goku’s case, thousands of searches.

References

Python Official Documentation – set

Engineering Journal | Hidetoshi Honda

Why Is Python’s set So Fast? Understanding the Difference from list and the Power of Hash Tables

Introduction

List vs Set: The Basics

Why is `list` slow?

Why is `set` fast?

Experiment: Goku vs Bulma – A Lookup Showdown

Sample Output:

Summary

Conclusion

References

コメントを送信コメントをキャンセル

You May Have Missed

Introduction

List vs Set: The Basics

Why is list slow?

Why is set fast?

Experiment: Goku vs Bulma – A Lookup Showdown

Sample Output:

Summary

Conclusion

References

Understanding Python’s map() Function: Simple Yet Powerful

【Python】なぜ set は高速なのか？ 〜listとの違いとハッシュテーブルの仕組み〜

Related Posts

コメントを送信 コメントをキャンセル

You May Have Missed

Why is `list` slow?

Why is `set` fast?

【Python】なぜ set は高速なのか？〜listとの違いとハッシュテーブルの仕組み〜

コメントを送信コメントをキャンセル