
fix: bound TypeAdapter lru_cache to prevent memory leak in multi-threaded usage#2907

Open
giulio-leone wants to merge 1 commit into openai:main from giulio-leone:fix/responses-parse-memory-leak

Conversation

@giulio-leone

Summary

Fixes #2672

Problem

_CachedTypeAdapter uses lru_cache(maxsize=None) (unbounded), which causes a memory leak in multi-threaded environments. When responses.parse is called from multiple threads, generic types like ParsedResponseOutputMessage[MyClass] are created anew by pydantic in each thread, producing distinct cache keys. The cache therefore never hits and grows without bound.
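A minimal sketch of the failure mode, with `make_adapter` as a stand-in for the cached `TypeAdapter` factory (not the library's actual code) and `type(...)` simulating the distinct generic classes each thread materializes:

```python
from functools import lru_cache

@lru_cache(maxsize=None)  # unbounded, as in the original code
def make_adapter(tp):
    # stand-in for TypeAdapter(tp); the return value is irrelevant here
    return object()

# Simulate many callers each materializing "the same" generic type:
# every dynamically created class is a distinct object with a distinct hash,
# so every call is a cache miss and a new permanent entry.
for _ in range(1000):
    generated = type("ParsedResponseOutputMessage_MyClass", (), {})
    make_adapter(generated)

info = make_adapter.cache_info()
# info.hits == 0, info.currsize == 1000: the cache only ever grows
```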

Root Cause

In src/openai/_models.py line 802:

_CachedTypeAdapter = cast('TypeAdapter[object]', lru_cache(maxsize=None)(_TypeAdapter))

Python's typing machinery creates new generic type objects in different threads, so ParsedResponseOutputMessage[Fact] in thread A has a different identity than ParsedResponseOutputMessage[Fact] in thread B. Since lru_cache keys on the argument's hash and equality, and class objects hash by identity by default, each thread creates a new cache entry.
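The identity-keying behavior can be demonstrated in isolation (a sketch, not the library's code): two classes that are "the same" by name are still distinct cache keys.

```python
from functools import lru_cache

calls = []

@lru_cache(maxsize=None)
def cached(tp):
    calls.append(tp)
    return len(calls)  # counts how many distinct entries were created

A = type("Same", (), {})
B = type("Same", (), {})   # same name and shape, different object

assert cached(A) == 1      # first call: miss, entry created
assert cached(A) == 1      # identical object: cache hit
assert cached(B) == 2      # equal-looking but distinct class: miss, new entry
```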

Fix

Set maxsize=4096 to cap memory usage while still providing effective caching for typical workloads. This is standard practice with lru_cache: the LRU eviction policy keeps the most recently used types cached while preventing unbounded growth.
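A sketch of the bounded behavior, using a deliberately tiny maxsize for illustration (the PR uses 4096; `make_adapter` is a stand-in, not the library's code):

```python
from functools import lru_cache

@lru_cache(maxsize=4)  # small bound for illustration only
def make_adapter(tp):
    return object()  # stand-in for TypeAdapter(tp)

# 100 distinct generated classes, as in the multi-threaded scenario
for i in range(100):
    make_adapter(type(f"Generated{i}", (), {}))

info = make_adapter.cache_info()
# currsize stays at 4: LRU eviction discards the oldest entries,
# so memory is capped regardless of how many types are seen
```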

Testing

Verified that the cache is bounded:

from openai._models import _CachedTypeAdapter

info = _CachedTypeAdapter.cache_info()
assert info.maxsize == 4096
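A multi-threaded stress sketch (again with `make_adapter` as a hypothetical stand-in for the cached factory, not the library's code) can confirm the cap holds under the concurrent conditions described in the issue; lru_cache is itself thread-safe:

```python
import threading
from functools import lru_cache

@lru_cache(maxsize=4096)
def make_adapter(tp):
    return object()  # stand-in for TypeAdapter(tp)

def worker() -> None:
    # each thread materializes 2000 distinct generated classes
    for i in range(2000):
        make_adapter(type(f"Generated{i}", (), {}))

threads = [threading.Thread(target=worker) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# 8000 distinct keys were seen, but the cache never exceeds its bound
assert make_adapter.cache_info().currsize == 4096
```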

fix: bound TypeAdapter lru_cache to prevent memory leak in multi-threaded usage

The unbounded lru_cache on TypeAdapter causes a memory leak in
multi-threaded environments. Generic types like
ParsedResponseOutputMessage[MyClass] create new type objects in each
thread, preventing cache hits and growing the cache without bound.

Set maxsize=4096 to cap memory usage while still providing effective
caching for typical workloads.

Fixes openai#2672
@giulio-leone giulio-leone requested a review from a team as a code owner February 28, 2026 16:56
@giulio-leone
Author

Friendly ping — CI is green and this is ready for review. Happy to address any feedback. Thanks!



Successfully merging this pull request may close these issues.

Unrestricted caching keyed by generated types causes memory leak in multi-threaded regimes
