fix: bound TypeAdapter lru_cache to prevent memory leak in multi-threaded usage #2907
Open
giulio-leone wants to merge 1 commit into openai:main
Conversation
fix: bound TypeAdapter lru_cache to prevent memory leak in multi-threaded usage

The unbounded lru_cache on TypeAdapter causes a memory leak in multi-threaded environments. Generic types like ParsedResponseOutputMessage[MyClass] create new type objects in each thread, preventing cache hits and growing the cache without bound. Set maxsize=4096 to cap memory usage while still providing effective caching for typical workloads.

Fixes openai#2672
Author
Friendly ping — CI is green and this is ready for review. Happy to address any feedback. Thanks!
Summary
Fixes #2672
Problem
`_CachedTypeAdapter` uses `lru_cache(maxsize=None)` (unbounded), which causes a memory leak in multi-threaded environments. When `responses.parse` is called from multiple threads, generic types like `ParsedResponseOutputMessage[MyClass]` are created anew by pydantic in each thread, generating distinct cache keys. This means the cache never hits and grows without bound.
Root Cause
In `src/openai/_models.py` line 802: Python's `typing` module creates new generic type objects in different threads, so `ParsedResponseOutputMessage[Fact]` in thread A has a different identity than `ParsedResponseOutputMessage[Fact]` in thread B. Since `lru_cache` keys on the argument's hash and equality (which for class objects amounts to identity), each thread creates a new cache entry.
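The mechanism can be shown without the client. Below is a minimal sketch, using a hypothetical `get_adapter` helper as a stand-in for `_CachedTypeAdapter`; every distinct type object becomes a new cache entry:

```python
from functools import lru_cache


@lru_cache(maxsize=None)  # unbounded, like the current _CachedTypeAdapter
def get_adapter(type_: type) -> str:
    # Stand-in for constructing a pydantic TypeAdapter for ``type_``.
    return f"adapter for {type_.__name__}"


# Each type(...) call yields a distinct class object, mimicking how
# ParsedResponseOutputMessage[Fact] can differ between threads.
for _ in range(10_000):
    get_adapter(type("Fact", (), {}))

# One cache entry per distinct type: the cache has grown to 10,000.
print(get_adapter.cache_info().currsize)
```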
Fix
Set `maxsize=4096` to cap memory usage while still providing effective caching for typical workloads. This is standard practice for `lru_cache`: the LRU eviction policy ensures the most-used types stay cached while preventing unbounded growth.
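The edit itself is a one-line change to the decorator. A sketch of its shape, assuming the cached helper in `src/openai/_models.py` looks roughly like the following (name and signature are illustrative, not the exact source):

```python
from functools import lru_cache

from pydantic import TypeAdapter


@lru_cache(maxsize=4096)  # was maxsize=None (unbounded)
def _cached_type_adapter(type_: type) -> TypeAdapter:
    # LRU eviction keeps frequently used types cached while cold
    # entries are dropped, so memory usage is capped.
    return TypeAdapter(type_)
```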
Testing
Verified that the cache is bounded:
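A sketch of that check, continuing from the illustrative `_cached_type_adapter` above: feed the cache more distinct types than `maxsize` allows and confirm `currsize` stays capped.

```python
from pydantic import create_model

# create_model returns a distinct model class on every call, so each
# iteration presents a brand-new cache key, just as distinct generic
# type objects do across threads.
for i in range(10_000):
    _cached_type_adapter(create_model(f"Fact{i}"))

info = _cached_type_adapter.cache_info()
assert info.currsize <= 4096, info
print(info)  # currsize stays at 4096 despite 10,000 distinct types
```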