Swap the order of the @trace_method and @raise_on_invalid_id decorators
across all service managers so that @trace_method is always the innermost
decorator (positioned directly above the method) and is therefore the first
wrapper applied to the function. At call time the outer @raise_on_invalid_id
wrapper runs first, so ID validation happens before tracing begins, which is
the intended execution order.
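The ordering can be illustrated with a minimal sketch. Both decorators below are simplified, hypothetical stand-ins written for this example (the real @trace_method and @raise_on_invalid_id have different signatures and wrap async methods); the `events` list and `get_agent` function are also illustrative only.

```python
import functools

events = []  # records the order in which each wrapper runs


def trace_method(func):
    """Hypothetical stand-in for the real tracing decorator."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        events.append("trace:start")
        try:
            return func(*args, **kwargs)
        finally:
            events.append("trace:end")
    return wrapper


def raise_on_invalid_id(func):
    """Hypothetical stand-in: rejects IDs that lack an 'agent-' prefix."""
    @functools.wraps(func)
    def wrapper(agent_id, *args, **kwargs):
        events.append("validate")
        if not agent_id.startswith("agent-"):
            raise ValueError(f"invalid id: {agent_id!r}")
        return func(agent_id, *args, **kwargs)
    return wrapper


@raise_on_invalid_id  # outer wrapper: runs first at call time
@trace_method         # directly above the method: applied first
def get_agent(agent_id):
    events.append("body")
    return agent_id
```

With this order, a valid ID passes validation and is then traced, while an invalid ID raises before any tracing starts.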
Files modified:
- agent_manager.py (23 occurrences)
- archive_manager.py (11 occurrences)
- block_manager.py (7 occurrences)
- file_manager.py (6 occurrences)
- group_manager.py (9 occurrences)
- identity_manager.py (10 occurrences)
- job_manager.py (7 occurrences)
- message_manager.py (2 occurrences)
- provider_manager.py (3 occurrences)
- sandbox_config_manager.py (7 occurrences)
- source_manager.py (5 occurrences)
- step_manager.py (13 occurrences)
Problem: When listing files with status checking enabled, the code used
asyncio.gather to check and update the status of all files concurrently. Each
status check may update the file in the database (e.g., for timeouts or
embedding completion), so listing N files can open up to N concurrent
database connections.
Example: Listing 100 files with status checking can create up to 100
simultaneous database update operations, exhausting the connection pool.
Root cause: asyncio.gather(*[check_and_update_file_status(f) for f in files])
processes all files concurrently, each potentially creating DB updates.
Solution: Check and update file status sequentially instead of concurrently.
While this is slower, it prevents database connection pool exhaustion when
listing many files.
Changes:
- apps/core/letta/services/file_manager.py:
  - Replaced asyncio.gather with sequential for loop
  - Added explanatory comment about db pool exhaustion prevention
Impact: With 100 files:
- Before: Up to 100 concurrent DB connections (pool exhaustion)
- After: 1 DB connection at a time (no pool exhaustion)
Note: This follows the same pattern as PRs #6617 and #6619, which fixed
similar issues in file attachment and multi-agent tool execution.
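The before/after behavior described above can be sketched as follows. This is a simplified illustration, not the code in file_manager.py: `FakePool`, `list_concurrently`, `list_sequentially`, and `peak_connections` are hypothetical names, and the stand-in `check_and_update_file_status` always writes, whereas the real check only writes in some cases.

```python
import asyncio


class FakePool:
    """Hypothetical stand-in for a DB connection pool; tracks peak usage."""

    def __init__(self):
        self.active = 0
        self.peak = 0

    async def update(self, file_id):
        self.active += 1
        self.peak = max(self.peak, self.active)
        await asyncio.sleep(0)  # yield control, as a real DB round trip would
        self.active -= 1
        return file_id


async def check_and_update_file_status(pool, file_id):
    # Simplified assumption: every status check performs one DB update.
    return await pool.update(file_id)


async def list_concurrently(pool, file_ids):
    # Before: every file's status check is in flight at the same time.
    return await asyncio.gather(
        *[check_and_update_file_status(pool, f) for f in file_ids]
    )


async def list_sequentially(pool, file_ids):
    # After: one status check (and at most one DB update) at a time.
    results = []
    for f in file_ids:
        results.append(await check_and_update_file_status(pool, f))
    return results


def peak_connections(lister, n=100):
    """Run a lister over n fake files and report peak simultaneous updates."""
    pool = FakePool()
    asyncio.run(lister(pool, list(range(n))))
    return pool.peak
```

Calling `peak_connections(list_concurrently)` and `peak_connections(list_sequentially)` compares the two approaches on 100 fake files; the sequential version trades latency for a bounded number of in-flight updates.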
* claude coded first pass
* fix test cases to expect errors instead
* fix this
* let's see how letta-code did
* claude
* fix tests, remove dangling comments, retrofit all managers functions with decorator
* revert to main for these since we are not erroring on invalid tool and block ids
* reorder decorators
* finish refactoring test cases
* reorder agent_manager decorators and fix test tool manager
* add decorator on missing managers
* fix id sources
* remove redundant check
* uses enum now
* move to enum
* remove apps/core and apps/fern
* fix precommit
* add submodule updates in workflows
* submodule
* remove core tests
* update core revision
* Add submodules: true to all GitHub workflows
  - Ensure all workflows can access git submodules
  - Add submodules support to deployment, test, and CI workflows
  - Fix YAML syntax issues in workflow files
🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>
* remove core-lint
* upgrade core with latest main of oss
---------
Co-authored-by: Claude <noreply@anthropic.com>
* base requirements
* autofix
* Configure ruff for Python linting and formatting
  - Set up minimal ruff configuration with basic checks (E, W, F, I)
  - Add temporary ignores for common issues during migration
  - Configure pre-commit hooks to use ruff with pass_filenames
  - This enables gradual migration from black to ruff
* Delete sdj
* autofixed only
* migrate lint action
* more autofixed
* more fixes
* change precommit
* try changing the hook
* try this stuff