* first hack with test
* remove changes integration test
* Delete apps/core/tests/sdk_v1/integration/integration_test_send_message_v2.py
* add test
* remove comment
* stage and publish api
* deprecate base level response_schema
* add param to llm_config test
---------
Co-authored-by: Ari Webb <ari@letta.com>
* feat: add support for new model
* fix: just stage-api && just publish-api (anthropic model settings changed)
* fix: make kevlar have default reasoning on
* fix: bump anthropic sdk version
* fix: patch name
* pin newer version anthropic
---------
Co-authored-by: Ari Webb <ari@letta.com>
Root cause: When splitting failed embedding batches, mid=0 for single
items created empty chunks. These empty chunks were then processed,
creating hundreds of no-op tasks that consumed memory.
Crash pattern from logs:
- 600+ 'batch_size=0' embedding tasks created
- Memory spiked 531 MB → 4.9 GB
- Pod crashed
Fixes:
1. Skip empty chunks before creating tasks
2. Guard chunk splits to prevent empty slices (mid = max(1, len//2))
3. Break early if all chunks are empty
This prevents asyncio.gather() from creating thousands of empty
coroutines that exhaust memory.
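A minimal sketch of the three fixes above, not the actual patch. The names
split_chunk, embed_with_retry, embed_fn and max_rounds are hypothetical, and
embed_fn is assumed to return one vector per input text:

```python
import asyncio
from typing import Awaitable, Callable, List, Optional, Tuple

Vector = List[float]
EmbedFn = Callable[[List[str]], Awaitable[List[Vector]]]


def split_chunk(chunk: List[str]) -> List[List[str]]:
    """Split a failed chunk in two without ever returning an empty slice."""
    if len(chunk) <= 1:
        # A single item cannot be split further; an empty chunk yields nothing.
        return [chunk] if chunk else []
    mid = max(1, len(chunk) // 2)  # fix 2: guard against mid == 0
    return [half for half in (chunk[:mid], chunk[mid:]) if half]


async def embed_with_retry(
    texts: List[str], embed_fn: EmbedFn, max_rounds: int = 8
) -> List[Optional[Vector]]:
    """Embed texts, halving any chunk whose request fails (hypothetical helper)."""
    embeddings: List[Optional[Vector]] = [None] * len(texts)
    # Each pending entry is (start index in `texts`, chunk of texts).
    pending: List[Tuple[int, List[str]]] = [(0, list(texts))]

    for _ in range(max_rounds):
        # fix 1: skip empty chunks before creating any tasks
        pending = [(start, chunk) for start, chunk in pending if chunk]
        if not pending:
            break  # fix 3: nothing left to do, stop early

        outcomes = await asyncio.gather(
            *(embed_fn(chunk) for _, chunk in pending),
            return_exceptions=True,
        )

        next_pending: List[Tuple[int, List[str]]] = []
        for (start, chunk), outcome in zip(pending, outcomes):
            if isinstance(outcome, BaseException):
                if len(chunk) == 1:
                    raise outcome  # a single item failed; splitting cannot help
                offset = start
                for piece in split_chunk(chunk):
                    next_pending.append((offset, piece))
                    offset += len(piece)
            else:
                embeddings[start:start + len(chunk)] = outcome
        pending = next_pending

    if pending:
        raise RuntimeError("embedding did not converge within max_rounds")
    return embeddings
```

With the guard in place, a one-item chunk is never turned into a
'batch_size=0' task, so the number of coroutines passed to asyncio.gather()
stays bounded by the number of input texts.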
* fix(core): sanitize messages sent to anthropic in the main path the same way (or similarly) as we do in the token counter
* fix: also patch poison error in backend by filtering lazily
* fix: remap streaming errors
* fix: dedupe tool calls
* fix: cleanup, removed try/catch
* no bare raise in anthropic client. Also filter out messages with empty content
* use client logic
* replace empty/whitespace with period
* cursor bugbot
* first hack
* clean up
* first implementation working
* revert package-lock
* remove openai test
* error throw
* typo
* Update integration_test_send_message_v2.py
* refine test
* Only make changes for openai non streaming
* Add tests
---------
Co-authored-by: Ari Webb <ari@letta.com>
Co-authored-by: Matt Zhou <mattzh1314@gmail.com>
* refactor not to use warnings.warn
* temp circular import fix, maybe unnecessary/bad
* fix Deprecation warning
* fix deprecation warning and mcp thing?
* revert changes to mcp server test
* fix deprecation warning
* feat: add full responses api support in new agent loop
* update matrix in workflow
* relax check for reasoning messages for high effort gpt 5
* fix indent
* one more relax
* fix: patch failing summarizer tests for anthropic claude 3.5
* fix: carveout for gemini-2.5-flash because it doesn't do the send_message tool call
* fix: deprecate old gemini test now that model is unavailable
* fix: patch flash flakiness
* fix: relax the gemini 2.5 flash test
* fix: relax again
* fix: another flash fix
* fix: relax gpt-4o-mini
* fix: swap 4o-mini for 4.1
* fix: drop 4o-mini