letta-server

Author	SHA1	Message	Date
jnjpng	eb4a0daabd	fix: allow explicit null for max_tokens on GPT-5 models (#9562 ) The Pydantic validator `set_model_specific_defaults` was checking `values.get("max_tokens") is None`, which matched both "field not provided" and "field explicitly set to null". This meant users could not disable the max output tokens limit for GPT-5/GPT-4.1 models - the validator would always override null with a default value during request deserialization. Changed to `"max_tokens" not in values` so that an explicit `max_tokens: null` is preserved while still applying defaults when the field is omitted entirely.	2026-02-24 10:55:12 -08:00
jnjpng	828c89c76f	fix: populate max_tokens when listing LLM models (#9559 ) list_llm_models_async was constructing LLMConfig without max_tokens, causing the GET /models/ endpoint to return null for max_tokens. Now calls typed_provider.get_default_max_output_tokens() for both base and BYOK provider paths, matching get_llm_config_from_handle.	2026-02-24 10:55:12 -08:00
Kevin Lin	bd5b5fa9f3	feat(gemini): add 3.1 pro preview support (#9553 ) Add 3.1 model metadata for Google AI and update Gemini tests/examples to use the new handle. 👾 Generated with [Letta Code](https://letta.com) Co-authored-by: Letta <noreply@letta.com>	2026-02-24 10:55:11 -08:00
cthomas	e2ad8762fe	fix: gemini streaming bug (#9555 )	2026-02-24 10:55:11 -08:00
cthomas	8ffc515674	fix: flip parallel_tool_calls setting default (#9541 )	2026-02-24 10:55:11 -08:00
cthomas	3cdd64dc24	chore: update keepalive interval 50->20 (#9538 ) * chore: update keepalive interval 50->20 * update comment	2026-02-24 10:55:11 -08:00
Kian Jones	8f56527958	perf(memfs): delta upload — only push new git objects after commit (#9548 ) perf(memfs): delta upload — only push new/modified git objects after commit Instead of re-uploading the entire .git/ directory after every commit, snapshot file mtimes before the commit and only upload files that are new or changed. A typical single-block update creates ~5 new objects (blob, trees, commit, ref) vs re-uploading all ~30. Full _upload_repo retained for create_repo and other paths that need it. 🤖 Generated with [Letta Code](https://letta.com) Co-authored-by: Letta <noreply@letta.com>	2026-02-24 10:55:11 -08:00
Charles Packer	044241daec	fix(core): include effort in AnthropicModelSettings returned by _to_model_settings() (#9543 ) LlmConfig._to_model_settings() for Anthropic built an AnthropicModelSettings object without passing effort=self.effort, so GET /agents/{id} never returned the effort field in model_settings even when it was stored on the agent. The Letta Code CLI derives the reasoning tier displayed in the status bar from model_settings.effort (canonical source), so the footer always showed e.g. "Sonnet 4.6" instead of "Sonnet 4.6 (high)" after a model switch. 👾 Generated with [Letta Code](https://letta.com) Co-authored-by: Letta <noreply@letta.com>	2026-02-24 10:55:11 -08:00
Kian Jones	e65795b5f1	fix(core): handle None message_ids in context window calculator (#9330 ) * fix(core): always create system message even with _init_with_no_messages When _init_with_no_messages=True (used by agent import flows), the agent was created with message_ids=None. If subsequent message initialization failed, this left orphaned agents that crash when context window is calculated (TypeError on message_ids[1:]). Now the system message is always generated and persisted, even when skipping the rest of the initial message sequence. This ensures every agent has at least message_ids=[system_message_id]. Fixes Datadog issue 773a24ea-eeb3-11f0-8f9f-da7ad0900000 🐾 Generated with [Letta Code](https://letta.com) Co-Authored-By: Letta <noreply@letta.com> * fix(core): clean up placeholder messages during import and add test Delete placeholder system messages after imported messages are successfully created (not before), so agents retain their safety-net system message if import fails. Also adds a test verifying that _init_with_no_messages=True still produces a valid context window. 🐾 Generated with [Letta Code](https://letta.com) Co-Authored-By: Letta <noreply@letta.com> * fix(core): add descriptive error for empty message_ids in get_system_message 🐾 Generated with [Letta Code](https://letta.com) Co-Authored-By: Letta <noreply@letta.com> --------- Co-authored-by: Letta <noreply@letta.com>	2026-02-24 10:55:11 -08:00
jnjpng	e8d5922ff9	fix(core): handle ResponseIncompleteEvent in OpenAI Responses API streaming (#9535 ) * fix(core): handle ResponseIncompleteEvent in OpenAI Responses API streaming When reasoning models (gpt-5.x) exhaust their max_output_tokens budget on chain-of-thought reasoning, OpenAI emits a ResponseIncompleteEvent instead of ResponseCompletedEvent. This was previously unhandled, causing final_response to remain None — which meant get_content() and get_tool_call_objects() returned empty results, silently dropping the partial response. Now ResponseIncompleteEvent is handled identically to ResponseCompletedEvent (extracting partial content, usage stats, and token details), with an additional warning log indicating the incomplete reason. * fix(core): propagate finish_reason for Responses API incomplete events - Guard usage extraction against None usage payload in ResponseIncompleteEvent handler - Add _finish_reason override to LettaLLMAdapter so streaming adapters can explicitly set finish_reason without a chat_completions_response - Map incomplete_details.reason="max_output_tokens" to finish_reason="length" in SimpleLLMStreamAdapter, matching the Chat Completions API convention - This allows the agent loop's _decide_continuation to correctly return stop_reason="max_tokens_exceeded" instead of "end_turn" when the model exhausts its output token budget on reasoning * fix(core): handle empty content parts in incomplete ResponseOutputMessage When a model hits max_output_tokens after starting a ResponseOutputMessage but before producing any content parts, the message has content=[]. This previously raised ValueError("Got 0 content parts, expected 1"). Now it logs a warning and skips the empty message, allowing reasoning-only incomplete responses to be processed cleanly. * fix(core): map all incomplete reasons to finish_reason, not just max_output_tokens Handle content_filter and any future unknown incomplete reasons from the Responses API instead of silently leaving finish_reason as None.	2026-02-24 10:55:11 -08:00
Ari Webb	5896e5d023	fix: logging for credit verification step (#9514 )	2026-02-24 10:55:11 -08:00
cthomas	3651658ea7	fix: tool call streaming using deprecated field (#9517 )	2026-02-24 10:55:11 -08:00
Ari Webb	21765d16c9	fix(core): add OpenAI 24h prompt cache retention for supported models (#9509 ) * fix(core): add OpenAI prompt cache key and model-gated 24h retention (#9492) * fix(core): apply OpenAI prompt cache settings to request payloads Set prompt_cache_key using agent and conversation context on both Responses and Chat Completions request builders, and enable 24h retention only for supported OpenAI models while excluding OpenRouter paths. 👾 Generated with [Letta Code](https://letta.com) Co-Authored-By: Letta <noreply@letta.com> * fix(core): prefix prompt cache key with letta tag Add a `letta:` prefix to generated OpenAI prompt_cache_key values so cache-related entries are easier to identify in provider-side logs and diagnostics. 👾 Generated with [Letta Code](https://letta.com) Co-Authored-By: Letta <noreply@letta.com> * add integration test * skip test --------- Co-authored-by: Letta <noreply@letta.com> Co-authored-by: Ari Webb <ari@letta.com> * fix(core): only set prompt_cache_retention, drop prompt_cache_key Two issues with the original prompt_cache_key approach: 1. Key exceeded 64-char max (agent-<uuid>:conv-<uuid> = 90 chars) 2. Setting an explicit key disrupted OpenAI's default prefix-hash routing, dropping cache hit rates from 40-45% to 10-13% OpenAI's default routing (hash of first ~256 tokens) already provides good cache affinity since each agent has a unique system prompt. We only need prompt_cache_retention="24h" for extended retention. Also fixes: - Operator precedence bug in _supports_extended_prompt_cache_retention - Removes incorrect gpt-5.2-codex exclusion (it IS supported per docs) 🐾 Generated with [Letta Code](https://letta.com) Co-Authored-By: Letta <noreply@letta.com> --------- Co-authored-by: Charles Packer <packercharles@gmail.com> Co-authored-by: Letta <noreply@letta.com>	2026-02-24 10:55:11 -08:00
jnjpng	042c9c36af	fix(core): add warning log for streaming chunks missing id or otid (#9513 ) Adds a diagnostic log at the streaming chokepoint in LettaAgentV3.stream() to detect when any LettaMessage chunk is yielded without an id or otid field. This helps trace the root cause of client-side id/otid inconsistencies.	2026-02-24 10:55:11 -08:00
Shubham Naik	3247fa7065	chore: fix favoriting bugs (#9505 ) * chore: fix favoriting bugs * chore: fix favoriting bugs * chore: fix favoriting bugs	2026-02-24 10:55:11 -08:00
Shubham Naik	4a829123cd	Listener mode (#9486 ) * feat: listener mode * feat: listener mode * feat: listener mode * feat: listener mode * feat: listener mode * feat: listener mode * chore: merge * feat: listen mode * feat: add some keycontrols * feat: add some keycontrols * chore: hotwire fix for core * chore: restore connection	2026-02-24 10:55:11 -08:00
Kian Jones	f5c4ab50f4	chore: add ty + pre-commit hook and repeal even more ruff rules (#9504 ) * auto fixes * auto fix pt2 and transitive deps and undefined var checking locals() * manual fixes (ignored or letta-code fixed) * fix circular import * remove all ignores, add FastAPI rules and Ruff rules * add ty and precommit * ruff stuff * ty check fixes * ty check fixes pt 2 * error on invalid	2026-02-24 10:55:11 -08:00
Devansh Jain	39ddda81cc	feat: add Anthropic Sonnet 4.6 (#9408 )	2026-02-24 10:55:11 -08:00
Kian Jones	25d54dd896	chore: enable F821, F401, W293 (#9503 ) * auto fixes * auto fix pt2 and transitive deps and undefined var checking locals() * manual fixes (ignored or letta-code fixed) * fix circular import	2026-02-24 10:55:08 -08:00
Ari Webb	fa70e09963	Revert "fix(core): add OpenAI prompt cache key and model-gated 24h retention" (#9502 ) Revert "fix(core): add OpenAI prompt cache key and model-gated 24h retention …" This reverts commit f5bb9c629cb7d45544e90758cdfb899bcef41912.	2026-02-24 10:52:07 -08:00
Sarah Wooders	2bf3314cef	fix: import asyncio for parallel tool calls (#9501 )	2026-02-24 10:52:07 -08:00
Shubham Naik	20c71523f8	chore: hotwire fix for core (#9500 )	2026-02-24 10:52:07 -08:00
Shubham Naik	e66981c7e8	feat: update undertaker to use rate limiter (#9498 )	2026-02-24 10:52:07 -08:00
Charles Packer	619e81ed1e	fix(core): add OpenAI prompt cache key and model-gated 24h retention (#9492 ) * fix(core): apply OpenAI prompt cache settings to request payloads Set prompt_cache_key using agent and conversation context on both Responses and Chat Completions request builders, and enable 24h retention only for supported OpenAI models while excluding OpenRouter paths. 👾 Generated with [Letta Code](https://letta.com) Co-Authored-By: Letta <noreply@letta.com> * fix(core): prefix prompt cache key with letta tag Add a `letta:` prefix to generated OpenAI prompt_cache_key values so cache-related entries are easier to identify in provider-side logs and diagnostics. 👾 Generated with [Letta Code](https://letta.com) Co-Authored-By: Letta <noreply@letta.com> * add integration test * skip test --------- Co-authored-by: Letta <noreply@letta.com> Co-authored-by: Ari Webb <ari@letta.com>	2026-02-24 10:52:07 -08:00
jnjpng	5b001a7749	fix: rename ChatGPT server error to ChatGPT API error (#9497 ) fix: rename ChatGPT server error to ChatGPT API error in error messages	2026-02-24 10:52:07 -08:00
jnjpng	fbc0bb60d9	fix: retry ChatGPT 502 and upstream connection errors with exponential backoff (#9495 ) 502s and upstream connection errors (envoy proxy failures) from ChatGPT were not being retried. This classifies them as LLMConnectionError (retryable) in both the streaming and non-streaming paths, and adds retry handling in the non-streaming HTTPStatusError handler so 502s get the same exponential backoff treatment as transport-level connection drops. 🐾 Generated with [Letta Code](https://letta.com) Co-authored-by: Letta <noreply@letta.com>	2026-02-24 10:52:07 -08:00
Sarah Wooders	26cbdb7b7b	fix(core): skip malformed send_message entries in message conversion (#9494 ) Avoid failing message-list endpoints when historical send_message tool calls are missing the expected message argument by logging and skipping malformed entries during conversion. 👾 Generated with [Letta Code](https://letta.com) Co-authored-by: Letta <noreply@letta.com>	2026-02-24 10:52:07 -08:00
Charles Packer	c32d53f8a3	fix(core): remove old static landing page from Docker image (#9369 ) The "Experience the new ADE" page was outdated and no longer useful. Root path now redirects to /docs (FastAPI Swagger UI) instead. 👾 Generated with [Letta Code](https://letta.com) Co-authored-by: Letta <noreply@letta.com>	2026-02-24 10:52:07 -08:00
amysguan	80a0d1a95f	Add LLM client compaction errors to traces (#9474 ) * add llm client errors to traces * update response json for telemetry * prevent silent failures and properly log errored responses in streaming path * remove double logging --------- Co-authored-by: Amy Guan <amy@letta.com> Co-authored-by: Kian Jones <kian@letta.com>	2026-02-24 10:52:07 -08:00
Shubham Naik	2f76f2c629	Agent page update (#9475 ) * feat: agent page updates * feat: agent page updates * feat: agent page updates * feat: agent page updates * feat: agent page updates * feat: agent page updates * chore: fix code * chore: fix code	2026-02-24 10:52:07 -08:00
jnjpng	e3eafb1977	fix: re-raise LLMError before wrapping with handle_llm_error (#9482 ) LLMError exceptions are already properly formatted errors that should propagate directly. Without this check, they get unnecessarily wrapped by handle_llm_error, losing their original error information.	2026-02-24 10:52:07 -08:00
Kian Jones	2f0294165c	debug: log statement_timeout + connection pid on session checkout (#9472 ) * debug: log statement_timeout + connection pid on every session checkout Temporary instrumentation to diagnose why some PlanetScale connections have statement_timeout=5s while others have 0 (disabled). 🤖 Generated with [Letta Code](https://letta.com) Co-Authored-By: Letta <noreply@letta.com> * debug: log statement_timeout on every checkout, not just non-zero 🤖 Generated with [Letta Code](https://letta.com) Co-Authored-By: Letta <noreply@letta.com> * fix: rollback implicit transaction from debug query The SELECT implicitly begins a transaction, causing "A transaction is already begun" errors for code that calls session.begin() explicitly. 🤖 Generated with [Letta Code](https://letta.com) Co-Authored-By: Letta <noreply@letta.com> --------- Co-authored-by: Letta <noreply@letta.com>	2026-02-24 10:52:07 -08:00
Ari Webb	0a8a8fda54	feat: add credit verification before agent message endpoints [LET-XXXX] (#9433 ) * feat: add credit verification before agent message endpoints Add credit verification checks to message endpoints to prevent execution when organizations have insufficient credits. - Add InsufficientCreditsError exception type - Add CreditVerificationService that calls step-orchestrator API - Add credit checks to /agents/{id}/messages endpoints - Add credit checks to /conversations/{id}/messages endpoint 🐾 Generated with [Letta Code](https://letta.com) Co-Authored-By: Letta <noreply@letta.com> * surface error in ade * do per step instead * parallel check * parallel to step * small fixes * stage publish api * fixes * revert unnecessary frontend changes * insufficient credits stop reason --------- Co-authored-by: Letta <noreply@letta.com>	2026-02-24 10:52:07 -08:00
Ari Webb	5faec5632f	fix: add m2.5 (#9480 ) * fix: add m2.5 * fix test	2026-02-24 10:52:07 -08:00
amysguan	9bec8c64f5	New prompts/defaults for `sliding_window` and `all` compaction (#9444 ) * new prompts for sliding window and all compaction + defaults to corresponding prompt * regenerate api spec --------- Co-authored-by: Amy Guan <amy@letta.com>	2026-02-24 10:52:07 -08:00
github-actions[bot]	0b08164cc2	fix: update system prompt metadata label to "System prompt last recompiled" (#9477 ) fix: update system prompt metadata label from "Memory blocks were last modified" to "System prompt last recompiled" When git-based memory is enabled, there are no memory blocks, so the label "Memory blocks were last modified" is inaccurate. Changed to "System prompt last recompiled" which accurately reflects the timestamp meaning. Fixes #9476 🐾 Generated with [Letta Code](https://letta.com) Co-authored-by: letta-code <248085862+letta-code@users.noreply.github.com> Co-authored-by: Letta <noreply@letta.com>	2026-02-24 10:52:07 -08:00
Kian Jones	f55ff3a822	fix(core): descriptive error for empty message_ids instead of IndexError (#9464 ) fix(core): replace IndexError with descriptive error for empty message_ids Agent with no in-context messages (empty/None message_ids) would crash with a cryptic IndexError on message_ids[0]. Now raises a clear LettaError explaining the system message was not initialized. Datadog: https://us5.datadoghq.com/error-tracking/issue/6c061c28-0830-11f1-b060-da7ad0900000 🐾 Generated with [Letta Code](https://letta.com) Co-authored-by: Letta <noreply@letta.com>	2026-02-24 10:52:07 -08:00
Kian Jones	80f34f134d	fix(core): catch bare openai.APIError in handle_llm_error (#9468 ) * fix(core): catch bare openai.APIError in handle_llm_error fallthrough openai.APIError raised during streaming (e.g. OpenRouter credit exhaustion) is not an APIStatusError, so it skipped the catch-all at the end and fell through to LLMError("Unhandled"). Now bare APIErrors that aren't context window overflows are mapped to LLMBadRequestError. Datadog: https://us5.datadoghq.com/error-tracking/issue/7a2c356c-0849-11f1-be66-da7ad0900000 🐾 Generated with [Letta Code](https://letta.com) Co-Authored-By: Letta <noreply@letta.com> * feat(core): add LLMInsufficientCreditsError for BYOK credit exhaustion Adds dedicated error type for insufficient credits/quota across all providers (OpenAI, Anthropic, Google). Returns HTTP 402 with BYOK-aware messaging instead of generic 400. - New LLMInsufficientCreditsError class and PAYMENT_REQUIRED ErrorCode - is_insufficient_credits_message() helper detecting credit/quota strings - All 3 provider clients detect 402 status + credit keywords - FastAPI handler returns 402 with "your API key" vs generic messaging - 5 new parametrized tests covering OpenRouter, OpenAI, and negative case 🐾 Generated with [Letta Code](https://letta.com) Co-Authored-By: Letta <noreply@letta.com> --------- Co-authored-by: Letta <noreply@letta.com>	2026-02-24 10:52:07 -08:00
Ari Webb	cfd2ca3102	fix: zai clear empty messages (#9466 )	2026-02-24 10:52:07 -08:00
Shubham Naik	6579f9b906	feat: add saveTemplateVersionNoProject endpoint (#9465 ) * feat: add saveTemplateVersionNoProject endpoint Added NoProject version of saveTemplateVersion endpoint: - Backend supports both URL patterns: - `/v1/templates/:project_id/:template_name` (with project in path) - `/v1/templates/:template_name` (NoProject, uses X-Project header) - Stainless surfaces only the cleaner NoProject version as `templates.save()` - NoProject route exported first for correct route matching Changes: - Added saveTemplateVersionNoProject contract and handler - Updated stainless.yml to map `save` method to NoProject endpoint - Follows same pattern as other NoProject endpoints (create, delete, rollback) 👾 Generated with [Letta Code](https://letta.com) Co-Authored-By: Letta <noreply@letta.com> * bump * bump --------- Co-authored-by: Letta <noreply@letta.com>	2026-02-24 10:52:07 -08:00
Shubham Naik	565fd3c143	feat: add template rollback endpoint [LET-7423] (#9455 ) * feat: add template rollback endpoint [LET-7423] Adds POST /v1/templates/:template_name/rollback endpoint to restore templates to previous versions. Key features: - Rollback to any numbered version (1, 2, 3, etc.) or "latest" - Auto-saves unsaved changes before rollback to prevent data loss - Validates input (rejects "current"/"dev" as target versions) - Preserves entity IDs and relationships across rollback - Uses project context from X-Project header (no project_id in path) Implementation includes: - API contract in templatesContract.ts - Handler in templatesRouter.ts with comprehensive error handling - 9 E2E tests covering functionality and edge cases - Updated stainless.yml for SDK generation 👾 Generated with [Letta Code](https://letta.com) Co-Authored-By: Letta <noreply@letta.com> * chore test * fix: add X-Project header to rollback endpoint tests The rollback endpoint uses project context from X-Project header instead of URL path. Updated all rollback test calls to include the X-Project header with testProject value. This follows the no-project-in-path pattern for template endpoints. 👾 Generated with [Letta Code](https://letta.com) Co-Authored-By: Letta <noreply@letta.com> * feat: support both URL patterns for rollback endpoint Added dual URL pattern support for rollback endpoint: - `/v1/templates/:project_id/:template_name/rollback` (with project in path) - `/v1/templates/:template_name/rollback` (NoProject, uses X-Project header) Backend supports both patterns, but Stainless only exposes the cleaner NoProject version for SDKs. Key changes: - Fixed "rollback to latest" bug by resolving target version BEFORE auto-saving - NoProject route is exported first to ensure correct route matching order - Updated tests to use project_id in path for better compatibility - All 8 rollback tests passing 👾 Generated with [Letta Code](https://letta.com) Co-Authored-By: Letta <noreply@letta.com> * bump * bump * bump --------- Co-authored-by: Letta <noreply@letta.com>	2026-02-24 10:52:07 -08:00
jnjpng	778f28ccf3	fix: handle transient network errors in ChatGPT OAuth client (#9462 ) - Map httpx.ReadError/WriteError/ConnectError to LLMConnectionError in handle_llm_error so Temporal correctly classifies them as retryable (previously fell through to generic non-retryable LLMError) - Add client-level retry with exponential backoff (up to 3 attempts) on request_async and stream_async for transient transport errors - Stream retry is guarded by has_yielded flag to avoid corrupting partial responses already consumed by the caller	2026-02-24 10:52:07 -08:00
Kian Jones	4126fdadea	fix(core): preserve thought_signature on TextContent in Gemini streaming path (#9461 ) get_content() was only setting signature on ReasoningContent items. When Gemini returns a function call with thought_signature but no ReasoningContent (e.g. include_thoughts=False), the signature was stored on self.thinking_signature but never attached to TextContent. This caused "missing thought_signature in functionCall parts" errors when the message was echoed back to Gemini on the next turn. 🐾 Generated with [Letta Code](https://letta.com) Co-authored-by: Letta <noreply@letta.com>	2026-02-24 10:52:07 -08:00
Kian Jones	b9c4ed3b15	fix: catch contextwindowexceeded error on gemini (#9450 ) * catch contextwindowexceeded error * fix(core): detect Google token limit errors as ContextWindowExceededError Google's error message says "input token count exceeds the maximum number of tokens allowed" which doesn't contain the word "context", so it was falling through to generic LLMBadRequestError instead of ContextWindowExceededError. This means compaction won't auto-trigger. Expands the detection to also match "token count" and "tokens allowed" in addition to the existing "context" keyword. 🐾 Generated with [Letta Code](https://letta.com) Co-Authored-By: Letta <noreply@letta.com> * fix(core): add missing message arg to LLMBadRequestError in OpenAI client The generic 400 path in handle_llm_error was constructing LLMBadRequestError without the required message positional arg, causing TypeError in prod during summarization. 🐾 Generated with [Letta Code](https://letta.com) Co-Authored-By: Letta <noreply@letta.com> * ci: add adapters/ test suite to core unit test matrix 🐾 Generated with [Letta Code](https://letta.com) Co-Authored-By: Letta <noreply@letta.com> * fix(tests): update adapter error handling test expectations to match actual behavior The streaming adapter's error handling double-wraps errors: the AnthropicStreamingInterface calls handle_llm_error first, then the adapter catches the result and calls handle_llm_error again, which falls through to the base class LLMError. Updated test expectations to match this behavior. 🐾 Generated with [Letta Code](https://letta.com) Co-Authored-By: Letta <noreply@letta.com> * fix(core): prevent double-wrapping of LLMError in stream adapter The AnthropicStreamingInterface.process() already transforms raw provider errors into LLMError subtypes via handle_llm_error. The adapter was catching the result and calling handle_llm_error again, which didn't recognize the already-transformed LLMError and wrapped it in a generic LLMError("Unhandled LLM error"). This downgraded specific error types (LLMConnectionError, LLMServerError, etc.) and broke retry logic that matches on specific subtypes. Now the adapter checks if the error is already an LLMError and re-raises it as-is. Tests restored to original correct expectations. 🐾 Generated with [Letta Code](https://letta.com) Co-Authored-By: Letta <noreply@letta.com> --------- Co-authored-by: Letta <noreply@letta.com>	2026-02-24 10:52:07 -08:00
Sarah Wooders	05073ba837	fix(core): preserve git-memory formatting and enforce lock conflicts (#9451 ) * test(core): strengthen git-memory system prompt stability integration coverage Switch git-memory HTTP integration tests to OpenAI model handles and add assertions that system prompt content remains stable after normal turns and direct block value updates until explicit recompilation or reset. 👾 Generated with [Letta Code](https://letta.com) Co-Authored-By: Letta <noreply@letta.com> * fix(core): preserve git-memory formatting and enforce lock conflicts Preserve existing markdown frontmatter formatting on block updates while still ensuring required metadata fields exist, and make post-push git sync propagate memory-repo lock conflicts as 409 responses. Also enable slash-containing core-memory block labels in route params and add regression coverage. 👾 Generated with [Letta Code](https://letta.com) Co-Authored-By: Letta <noreply@letta.com> * fix(memfs): fail closed on memory repo lock contention Make memfs git commits fail closed when the per-agent Redis lock cannot be acquired, return 409 MEMORY_REPO_BUSY from the memfs files write API, and map that 409 back to core MemoryRepoBusyError so API callers receive consistent busy conflicts. 👾 Generated with [Letta Code](https://letta.com) Co-Authored-By: Letta <noreply@letta.com> * chore(core): minimize git-memory fix scope to memfs lock and frontmatter paths 👾 Generated with [Letta Code](https://letta.com) Co-Authored-By: Letta <noreply@letta.com> * chore: drop unrelated changes and keep memfs-focused scope Revert branch-only changes that are not required for the memfs lock contention and frontmatter-preservation fix so the PR contains only issue-relevant files. 👾 Generated with [Letta Code](https://letta.com) Co-Authored-By: Letta <noreply@letta.com> * fix(memfs): lock push sync path and improve nested sync diagnostics Serialize memfs push-to-GCS sync with the same per-agent Redis lock key used by API commits, and add targeted post-push nested-block diagnostics plus a focused nested-label sync regression test for _sync_after_push. 👾 Generated with [Letta Code](https://letta.com) Co-Authored-By: Letta <noreply@letta.com> --------- Co-authored-by: Letta <noreply@letta.com>	2026-02-24 10:52:07 -08:00
Sarah Wooders	d7793a4474	fix(core): stabilize system prompt refresh and expand git-memory coverage (#9438 ) * fix(core): stabilize system prompt refresh and expand git-memory coverage Only rebuild system prompts on explicit refresh paths so normal turns preserve prefix-cache stability, including git/custom prompt layouts. Add integration coverage for memory filesystem tree structure and recompile/reset system-message updates via message-id retrieval. 👾 Generated with [Letta Code](https://letta.com) Co-Authored-By: Letta <noreply@letta.com> * fix(core): recompile system prompt around compaction and stabilize source tests Force system prompt refresh before/after compaction in LettaAgentV3 so repaired system+memory state is used and persisted across subsequent turns. Update source-system prompt tests to explicitly recompile before raw preview assertions instead of assuming automatic rebuild timing. 👾 Generated with [Letta Code](https://letta.com) Co-Authored-By: Letta <noreply@letta.com> --------- Co-authored-by: Letta <noreply@letta.com>	2026-02-24 10:52:07 -08:00
Kian Jones	5b7dd15905	fix(core): use BYOK API keys for Google AI/Vertex LLM requests (#9439 ) GoogleAIClient and GoogleVertexClient were hardcoding Letta's managed credentials for all requests, ignoring user-provided BYOK API keys. This meant Letta was paying Google API costs for BYOK users. Add _get_client_async and update _get_client to check BYOK overrides (via get_byok_overrides / get_byok_overrides_async) before falling back to managed credentials, matching the pattern used by OpenAIClient and AnthropicClient. 🤖 Generated with [Letta Code](https://letta.com) Co-authored-by: Letta <noreply@letta.com>	2026-02-24 10:52:07 -08:00
Ari Webb	d0e25ae471	feat: add glm 5 to core (#9436 ) * feat: add glm 5 to core * test glm 5	2026-02-24 10:52:07 -08:00
Kevin Lin	7ec2783ded	fix: increase SGLang provider default context window from 8k to 32k (#9435 )	2026-02-24 10:52:07 -08:00
Kian Jones	e3dbb44fc9	feat(telem): support reading from clickhouse traces (#9431 ) draft	2026-02-24 10:52:07 -08:00
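
Commit eb4a0daabd in the log above turns on the difference between a field that is omitted and a field that is explicitly set to null. The sketch below illustrates that distinction with plain dictionaries. It is not the actual `set_model_specific_defaults` validator: the function names and the default value `ASSUMED_DEFAULT_MAX_TOKENS` are hypothetical, chosen only for illustration.

```python
from typing import Any, Dict

# Hypothetical default; the real per-model value is not stated in the commit log.
ASSUMED_DEFAULT_MAX_TOKENS = 4096


def apply_defaults_old(values: Dict[str, Any]) -> Dict[str, Any]:
    """Pre-fix check: `.get(...) is None` is true for both a missing key and an explicit None."""
    out = dict(values)
    if out.get("max_tokens") is None:
        out["max_tokens"] = ASSUMED_DEFAULT_MAX_TOKENS
    return out


def apply_defaults_new(values: Dict[str, Any]) -> Dict[str, Any]:
    """Post-fix check: only fill the default when the key is absent entirely."""
    out = dict(values)
    if "max_tokens" not in out:
        out["max_tokens"] = ASSUMED_DEFAULT_MAX_TOKENS
    return out
```

With the old check, `{"max_tokens": None}` is overwritten by the default. With the new check the explicit `None` survives, while an empty payload still receives the default.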

7215 Commits
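
Commit 778f28ccf3 describes a stream retry with exponential backoff that is guarded by a `has_yielded` flag, so that a stream is never restarted after the caller has already consumed part of it. A minimal sketch of that pattern follows, assuming a hypothetical `TransientTransportError` as a stand-in for the httpx transport errors named in the commit. The function name `stream_with_retry` and the delay values are illustrative and are not taken from the Letta codebase.

```python
import asyncio
from typing import AsyncIterator, Callable


class TransientTransportError(Exception):
    """Hypothetical stand-in for httpx.ReadError / WriteError / ConnectError."""


async def stream_with_retry(
    open_stream: Callable[[], AsyncIterator[str]],
    max_attempts: int = 3,
    base_delay: float = 0.01,
) -> AsyncIterator[str]:
    """Retry a stream on transient errors, but only before any chunk was yielded."""
    has_yielded = False
    for attempt in range(max_attempts):
        try:
            async for chunk in open_stream():
                has_yielded = True
                yield chunk
            return  # stream finished normally
        except TransientTransportError:
            # Restarting after a partial response would duplicate or corrupt
            # output the caller has already consumed, so re-raise instead.
            if has_yielded or attempt == max_attempts - 1:
                raise
            # Exponential backoff: base_delay, 2 * base_delay, 4 * base_delay, ...
            await asyncio.sleep(base_delay * (2 ** attempt))
```

A failure on the first attempt before any chunk arrives is retried transparently. A failure after the first chunk, or on the last attempt, propagates to the caller.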