* fix: preserve agent max_tokens when caller doesn't explicitly set it
When updating an agent with convenience fields (model, model_settings)
but without an explicit max_tokens, the server was constructing a fresh
LLMConfig via get_llm_config_from_handle_async. The Pydantic validator
on LLMConfig hardcodes max_tokens=16384 for gpt-5* models, silently
overriding the agent's existing value (e.g. 128000).
This was triggered by reasoning tab-switch in the CLI, which sends
model + model_settings (with reasoning_effort) but no max_tokens.
Now, when request.max_tokens is None, we carry forward the agent's
current max_tokens instead of accepting the provider default.
* fix: use correct 128k max_output_tokens defaults for gpt-5.2/5.3
- Update the OpenAI provider fallback to return 128000 for gpt-5.2*/5.3*
  models (except -chat variants, which stay at 16k)
- Update LLMConfig Pydantic validator to match
- Update gpt-5.2 default_config factory to use 128000
- Move the server-side max_tokens preservation guard into the
  model_settings branch, where llm_config is already available
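The corrected fallback described in the first two bullets can be sketched as follows (function name hypothetical; the 16k branch reflects the pre-existing gpt-5 default described above):

```python
def default_max_output_tokens(model: str) -> int:
    """Fallback defaults per the fix: gpt-5.2*/gpt-5.3* models get 128k
    max output tokens, except -chat variants, which keep the 16k default."""
    if model.startswith(("gpt-5.2", "gpt-5.3")) and "-chat" not in model:
        return 128000
    return 16384
```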