Files
letta-server/letta
jnjpng db9e0f42af fix(core): prevent ModelSettings default max_output_tokens from overriding agent config (#9739)
* fix(core): prevent ModelSettings default max_output_tokens from overriding agent config

When a conversation's model_settings were saved, the Pydantic default
of max_output_tokens=4096 was always persisted to the DB even when the
client never specified it. On subsequent messages, this default would
overwrite the agent's max_tokens (typically None) with 4096, silently
capping output.

Two changes:
1. Use model_dump(exclude_unset=True) when persisting model_settings
   to the DB so Pydantic defaults are not saved.
2. Add model_fields_set guards at all callsites that apply
   _to_legacy_config_params() to skip max_tokens when it was not
   explicitly provided by the caller.

Also conditionally set max_output_tokens in the OpenAI Responses API
request builder so None is not sent as null (which some models treat
as a hard 4096 cap).

* nit

* Fix model_settings serialization to preserve provider_type discriminator

Replace blanket exclude_unset=True with targeted removal of only
max_output_tokens when not explicitly set. The previous approach
stripped the provider_type field (a Literal with a default), which
broke discriminated union deserialization when reading back from DB.
2026-03-03 18:34:02 -08:00
..
2025-10-07 17:50:45 -07:00
2025-09-17 15:47:40 -07:00
2025-09-17 15:47:40 -07:00
2025-09-17 15:47:40 -07:00
2025-09-17 15:47:40 -07:00
2026-02-24 10:58:16 -08:00
2025-10-07 17:50:45 -07:00
2025-09-17 15:47:40 -07:00
2026-02-24 10:55:08 -08:00
2025-09-17 15:47:40 -07:00
2025-09-17 15:47:40 -07:00