fix: preserve max_tokens on model_settings updates without max_output_tokens (#9591)

When model_settings is sent without max_output_tokens (e.g. only
changing reasoning_effort), the Pydantic default of 4096 was being
applied via _to_legacy_config_params(), silently overwriting the
agent's existing max_tokens.

Use model_fields_set to detect when max_output_tokens was not
explicitly provided and skip overwriting max_tokens in that case.
This applies only to the update path; on create, letting the default
apply is reasonable since there is no pre-existing value.
Author: jnjpng
Date: 2026-02-20 12:27:08 -08:00
Committed by: Caren Thomas
Parent: 857c289ed2
Commit: 257b99923b


@@ -717,6 +717,10 @@ class SyncServer(object):
         agent = await self.agent_manager.get_agent_by_id_async(agent_id=agent_id, actor=actor)
         request.llm_config = agent.llm_config.model_copy()
         update_llm_config_params = request.model_settings._to_legacy_config_params()
+        # Don't clobber max_tokens with the Pydantic default when the caller
+        # didn't explicitly provide max_output_tokens in the request.
+        if "max_output_tokens" not in request.model_settings.model_fields_set:
+            update_llm_config_params.pop("max_tokens", None)
         request.llm_config = request.llm_config.model_copy(update=update_llm_config_params)
         # Copy parallel_tool_calls from request to llm_config if provided