6730 Commits

Author SHA1 Message Date
Caren Thomas
5646be7df6 chore: bump version 0.16.1 2025-12-17 17:33:43 -08:00
Ari Webb
79c2319403 chore: add project constraint on tools db (#7360) 2025-12-17 17:32:27 -08:00
Kevin Lin
33afb930fc fix: Google AI client logging as Vertex (#7337)
fix
2025-12-17 17:32:27 -08:00
jnjpng
5312129587 chore: add redis to oss docker (#7347)
* base

* fix
2025-12-17 17:32:25 -08:00
Sarah Wooders
ae4490c5b3 fix: filter out stop reason from response streaming (#7332) 2025-12-17 17:31:03 -08:00
Ari Webb
e47eb23522 fix: max output tokens for gemini 3 models (#7322) 2025-12-17 17:31:03 -08:00
Shubham Naik
25a7eaafe6 Shub/let 6588 feedback endpoint [LET-6588] (#7329)
* feat: add feedback endpoint

* feat: add feedback endpoint

---------

Co-authored-by: Shubham Naik <shub@memgpt.ai>
2025-12-17 17:31:03 -08:00
cthomas
2d82faad70 fix: skip agent relationship joins for source upload (#7335) 2025-12-17 17:31:03 -08:00
cthomas
7dfdbc5a80 feat: enable agent override for proxy (#7231) 2025-12-17 17:31:03 -08:00
Kian Jones
632bee868c chore: upgrade ddtrace version (#7326)
* upgrade ddtrace version

* uv lock
2025-12-17 17:31:03 -08:00
Shubham Naik
2f023e9f4a Shub/let 6581 add endpoint for showing current balances [LET-6581] (#7208)
* feat: add endpoint for showing current balance

* feat: add endpoint for showing current balance

* chore: update endpoints

---------

Co-authored-by: Shubham Naik <shub@memgpt.ai>
2025-12-17 17:31:02 -08:00
Christina Tong
f929d53cfe add msg id to search endpoint response [LET-6582] (#7236)
* add msg id to search endpoint response

* rename
2025-12-17 17:31:02 -08:00
Kian Jones
9c1c2698ab feat: enable Datadog LLM Observability for memgpt-server (#7258)
* fix: prevent empty reasoning messages in streaming interfaces

Prevents empty "Thinking..." indicators from appearing in clients by
filtering out reasoning messages with no content at the source.

Changes:
- Gemini: Don't emit ReasoningMessage when only thought_signature exists
- Gemini: Only emit reasoning content if text is non-empty
- Anthropic: Don't emit ReasoningMessage for BetaSignatureDelta
- Anthropic: Only emit reasoning content if thinking text is non-empty

This fixes the issue where providers send signature metadata before
actual thinking content, causing empty reasoning blocks to appear
in the UI after responses complete.

Affects: Gemini reasoning, Anthropic extended thinking

👾 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

* feat: enable Datadog LLM Observability for memgpt-server

Enables DD_LLMOBS to track LLM calls, prompts, completions, and costs
in production for memgpt-server.

Changes:
- Add DD_LLMOBS_ENABLED=1 and DD_LLMOBS_ML_APP=memgpt-server in:
  - .github/workflows/deploy-core.yml (GitHub Actions deployment)
  - justfile (Helm deployment secrets)
  - apps/core/letta/server/rest_api/app.py (runtime config)

This provides visibility into:
- LLM API calls and latency
- Prompt/completion content and tokens
- Model costs and usage
- Error rates per model/provider

👾 Generated with [Letta Code](https://letta.com)

Co-Authored-By: Letta <noreply@letta.com>

* dd llmobs

* Revert "fix: prevent empty reasoning messages in streaming interfaces"

This reverts commit a900228b3611de49eb5f740f68dc76a657fc9b14.

---------

Co-authored-by: Letta <noreply@letta.com>
2025-12-17 17:31:02 -08:00
jnjpng
350f3a751c fix: update more plaintext non async callsites (#7223)
* bae

* update

* fix

* clean up

* last
2025-12-17 17:31:02 -08:00
jnjpng
591420876a fix: correct decorator order for trace_method and raise_on_invalid_id (#7226)
Swap the order of @trace_method and @raise_on_invalid_id decorators
across all service managers so that @trace_method is always the first
wrapper applied to the function (positioned directly above the method).

This ensures the ID validation happens before tracing begins, which is
the intended execution order.

Files modified:
- agent_manager.py (23 occurrences)
- archive_manager.py (11 occurrences)
- block_manager.py (7 occurrences)
- file_manager.py (6 occurrences)
- group_manager.py (9 occurrences)
- identity_manager.py (10 occurrences)
- job_manager.py (7 occurrences)
- message_manager.py (2 occurrences)
- provider_manager.py (3 occurrences)
- sandbox_config_manager.py (7 occurrences)
- source_manager.py (5 occurrences)
- step_manager.py (13 occurrences)
2025-12-17 17:31:02 -08:00
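
The ordering described in 591420876a can be illustrated with a minimal sketch. The two decorators below are hypothetical stand-ins, not the project's implementations; only their names come from the commit message, and the "agent-" prefix check is an assumption made for the example. Python applies decorators bottom-up, so the one directly above the function wraps it first and therefore runs last at call time.

```python
import functools

calls = []  # records the order in which the wrappers execute


def trace_method(func):
    """Hypothetical stand-in for the tracing decorator."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        calls.append("trace")
        return func(*args, **kwargs)
    return wrapper


def raise_on_invalid_id(func):
    """Hypothetical stand-in: rejects IDs without an 'agent-' prefix."""
    @functools.wraps(func)
    def wrapper(agent_id, *args, **kwargs):
        calls.append("validate")
        if not agent_id.startswith("agent-"):
            raise ValueError(f"invalid id: {agent_id!r}")
        return func(agent_id, *args, **kwargs)
    return wrapper


@raise_on_invalid_id  # outermost wrapper: runs first when the method is called
@trace_method         # directly above the method: applied first, runs second
def get_agent(agent_id):
    calls.append("body")
    return agent_id
```

With this order an invalid ID raises before the tracing wrapper is entered, which matches the execution order the commit message describes.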
cthomas
82de08be42 fix: agent not defined more callsites (#7219) 2025-12-17 17:31:02 -08:00
jnjpng
25d75d6528 fix: update async get plaintext callsites (#7069)
* base

* resolve

* fix

* fix
2025-12-17 17:31:02 -08:00
cthomas
e6a4b3e874 fix: nonetype not iterable (#7216) 2025-12-17 17:31:02 -08:00
cthomas
6a2e61c3fa fix: agent not defined error (#7212) 2025-12-17 17:31:02 -08:00
cthomas
9bc872a350 fix: add gen tool error handling (#7209) 2025-12-17 17:31:02 -08:00
cthomas
6de4f0f437 feat: filter out policy system messages for proxy (#7205) 2025-12-17 17:31:02 -08:00
Sarah Wooders
8729a037b9 fix: handle new openai overflow error format (#7110) 2025-12-17 17:31:02 -08:00
Sarah Wooders
f1bd246e9b feat: use token streaming for anthropic summarization (#7105) 2025-12-17 17:31:02 -08:00
Kevin Lin
857139f907 feat: Set reasonable defaults for max output tokens [LET-6483] (#7084) 2025-12-17 17:31:02 -08:00
jnjpng
00ba2d09f3 refactor: migrate mcp_servers and mcp_oauth to encrypted-only columns (#6751)
* refactor: migrate mcp_servers and mcp_oauth to encrypted-only columns

Complete migration to encrypted-only storage for sensitive fields:

- Remove dual-write to plaintext columns (token, custom_headers,
  authorization_code, access_token, refresh_token, client_secret)
- Read only from _enc columns, not from plaintext fallback
- Remove helper methods (get_token_secret, set_token_secret, etc.)
- Remove Secret.from_db() and Secret.to_dict() methods
- Update tests to verify encrypted-only behavior

After this change, plaintext columns can be set to NULL manually
since they are no longer read from or written to.

* fix test

* rename

* update

* union

* fix test
2025-12-17 17:31:02 -08:00
Kevin Lin
03a41f8e8d chore: Increase LLM streaming timeout [LET-6562] (#7080)
increase
2025-12-17 17:31:02 -08:00
Ari Webb
4878b49fa1 chore: bounds check for assistant message index (#7070) 2025-12-17 17:31:02 -08:00
Sooty
6f48d4bd48 Correct provider name for openai-proxy in LLMConfig (#3097) 2025-12-16 19:37:54 -08:00
cthomas
be53f15ce0 chore: bump v0.16.0 (#3095) 2025-12-15 12:12:23 -08:00
Caren Thomas
c99ff56abc chore: bump version v0.16.0 2025-12-15 12:04:32 -08:00
Sarah Wooders
bd9f3aca9b fix: fix prompt_acknowledgement usage and update summarization prompts (#7012) 2025-12-15 12:03:09 -08:00
Sarah Wooders
812bfd16dd Revert "feat: project_id uniqueness for tools" (#7007)
Revert "feat: project_id uniqueness for tools (#6604)"

This reverts commit 2c4b6397041e2c965493525fc52e056f10d1bdb6.
2025-12-15 12:03:09 -08:00
Sarah Wooders
0c0ba5d03d fix: remove letta-free embeddings from testing (#6870) 2025-12-15 12:03:09 -08:00
Charles Packer
33d39f4643 fix(core): patch usage data tracking for anthropic when context caching is on (#6997) 2025-12-15 12:03:09 -08:00
Sarah Wooders
a731e01e88 fix: use model instead of model_settings (#6834) 2025-12-15 12:03:09 -08:00
Sarah Wooders
a721a00899 feat: add agent_id to search results (#6867) 2025-12-15 12:03:09 -08:00
Kevin Lin
4b9485a484 feat: Add max tokens exceeded to stop reasons [LET-6480] (#6576) 2025-12-15 12:03:09 -08:00
cthomas
efac48e9ea feat: add zai proxy LET-6543 (#6836)
feat: add zai proxy
2025-12-15 12:03:09 -08:00
Kian Jones
bce1749408 fix: run PBKDF2 in thread pool to prevent event loop freeze (#6763)
* fix: run PBKDF2 in thread pool to prevent event loop freeze

Problem: Event loop freezes for 100-500ms during secret decryption, blocking
all HTTP requests and async operations. The diagnostic monitor detected the
main thread stuck in PBKDF2 HMAC SHA256 computation at:
  apps/core/letta/helpers/crypto_utils.py:51 (_derive_key)
  apps/core/letta/schemas/secret.py:161 (get_plaintext)

Root cause: PBKDF2 with 100k iterations is intentionally CPU-intensive for
security, but running it synchronously on the main thread blocks the event loop.

Stack trace showed:
  Thread 1 (Main): PBKDF2HMAC -> SHA256_Final -> sha256_block_data_order_avx2
  Event loop watchdog: Detected freeze at 01:11:44 (request started 01:12:03)

Solution:
1. Run PBKDF2 in ThreadPoolExecutor to avoid blocking event loop
2. Add async versions of encrypt/decrypt methods
3. Add LRU cache for derived keys (deterministic results)
4. Add async get_plaintext_async() method to Secret class

Changes:
- apps/core/letta/helpers/crypto_utils.py:
  - Added ThreadPoolExecutor for crypto operations
  - Added @lru_cache(maxsize=256) to _derive_key_cached()
  - Added _derive_key_async() using loop.run_in_executor()
  - Added encrypt_async() and decrypt_async() methods
  - Added warnings to sync methods about blocking behavior

- apps/core/letta/schemas/secret.py:
  - Added get_plaintext_async() method
  - Added warnings to get_plaintext() about blocking behavior

Benefits:
- Event loop no longer freezes during secret decryption
- HTTP requests continue processing while crypto runs in background
- Derived keys are cached, reducing CPU usage for repeated operations
- Backward compatible - sync methods still work for non-async code

Performance impact:
- Before: 100-500ms event loop block per decryption
- After: 100-500ms in thread pool (non-blocking) + LRU cache hits ~0.1ms

Next steps (follow-up PRs):
- Migrate all async callsites to use get_plaintext_async()
- Add metrics to track sync vs async usage
- Consider reducing PBKDF2 iterations if security allows

* update

* test

---------

Co-authored-by: Letta Bot <jinjpeng@gmail.com>
2025-12-15 12:03:09 -08:00
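
The approach described in bce1749408 (run the key derivation in a thread pool and cache derived keys) can be sketched roughly as follows. This is an illustration under stated assumptions, not the code from crypto_utils.py: it uses the standard library's hashlib.pbkdf2_hmac as a stand-in for the PBKDF2HMAC primitive named in the stack trace, and the function names, pool size, and 32-byte key length are hypothetical. The 100k iteration count and the cache size of 256 are taken from the commit message.

```python
import asyncio
import hashlib
from concurrent.futures import ThreadPoolExecutor
from functools import lru_cache

# Dedicated pool for CPU-heavy crypto work (the size is an assumption).
_executor = ThreadPoolExecutor(max_workers=4)


@lru_cache(maxsize=256)
def derive_key_cached(password: bytes, salt: bytes, iterations: int = 100_000) -> bytes:
    """Derive a 32-byte key; the result is deterministic, so it can be cached."""
    return hashlib.pbkdf2_hmac("sha256", password, salt, iterations, dklen=32)


async def derive_key_async(password: bytes, salt: bytes) -> bytes:
    """Run the derivation off the event loop so other coroutines keep running."""
    loop = asyncio.get_running_loop()
    return await loop.run_in_executor(_executor, derive_key_cached, password, salt)
```

A caller inside a coroutine would write `key = await derive_key_async(password, salt)`. Repeated calls with the same arguments are served from the cache. Note that caching keeps derived keys in process memory, a trade-off a real deployment would need to weigh.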
Ari Webb
c1aa01db6f feat: project_id uniqueness for tools (#6604)
* feat: project_id uniqueness for tools

* prevent double upsert of global tools

* use default project if no header for sdk

* reorder unique constraint for performance

* use separate session for check conflict

* feature flag adding project id header in cloud api

* add my migration after one on main

* remove comment

* stage and publish api

* web set project id just for tools

* includes instead of startswith
2025-12-15 12:03:09 -08:00
cthomas
22b9ed254a feat: skip persisting redundant messages for proxy (#6819) 2025-12-15 12:03:09 -08:00
Sarah Wooders
0634aa13a1 fix: avoid holding sessions open (#6769) 2025-12-15 12:03:09 -08:00
Sarah Wooders
c9ad2fd7c4 chore: move things to debug logging (#6610) 2025-12-15 12:03:09 -08:00
Ari Webb
fecf503ad9 feat: xhigh reasoning for gpt-5.2 (#6735) 2025-12-15 12:03:09 -08:00
cthomas
bffb9064b8 fix: step logging error (#6755) 2025-12-15 12:03:08 -08:00
cthomas
fd8e471b2e chore: improve logging for proxy (#6754) 2025-12-15 12:03:08 -08:00
cthomas
2dac75a223 fix: remove project id before proxying (#6750) 2025-12-15 12:03:08 -08:00
jnjpng
4be813b956 fix: migrate sandbox and agent environment variables to encrypted only (#6623)
* base

* remove unnecessary db migration

* update

* fix

* update

* update

* comments

* fix

* revert

* anotha

---------

Co-authored-by: Letta Bot <noreply@letta.com>
2025-12-15 12:03:08 -08:00
cthomas
799ddc9fe8 chore: api sync (#6747) 2025-12-15 12:03:07 -08:00
cthomas
b3561631da feat: create agents with default project for proxy [LET-6488] (#6716)
* feat: create agents with default project for proxy

* make change less invasive
2025-12-15 12:02:53 -08:00