chore: officially migrate to submodule (#4502)

* remove apps/core and apps/fern

* fix precommit

* add submodule updates in workflows

* submodule

* remove core tests

* update core revision

* Add submodules: true to all GitHub workflows

- Ensure all workflows can access git submodules
- Add submodules support to deployment, test, and CI workflows
- Fix YAML syntax issues in workflow files

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>

* remove core-lint

* upgrade core with latest main of oss

---------

Co-authored-by: Claude <noreply@anthropic.com>
Kian Jones
2025-09-09 12:45:53 -07:00
committed by GitHub
parent 48b5722095
commit 22f70ca07c
953 changed files with 0 additions and 181472 deletions


@@ -1,44 +0,0 @@
---
name: Bug report
about: Create a report to help us improve
title: ''
labels: ''
assignees: ''
---
**Describe the bug**
A clear and concise description of what the bug is.
**Please describe your setup**
- [ ] How are you running Letta?
  - Docker
  - pip (legacy)
  - From source
  - Desktop
- [ ] Describe your setup
  - What's your OS (Windows/MacOS/Linux)?
  - What is your `docker run ...` command (if applicable)
**Screenshots**
If applicable, add screenshots to help explain your problem.
**Additional context**
Add any other context about the problem here.
- What model you are using
**Agent File (optional)**
Please attach your `.af` file, as this helps with reproducing issues.
---
If you're not using OpenAI, please provide additional information on your local LLM setup:
**Local LLM details**
If you are trying to run Letta with local LLMs, please provide the following information:
- [ ] The exact model you're trying to use (e.g. `dolphin-2.1-mistral-7b.Q6_K.gguf`)
- [ ] The local LLM backend you are using (web UI? LM Studio?)
- [ ] Your hardware for the local LLM backend (local computer? operating system? remote RunPod?)


@@ -1,20 +0,0 @@
---
name: Feature request
about: Suggest an idea for this project
title: ''
labels: ''
assignees: ''
---
**Is your feature request related to a problem? Please describe.**
A clear and concise description of what the problem is. Ex. I'm always frustrated when [...]
**Describe the solution you'd like**
A clear and concise description of what you want to happen.
**Describe alternatives you've considered**
A clear and concise description of any alternative solutions or features you've considered.
**Additional context**
Add any other context or screenshots about the feature request here.


@@ -1,17 +0,0 @@
**Please describe the purpose of this pull request.**
Is it to add a new feature? Is it to fix a bug?
**How to test**
How can we test your PR during review? What commands should we run? What outcomes should we expect?
**Have you tested this PR?**
Have you tested the latest commit on the PR? If so please provide outputs from your tests.
**Related issues or PRs**
Please link any related GitHub [issues](https://github.com/letta-ai/letta/issues) or [PRs](https://github.com/letta-ai/letta/pulls).
**Is your PR over 500 lines of code?**
If so, please break up your PR into multiple smaller PRs so that we can review them quickly, or provide justification for its length.
**Additional context**
Add any other context or screenshots about the PR here.


@@ -1,286 +0,0 @@
import logging
import os
import socket
import threading
import time
from datetime import datetime, timezone
from typing import Generator
import pytest
import requests
from anthropic.types.beta.messages import BetaMessageBatch, BetaMessageBatchRequestCounts
from dotenv import load_dotenv
from letta_client import AsyncLetta, Letta
from letta.schemas.agent import AgentState
from letta.schemas.llm_config import LLMConfig
from letta.services.organization_manager import OrganizationManager
from letta.services.user_manager import UserManager
from letta.settings import tool_settings
def pytest_configure(config):
logging.basicConfig(level=logging.DEBUG)
@pytest.fixture
def disable_e2b_api_key() -> Generator[None, None, None]:
"""
Temporarily disables the E2B API key by setting `tool_settings.e2b_api_key` to None
for the duration of the test. Restores the original value afterward.
"""
from letta.settings import tool_settings
original_api_key = tool_settings.e2b_api_key
tool_settings.e2b_api_key = None
yield
tool_settings.e2b_api_key = original_api_key
@pytest.fixture
def check_e2b_key_is_set():
from letta.settings import tool_settings
original_api_key = tool_settings.e2b_api_key
assert original_api_key is not None, "Missing e2b key! Cannot execute these tests."
yield
@pytest.fixture
def default_organization():
"""Fixture to create and return the default organization."""
manager = OrganizationManager()
org = manager.create_default_organization()
yield org
@pytest.fixture
def default_user(default_organization):
"""Fixture to create and return the default user within the default organization."""
manager = UserManager()
user = manager.create_default_user(org_id=default_organization.id)
yield user
@pytest.fixture
def check_composio_key_set():
original_api_key = tool_settings.composio_api_key
assert original_api_key is not None, "Missing composio key! Cannot execute this test."
yield
# --- Tool Fixtures ---
@pytest.fixture
def weather_tool_func():
def get_weather(location: str) -> str:
"""
Fetches the current weather for a given location.
Parameters:
location (str): The location to get the weather for.
Returns:
str: A formatted string describing the weather in the given location.
Raises:
RuntimeError: If the request to fetch weather data fails.
"""
import requests
url = f"https://wttr.in/{location}?format=%C+%t"
response = requests.get(url)
if response.status_code == 200:
weather_data = response.text
return f"The weather in {location} is {weather_data}."
else:
raise RuntimeError(f"Failed to get weather data, status code: {response.status_code}")
yield get_weather
@pytest.fixture
def print_tool_func():
"""Fixture to create a tool with default settings and clean up after the test."""
def print_tool(message: str):
"""
Args:
message (str): The message to print.
Returns:
str: The message that was printed.
"""
print(message)
return message
yield print_tool
@pytest.fixture
def roll_dice_tool_func():
def roll_dice():
"""
Rolls a 6 sided die.
Returns:
str: The roll result.
"""
import time
time.sleep(1)
return "Rolled a 10!"
yield roll_dice
@pytest.fixture
def dummy_beta_message_batch() -> BetaMessageBatch:
return BetaMessageBatch(
id="msgbatch_013Zva2CMHLNnXjNJJKqJ2EF",
archived_at=datetime(2024, 8, 20, 18, 37, 24, 100435, tzinfo=timezone.utc),
cancel_initiated_at=datetime(2024, 8, 20, 18, 37, 24, 100435, tzinfo=timezone.utc),
created_at=datetime(2024, 8, 20, 18, 37, 24, 100435, tzinfo=timezone.utc),
ended_at=datetime(2024, 8, 20, 18, 37, 24, 100435, tzinfo=timezone.utc),
expires_at=datetime(2024, 8, 20, 18, 37, 24, 100435, tzinfo=timezone.utc),
processing_status="in_progress",
request_counts=BetaMessageBatchRequestCounts(
canceled=10,
errored=30,
expired=10,
processing=100,
succeeded=50,
),
results_url="https://api.anthropic.com/v1/messages/batches/msgbatch_013Zva2CMHLNnXjNJJKqJ2EF/results",
type="message_batch",
)
# --- Model Sweep ---
# Global flag to track server state
_server_started = False
_server_url = None
def _start_server_once() -> str:
"""Start server exactly once, return URL"""
global _server_started, _server_url
if _server_started and _server_url:
return _server_url
url = os.getenv("LETTA_SERVER_URL", "http://localhost:8283")
# Check if already running
with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
if s.connect_ex(("localhost", 8283)) == 0:
_server_started = True
_server_url = url
return url
# Start the server in a background thread unless an external URL was provided
if not os.getenv("LETTA_SERVER_URL"):
def _run_server():
load_dotenv()
from letta.server.rest_api.app import start_server
start_server(debug=True)
thread = threading.Thread(target=_run_server, daemon=True)
thread.start()
# Poll until up
timeout_seconds = 30
deadline = time.time() + timeout_seconds
while time.time() < deadline:
try:
resp = requests.get(url + "/v1/health")
if resp.status_code < 500:
break
except requests.exceptions.RequestException:
pass
time.sleep(0.1)
else:
raise RuntimeError(f"Could not reach {url} within {timeout_seconds}s")
_server_started = True
_server_url = url
return url
# ------------------------------
# Fixtures
# ------------------------------
@pytest.fixture(scope="module")
def server_url() -> str:
"""Return URL of already-started server"""
return _start_server_once()
@pytest.fixture(scope="module")
def client(server_url: str) -> Letta:
"""
Creates and returns a synchronous Letta REST client for testing.
"""
client_instance = Letta(base_url=server_url)
yield client_instance
@pytest.fixture(scope="function")
def async_client(server_url: str) -> AsyncLetta:
"""
Creates and returns an asynchronous Letta REST client for testing.
"""
async_client_instance = AsyncLetta(base_url=server_url)
yield async_client_instance
@pytest.fixture(scope="module")
def agent_state(client: Letta) -> AgentState:
"""
Creates and returns an agent state for testing with a pre-configured agent.
The agent is named 'supervisor' and is created without the base tools, using only the send_message tool.
"""
client.tools.upsert_base_tools()
send_message_tool = client.tools.list(name="send_message")[0]
agent_state_instance = client.agents.create(
name="supervisor",
include_base_tools=False,
tool_ids=[send_message_tool.id],
model="openai/gpt-4o",
embedding="letta/letta-free",
tags=["supervisor"],
)
yield agent_state_instance
client.agents.delete(agent_state_instance.id)
@pytest.fixture(scope="module")
def all_available_llm_configs(client: Letta) -> list[LLMConfig]:
"""
Returns a list of all available LLM configs.
"""
llm_configs = client.models.list()
return llm_configs
# Create a temporary client against the started server
def get_available_llm_configs() -> list[LLMConfig]:
"""Get configs, starting server if needed"""
server_url = _start_server_once()
temp_client = Letta(base_url=server_url)
return temp_client.models.list()
# Dynamically insert the llm_config parameter at collection time
def pytest_generate_tests(metafunc):
"""Dynamically parametrize tests that need llm_config."""
if "llm_config" in metafunc.fixturenames:
configs = get_available_llm_configs()
if configs:
metafunc.parametrize("llm_config", configs, ids=[c.model for c in configs])
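
The socket probe in `_start_server_once` always connects to `localhost:8283`, even when `LETTA_SERVER_URL` points somewhere else. A minimal sketch of deriving the probe target from the URL instead, assuming the URL always carries a scheme. The helper name `host_port_from_url` and the `DEFAULT_PORTS` table are hypothetical and not part of the deleted file; only the standard-library `urllib.parse.urlsplit` is used.

```python
from urllib.parse import urlsplit

# Hypothetical helper, not part of the original conftest: derive the
# (host, port) pair to probe from a server URL that includes a scheme.
DEFAULT_PORTS = {"http": 80, "https": 443}


def host_port_from_url(url: str) -> tuple[str, int]:
    parts = urlsplit(url)
    host = parts.hostname or "localhost"
    # Fall back to the scheme's default port when the URL omits one.
    port = parts.port or DEFAULT_PORTS.get(parts.scheme, 80)
    return host, port
```

The returned pair could then be passed to `socket.connect_ex` in place of the hardcoded tuple, so the "already running" check and the health poll would target the same server.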


@@ -1,21 +0,0 @@
{
"Basic": [
"test_greeting_with_assistant_message",
"test_greeting_without_assistant_message",
"test_async_greeting_with_assistant_message",
"test_agent_loop_error",
"test_step_stream_agent_loop_error",
"test_step_streaming_greeting_with_assistant_message",
"test_step_streaming_greeting_without_assistant_message",
"test_step_streaming_tool_call",
"test_tool_call",
"test_auto_summarize"
],
"Token Streaming": [
"test_token_streaming_greeting_with_assistant_message",
"test_token_streaming_greeting_without_assistant_message",
"test_token_streaming_agent_loop_error",
"test_token_streaming_tool_call"
],
"Multimodal": ["test_base64_image_input", "test_url_image_input"]
}
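
A mapping of this shape assigns each test name to one feature column. The sketch below shows one way to bucket observed test names with it, assuming exact-name matching. The mapping is a trimmed copy inlined for illustration, and the function name `bucket_tests` is hypothetical.

```python
import json

# Trimmed copy of the mapping above, inlined for illustration only.
FEATURE_MAPPING = json.loads("""
{
  "Basic": ["test_tool_call", "test_auto_summarize"],
  "Multimodal": ["test_base64_image_input", "test_url_image_input"]
}
""")


def bucket_tests(test_names, feature_mapping):
    """Group test names under the first feature that lists them exactly."""
    buckets = {feature: [] for feature in feature_mapping}
    for name in sorted(test_names):
        for feature, members in feature_mapping.items():
            if name in members:
                buckets[feature].append(name)
                break
    return buckets


# Names that appear in no feature list (here "test_unmapped") are dropped.
buckets = bucket_tests(
    {"test_tool_call", "test_url_image_input", "test_unmapped"},
    FEATURE_MAPPING,
)
```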


@@ -1,495 +0,0 @@
#!/usr/bin/env python3
import json
import os
import sys
from collections import defaultdict
from datetime import datetime
def load_feature_mappings(config_file=None):
"""Load feature mappings from config file."""
if config_file is None:
# Default to feature_mappings.json in the same directory as this script
script_dir = os.path.dirname(os.path.abspath(__file__))
config_file = os.path.join(script_dir, "feature_mappings.json")
try:
with open(config_file, "r") as f:
return json.load(f)
except FileNotFoundError:
print(f"Error: Could not find feature mappings config file '{config_file}'")
sys.exit(1)
except json.JSONDecodeError:
print(f"Error: Invalid JSON in feature mappings config file '{config_file}'")
sys.exit(1)
def get_support_status(passed_tests, feature_tests):
    """Determine support status for a feature category."""
    if not feature_tests:
        return "❓"  # Unknown - no tests for this feature
    # Filter out error tests when checking for support
    non_error_tests = [test for test in feature_tests if not test.endswith("_error")]
    error_tests = [test for test in feature_tests if test.endswith("_error")]
    # Check which non-error tests passed
    passed_non_error_tests = [test for test in non_error_tests if test in passed_tests]
    # If there are no non-error tests, only error tests, treat as unknown
    if not non_error_tests:
        return "❓"  # Only error tests available
    # Support is based only on non-error tests
    if len(passed_non_error_tests) == len(non_error_tests):
        return "✅"  # Full support
    elif len(passed_non_error_tests) == 0:
        return "❌"  # No support
    else:
        return "⚠️"  # Partial support


def categorize_tests(all_test_names, feature_mapping):
    """Categorize test names into feature buckets."""
    categorized = {feature: [] for feature in feature_mapping.keys()}
    for test_name in all_test_names:
        for feature, test_patterns in feature_mapping.items():
            if test_name in test_patterns:
                categorized[feature].append(test_name)
                break
    return categorized


def calculate_support_score(feature_support, feature_order):
    """Calculate a numeric support score for ranking models.

    Each status is weighted by the position of the feature in the
    feature_order list (earlier features get higher weight).
    """
    score = 0
    max_features = len(feature_order)
    for feature, status in feature_support.items():
        # Get position weight (earlier features get higher weight)
        if feature in feature_order:
            position_weight = (max_features - feature_order.index(feature)) / max_features
        else:
            position_weight = 0.5  # Default weight for unmapped features
        if status == "✅":  # Full support
            score += 10 * position_weight
        elif status == "⚠️":  # Partial support - weighted by column position
            score += 5 * position_weight
        elif status == "❌":  # No support
            score += 1 * position_weight
        # Unknown (❓) gets 0 points
    return score
def calculate_provider_support_score(models_data, feature_order):
"""Calculate a provider-level support score based on all models' support scores."""
if not models_data:
return 0
# Calculate the average support score across all models in the provider
total_score = sum(model["support_score"] for model in models_data)
return total_score / len(models_data)
def get_test_function_line_numbers(test_file_path):
"""Extract line numbers for test functions from the test file."""
test_line_numbers = {}
try:
with open(test_file_path, "r") as f:
lines = f.readlines()
for i, line in enumerate(lines, 1):
if "def test_" in line and line.strip().startswith("def test_"):
# Extract function name
func_name = line.strip().split("def ")[1].split("(")[0]
test_line_numbers[func_name] = i
except FileNotFoundError:
print(f"Warning: Could not find test file at {test_file_path}")
return test_line_numbers
def get_github_repo_info():
"""Get GitHub repository information from git remote."""
try:
# Try to get the GitHub repo URL from git remote
import subprocess
result = subprocess.run(["git", "remote", "get-url", "origin"], capture_output=True, text=True, cwd=os.path.dirname(__file__))
if result.returncode == 0:
remote_url = result.stdout.strip()
# Parse GitHub URL
if "github.com" in remote_url:
if remote_url.startswith("https://"):
# https://github.com/user/repo.git -> user/repo
repo_path = remote_url.replace("https://github.com/", "").replace(".git", "")
elif remote_url.startswith("git@"):
# git@github.com:user/repo.git -> user/repo
repo_path = remote_url.split(":")[1].replace(".git", "")
else:
return None
return repo_path
except Exception:
pass
# Default fallback
return "letta-ai/letta"
def generate_test_details(model_info, feature_mapping):
"""Generate detailed test results for a model."""
details = []
# Get test function line numbers
script_dir = os.path.dirname(os.path.abspath(__file__))
test_file_path = os.path.join(script_dir, "model_sweep.py")
test_line_numbers = get_test_function_line_numbers(test_file_path)
# Use the main branch GitHub URL
base_github_url = "https://github.com/letta-ai/letta/blob/main/.github/scripts/model-sweep/model_sweep.py"
for feature, tests in model_info["categorized_tests"].items():
if not tests:
continue
details.append(f"### {feature}")
details.append("")
for test in sorted(tests):
if test in model_info["passed_tests"]:
status = "✅"
elif test in model_info["failed_tests"]:
status = "❌"
else:
status = "❓"
# Create GitHub link if we have line number info
if test in test_line_numbers:
line_num = test_line_numbers[test]
github_link = f"{base_github_url}#L{line_num}"
details.append(f"- {status} [`{test}`]({github_link})")
else:
details.append(f"- {status} `{test}`")
details.append("")
return details
def calculate_column_widths(all_provider_data, feature_mapping):
"""Calculate the maximum width needed for each column across all providers."""
widths = {"model": len("Model"), "context_window": len("Context Window"), "last_scanned": len("Last Scanned")}
# Feature column widths
for feature in feature_mapping.keys():
widths[feature] = len(feature)
# Check all model data for maximum widths
for provider_data in all_provider_data.values():
for model_info in provider_data:
# Model name width (including backticks)
model_width = len(f"`{model_info['name']}`")
widths["model"] = max(widths["model"], model_width)
# Context window width (with commas)
context_width = len(f"{model_info['context_window']:,}")
widths["context_window"] = max(widths["context_window"], context_width)
# Last scanned width
widths["last_scanned"] = max(widths["last_scanned"], len(str(model_info["last_scanned"])))
# Feature support symbols are always 2 chars, so no need to check
return widths
def process_model_sweep_report(input_file, output_file, config_file=None, debug=False):
"""Convert model sweep JSON data to MDX report."""
# Load feature mappings from config file
feature_mapping = load_feature_mappings(config_file)
# if debug:
# print("DEBUG: Feature mappings loaded:")
# for feature, tests in feature_mapping.items():
# print(f" {feature}: {tests}")
# print()
# Read the JSON data
with open(input_file, "r") as f:
data = json.load(f)
tests = data.get("tests", [])
# if debug:
# print("DEBUG: Tests loaded:")
# print([test['outcome'] for test in tests if 'haiku' in test['nodeid']])
# Calculate summary statistics
providers = set(test["metadata"]["llm_config"]["provider_name"] for test in tests)
models = set(test["metadata"]["llm_config"]["model"] for test in tests)
total_tests = len(tests)
# Start building the MDX
mdx_lines = [
"---",
"title: Supported Models",
f"generated: {datetime.now().isoformat()}",
"---",
"",
"# Supported Models",
"",
"## Overview",
"",
"Letta routinely runs automated scans against available providers and models. These are the results of the latest scan.",
"",
f"Ran {total_tests} tests against {len(models)} models across {len(providers)} providers on {datetime.now().strftime('%B %d, %Y')}",
"",
"",
]
# Group tests by provider
provider_groups = defaultdict(list)
for test in tests:
provider_name = test["metadata"]["llm_config"]["provider_name"]
provider_groups[provider_name].append(test)
# Process all providers first to collect model data
all_provider_data = {}
provider_support_scores = {}
for provider_name in provider_groups.keys():
provider_tests = provider_groups[provider_name]
# Group tests by model within this provider
model_groups = defaultdict(list)
for test in provider_tests:
model_name = test["metadata"]["llm_config"]["model"]
model_groups[model_name].append(test)
# Process all models to calculate support scores for ranking
model_data = []
for model_name in model_groups.keys():
model_tests = model_groups[model_name]
# if debug:
# print(f"DEBUG: Processing model '{model_name}' in provider '{provider_name}'")
# Extract unique test names for passed and failed tests
passed_tests = set()
failed_tests = set()
all_test_names = set()
for test in model_tests:
# Extract test name from nodeid (split on :: and [)
test_name = test["nodeid"].split("::")[1].split("[")[0]
all_test_names.add(test_name)
# if debug:
# print(f" Test name: {test_name}")
# print(f" Outcome: {test}")
if test["outcome"] == "passed":
passed_tests.add(test_name)
elif test["outcome"] == "failed":
failed_tests.add(test_name)
# if debug:
# print(f" All test names found: {sorted(all_test_names)}")
# print(f" Passed tests: {sorted(passed_tests)}")
# print(f" Failed tests: {sorted(failed_tests)}")
# Categorize tests into features
categorized_tests = categorize_tests(all_test_names, feature_mapping)
# if debug:
# print(f" Categorized tests:")
# for feature, tests in categorized_tests.items():
# print(f" {feature}: {tests}")
# Determine support status for each feature
feature_support = {}
for feature_name in feature_mapping.keys():
feature_support[feature_name] = get_support_status(passed_tests, categorized_tests[feature_name])
# if debug:
# print(f" Feature support:")
# for feature, status in feature_support.items():
# print(f" {feature}: {status}")
# print()
# Get context window and last scanned time
context_window = model_tests[0]["metadata"]["llm_config"]["context_window"]
# Try to get time_last_scanned from metadata, fallback to current time
try:
last_scanned = model_tests[0]["metadata"].get(
"time_last_scanned", model_tests[0]["metadata"].get("timestamp", datetime.now().isoformat())
)
# Format timestamp if it's a full ISO string
if "T" in str(last_scanned):
last_scanned = str(last_scanned).split("T")[0] # Just the date part
except Exception:
last_scanned = "Unknown"
# Calculate support score for ranking
feature_order = list(feature_mapping.keys())
support_score = calculate_support_score(feature_support, feature_order)
# Store model data for sorting
model_data.append(
{
"name": model_name,
"feature_support": feature_support,
"context_window": context_window,
"last_scanned": last_scanned,
"support_score": support_score,
"failed_tests": failed_tests,
"passed_tests": passed_tests,
"categorized_tests": categorized_tests,
}
)
# Sort models by support score (descending) then by name (ascending)
model_data.sort(key=lambda x: (-x["support_score"], x["name"]))
# Store provider data
all_provider_data[provider_name] = model_data
provider_support_scores[provider_name] = calculate_provider_support_score(model_data, list(feature_mapping.keys()))
# Calculate column widths for consistent formatting (add details column)
column_widths = calculate_column_widths(all_provider_data, feature_mapping)
column_widths["details"] = len("Details")
# Sort providers by support score (descending) then by name (ascending)
sorted_providers = sorted(provider_support_scores.keys(), key=lambda x: (-provider_support_scores[x], x))
# Generate tables for all providers first
for provider_name in sorted_providers:
model_data = all_provider_data[provider_name]
support_score = provider_support_scores[provider_name]
# Create dynamic headers with proper padding and centering
feature_names = list(feature_mapping.keys())
# Build header row with left-aligned first column, centered others
header_parts = [f"{'Model':<{column_widths['model']}}"]
for feature in feature_names:
header_parts.append(f"{feature:^{column_widths[feature]}}")
header_parts.extend(
[
f"{'Context Window':^{column_widths['context_window']}}",
f"{'Last Scanned':^{column_widths['last_scanned']}}",
f"{'Details':^{column_widths['details']}}",
]
)
header_row = "| " + " | ".join(header_parts) + " |"
# Build separator row with left-aligned first column, centered others
separator_parts = [f"{'-' * column_widths['model']}"]
for feature in feature_names:
separator_parts.append(f":{'-' * (column_widths[feature] - 2)}:")
separator_parts.extend(
[
f":{'-' * (column_widths['context_window'] - 2)}:",
f":{'-' * (column_widths['last_scanned'] - 2)}:",
f":{'-' * (column_widths['details'] - 2)}:",
]
)
separator_row = "|" + "|".join(separator_parts) + "|"
# Add provider section without percentage
mdx_lines.extend([f"## {provider_name}", "", header_row, separator_row])
# Generate table rows for sorted models with proper padding
for model_info in model_data:
# Create anchor for model details
model_anchor = model_info["name"].replace("/", "_").replace(":", "_").replace("-", "_").lower()
details_anchor = f"{provider_name.lower().replace(' ', '_')}_{model_anchor}_details"
# Build row with left-aligned first column, centered others
row_parts = [f"`{model_info['name']}`".ljust(column_widths["model"])]
for feature in feature_names:
row_parts.append(f"{model_info['feature_support'][feature]:^{column_widths[feature]}}")
row_parts.extend(
[
f"{model_info['context_window']:,}".center(column_widths["context_window"]),
f"{model_info['last_scanned']}".center(column_widths["last_scanned"]),
f"[View](#{details_anchor})".center(column_widths["details"]),
]
)
row = "| " + " | ".join(row_parts) + " |"
mdx_lines.append(row)
# Add spacing between provider tables
mdx_lines.extend(["", ""])
# Add detailed test results section after all tables
mdx_lines.extend(["---", "", "# Detailed Test Results", ""])
for provider_name in sorted_providers:
model_data = all_provider_data[provider_name]
mdx_lines.extend([f"## {provider_name}", ""])
for model_info in model_data:
model_anchor = model_info["name"].replace("/", "_").replace(":", "_").replace("-", "_").lower()
details_anchor = f"{provider_name.lower().replace(' ', '_')}_{model_anchor}_details"
mdx_lines.append(f'<a id="{details_anchor}"></a>')
mdx_lines.append(f"### {model_info['name']}")
mdx_lines.append("")
# Add test details
test_details = generate_test_details(model_info, feature_mapping)
mdx_lines.extend(test_details)
# Add spacing between providers in details section
mdx_lines.extend(["", ""])
# Write the MDX file
with open(output_file, "w") as f:
f.write("\n".join(mdx_lines))
print(f"Model sweep report saved to {output_file}")
def main():
input_file = "model_sweep_report.json"
output_file = "model_sweep_report.mdx"
config_file = None
debug = False
# Allow command line arguments
if len(sys.argv) > 1:
# Use the file located in the same directory as this script
script_dir = os.path.dirname(os.path.abspath(__file__))
input_file = os.path.join(script_dir, sys.argv[1])
if len(sys.argv) > 2:
# Use the file located in the same directory as this script
script_dir = os.path.dirname(os.path.abspath(__file__))
output_file = os.path.join(script_dir, sys.argv[2])
if len(sys.argv) > 3:
config_file = sys.argv[3]
if len(sys.argv) > 4 and sys.argv[4] == "--debug":
debug = True
try:
process_model_sweep_report(input_file, output_file, config_file, debug)
except FileNotFoundError:
print(f"Error: Could not find input file '{input_file}'")
sys.exit(1)
except json.JSONDecodeError:
print(f"Error: Invalid JSON in file '{input_file}'")
sys.exit(1)
except Exception as e:
print(f"Error: {e}")
sys.exit(1)
if __name__ == "__main__":
main()
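
`get_test_function_line_numbers` only records lines whose stripped text starts with `def test_`, so an `async def test_...` definition would be skipped and get no source link. A sketch of an alternative built on the standard-library `ast` module follows. The name `find_test_line_numbers` is hypothetical, and it takes source text rather than a file path to keep the example self-contained.

```python
import ast


def find_test_line_numbers(source: str) -> dict[str, int]:
    """Map each test function name to the 1-based line of its definition."""
    tree = ast.parse(source)
    return {
        node.name: node.lineno
        for node in ast.walk(tree)
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef))
        and node.name.startswith("test_")
    }


sample = (
    "import pytest\n"
    "\n"
    "def test_sync():\n"
    "    pass\n"
    "\n"
    "async def test_async():\n"
    "    pass\n"
    "\n"
    "def helper():\n"
    "    pass\n"
)
line_numbers = find_test_line_numbers(sample)
```

Parsing the file also avoids false matches on `def test_` text inside strings or comments, at the cost of failing on files that do not parse.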


@@ -1,786 +0,0 @@
import base64
import json
import os
import socket
import threading
import time
import uuid
from typing import Any, Dict, List
import httpx
import pytest
import requests
from dotenv import load_dotenv
from letta_client import Letta, MessageCreate, Run
from letta_client.core.api_error import ApiError
from letta_client.types import (
AssistantMessage,
Base64Image,
ImageContent,
LettaUsageStatistics,
ReasoningMessage,
TextContent,
ToolCallMessage,
ToolReturnMessage,
UrlImage,
UserMessage,
)
from letta.schemas.agent import AgentState
from letta.schemas.llm_config import LLMConfig
# ------------------------------
# Helper Functions and Constants
# ------------------------------
def get_llm_config(filename: str, llm_config_dir: str = "tests/configs/llm_model_configs") -> LLMConfig:
    """Load an LLMConfig from a JSON file in llm_config_dir."""
    path = os.path.join(llm_config_dir, filename)
    # Use a context manager so the file handle is closed promptly.
    with open(path, "r") as f:
        config_data = json.load(f)
    return LLMConfig(**config_data)
def roll_dice(num_sides: int) -> int:
"""
Returns a random number between 1 and num_sides.
Args:
num_sides (int): The number of sides on the die.
Returns:
int: A random integer between 1 and num_sides, representing the die roll.
"""
import random
return random.randint(1, num_sides)
USER_MESSAGE_OTID = str(uuid.uuid4())
USER_MESSAGE_RESPONSE: str = "Teamwork makes the dream work"
USER_MESSAGE_FORCE_REPLY: List[MessageCreate] = [
MessageCreate(
role="user",
content=f"This is an automated test message. Call the send_message tool with the message '{USER_MESSAGE_RESPONSE}'.",
otid=USER_MESSAGE_OTID,
)
]
USER_MESSAGE_ROLL_DICE: List[MessageCreate] = [
MessageCreate(
role="user",
content="This is an automated test message. Call the roll_dice tool with 16 sides and tell me the outcome.",
otid=USER_MESSAGE_OTID,
)
]
URL_IMAGE = "https://upload.wikimedia.org/wikipedia/commons/a/a7/Camponotus_flavomarginatus_ant.jpg"
USER_MESSAGE_URL_IMAGE: List[MessageCreate] = [
MessageCreate(
role="user",
content=[
ImageContent(source=UrlImage(url=URL_IMAGE)),
TextContent(text="What is in this image?"),
],
otid=USER_MESSAGE_OTID,
)
]
BASE64_IMAGE = base64.standard_b64encode(httpx.get(URL_IMAGE).content).decode("utf-8")
USER_MESSAGE_BASE64_IMAGE: List[MessageCreate] = [
MessageCreate(
role="user",
content=[
ImageContent(source=Base64Image(data=BASE64_IMAGE, media_type="image/jpeg")),
TextContent(text="What is in this image?"),
],
otid=USER_MESSAGE_OTID,
)
]
all_configs = [
"openai-gpt-4o-mini.json",
# "azure-gpt-4o-mini.json", # TODO: Re-enable on new agent loop
"claude-3-5-sonnet.json",
"claude-4-sonnet-extended.json",
"claude-3-7-sonnet-extended.json",
"gemini-1.5-pro.json",
"gemini-2.5-flash-vertex.json",
"gemini-2.5-pro-vertex.json",
"together-qwen-2.5-72b-instruct.json",
"ollama.json",
]
requested = os.getenv("LLM_CONFIG_FILE")
filenames = [requested] if requested else all_configs
TESTED_LLM_CONFIGS: List[LLMConfig] = [get_llm_config(fn) for fn in filenames]
def assert_greeting_with_assistant_message_response(
messages: List[Any],
streaming: bool = False,
token_streaming: bool = False,
from_db: bool = False,
) -> None:
"""
Asserts that the messages list follows the expected sequence:
ReasoningMessage -> AssistantMessage.
"""
expected_message_count = 3 if streaming or from_db else 2
assert len(messages) == expected_message_count
index = 0
if from_db:
assert isinstance(messages[index], UserMessage)
assert messages[index].otid == USER_MESSAGE_OTID
index += 1
# Agent Step 1
assert isinstance(messages[index], ReasoningMessage)
assert messages[index].otid and messages[index].otid[-1] == "0"
index += 1
assert isinstance(messages[index], AssistantMessage)
if not token_streaming:
assert USER_MESSAGE_RESPONSE in messages[index].content
assert messages[index].otid and messages[index].otid[-1] == "1"
index += 1
if streaming:
assert isinstance(messages[index], LettaUsageStatistics)
assert messages[index].prompt_tokens > 0
assert messages[index].completion_tokens > 0
assert messages[index].total_tokens > 0
assert messages[index].step_count > 0
def assert_greeting_without_assistant_message_response(
messages: List[Any],
streaming: bool = False,
token_streaming: bool = False,
from_db: bool = False,
) -> None:
"""
Asserts that the messages list follows the expected sequence:
ReasoningMessage -> ToolCallMessage -> ToolReturnMessage.
"""
expected_message_count = 4 if streaming or from_db else 3
assert len(messages) == expected_message_count
index = 0
if from_db:
assert isinstance(messages[index], UserMessage)
assert messages[index].otid == USER_MESSAGE_OTID
index += 1
# Agent Step 1
assert isinstance(messages[index], ReasoningMessage)
assert messages[index].otid and messages[index].otid[-1] == "0"
index += 1
assert isinstance(messages[index], ToolCallMessage)
assert messages[index].tool_call.name == "send_message"
if not token_streaming:
assert USER_MESSAGE_RESPONSE in messages[index].tool_call.arguments
assert messages[index].otid and messages[index].otid[-1] == "1"
index += 1
# Agent Step 2
assert isinstance(messages[index], ToolReturnMessage)
assert messages[index].otid and messages[index].otid[-1] == "0"
index += 1
if streaming:
assert isinstance(messages[index], LettaUsageStatistics)
def assert_tool_call_response(
messages: List[Any],
streaming: bool = False,
from_db: bool = False,
) -> None:
"""
Asserts that the messages list follows the expected sequence:
ReasoningMessage -> ToolCallMessage -> ToolReturnMessage ->
ReasoningMessage -> AssistantMessage.
"""
expected_message_count = 6 if streaming else 7 if from_db else 5
assert len(messages) == expected_message_count
index = 0
if from_db:
assert isinstance(messages[index], UserMessage)
assert messages[index].otid == USER_MESSAGE_OTID
index += 1
# Agent Step 1
assert isinstance(messages[index], ReasoningMessage)
assert messages[index].otid and messages[index].otid[-1] == "0"
index += 1
assert isinstance(messages[index], ToolCallMessage)
assert messages[index].otid and messages[index].otid[-1] == "1"
index += 1
# Agent Step 2
assert isinstance(messages[index], ToolReturnMessage)
assert messages[index].otid and messages[index].otid[-1] == "0"
index += 1
# Hidden User Message
if from_db:
assert isinstance(messages[index], UserMessage)
assert "request_heartbeat=true" in messages[index].content
index += 1
# Agent Step 3
assert isinstance(messages[index], ReasoningMessage)
assert messages[index].otid and messages[index].otid[-1] == "0"
index += 1
assert isinstance(messages[index], AssistantMessage)
assert messages[index].otid and messages[index].otid[-1] == "1"
index += 1
if streaming:
assert isinstance(messages[index], LettaUsageStatistics)
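# The chained conditional in assert_tool_call_response packs three cases into one line.
# A minimal sketch of the same counts spelled out; `expected_tool_call_count` is a
# hypothetical helper name, not part of the test module.

```python
def expected_tool_call_count(streaming: bool = False, from_db: bool = False) -> int:
    """Return the message count assert_tool_call_response expects for each mode."""
    if streaming:
        # Five agent messages plus the trailing usage statistics chunk.
        return 6
    if from_db:
        # Five agent messages plus the leading user message and the hidden
        # heartbeat user message that only the database listing returns.
        return 7
    # Plain (non-streaming) response: the five agent messages only.
    return 5
```

# As in the original expression, `streaming` takes precedence when both flags are set.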
def assert_image_input_response(
messages: List[Any],
streaming: bool = False,
token_streaming: bool = False,
from_db: bool = False,
) -> None:
"""
Asserts that the messages list follows the expected sequence:
ReasoningMessage -> AssistantMessage.
"""
expected_message_count = 3 if streaming or from_db else 2
assert len(messages) == expected_message_count
index = 0
if from_db:
assert isinstance(messages[index], UserMessage)
assert messages[index].otid == USER_MESSAGE_OTID
index += 1
# Agent Step 1
assert isinstance(messages[index], ReasoningMessage)
assert messages[index].otid and messages[index].otid[-1] == "0"
index += 1
assert isinstance(messages[index], AssistantMessage)
assert messages[index].otid and messages[index].otid[-1] == "1"
index += 1
if streaming:
assert isinstance(messages[index], LettaUsageStatistics)
assert messages[index].prompt_tokens > 0
assert messages[index].completion_tokens > 0
assert messages[index].total_tokens > 0
assert messages[index].step_count > 0
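# The assertion helpers above all walk an expected sequence of message types by hand.
# A minimal sketch of a table-driven alternative follows. Assumptions: every message
# exposes a `message_type` string (as chunks do in accumulate_chunks), and the strings
# "user_message" and "usage_statistics" are illustrative guesses; only
# "reasoning_message" and "assistant_message" appear in this file.

```python
from typing import Any, List, Sequence


def expected_greeting_types(streaming: bool = False, from_db: bool = False) -> List[str]:
    """Build the expected type sequence for a greeting answered with an assistant message."""
    types = ["reasoning_message", "assistant_message"]
    if from_db:
        # The database listing also returns the user message that started the turn.
        types.insert(0, "user_message")
    if streaming:
        # A stream ends with a usage statistics chunk.
        types.append("usage_statistics")
    return types


def assert_message_types(messages: List[Any], expected: Sequence[str]) -> None:
    """Compare the observed message_type sequence against the expected one."""
    actual = [getattr(m, "message_type", None) for m in messages]
    assert actual == list(expected), f"expected {list(expected)}, got {actual}"
```

# The tests in this file never set `streaming` and `from_db` together, so the sketch
# does not try to reproduce the count for that combination.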
def accumulate_chunks(chunks: List[Any]) -> List[Any]:
"""
Accumulates chunks into a list of messages.
"""
messages = []
current_message = None
prev_message_type = None
for chunk in chunks:
current_message_type = chunk.message_type
if prev_message_type != current_message_type:
messages.append(current_message)
current_message = None
if current_message is None:
current_message = chunk
else:
pass # TODO: actually accumulate the chunks. For now we only care about the count
prev_message_type = current_message_type
messages.append(current_message)
return [m for m in messages if m is not None]
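# accumulate_chunks keeps only the first chunk of each run (see the TODO). A minimal
# sketch of what merging the content could look like, assuming chunks carry an optional
# text `content` field. `Chunk` is a hypothetical stand-in, not a Letta type.

```python
from dataclasses import dataclass, replace
from typing import List, Optional


@dataclass(frozen=True)
class Chunk:
    """Hypothetical stand-in for a streamed chunk."""

    message_type: str
    content: Optional[str] = None


def accumulate_text_chunks(chunks: List[Chunk]) -> List[Chunk]:
    """Merge consecutive chunks of the same message_type, joining their content."""
    merged: List[Chunk] = []
    for chunk in chunks:
        if merged and merged[-1].message_type == chunk.message_type:
            # Same run: append this chunk's text to the message being built.
            last = merged[-1]
            joined = (last.content or "") + (chunk.content or "")
            merged[-1] = replace(last, content=joined)
        else:
            # The message type changed, so a new message starts here.
            merged.append(chunk)
    return merged
```

# Real chunks also carry tool call deltas and ids, which this sketch ignores.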
def wait_for_run_completion(client: Letta, run_id: str, timeout: float = 30.0, interval: float = 0.5) -> Run:
start = time.time()
while True:
run = client.runs.retrieve(run_id)
if run.status == "completed":
return run
if run.status == "failed":
raise RuntimeError(f"Run {run_id} failed: status = {run.status}")
if time.time() - start > timeout:
raise TimeoutError(f"Run {run_id} did not complete within {timeout} seconds (last status: {run.status})")
time.sleep(interval)
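# wait_for_run_completion is a poll-until-terminal loop. A minimal sketch of the same
# pattern, decoupled from the client so it can be exercised without a server.
# Assumptions: `poll_until` and `fetch_status` are hypothetical names, and
# time.monotonic() is swapped in for time.time() so wall-clock adjustments cannot
# distort the timeout.

```python
import time
from typing import Callable


def poll_until(fetch_status: Callable[[], str], timeout: float = 30.0, interval: float = 0.5) -> str:
    """Call fetch_status() until it reports a terminal state or the deadline passes."""
    deadline = time.monotonic() + timeout
    while True:
        status = fetch_status()
        if status == "completed":
            return status
        if status == "failed":
            raise RuntimeError("run failed")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"still {status!r} after {timeout} seconds")
        time.sleep(interval)
```

# With a real client, fetch_status could be `lambda: client.runs.retrieve(run_id).status`.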
def assert_tool_response_dict_messages(messages: List[Dict[str, Any]]) -> None:
"""
Asserts that a list of message dictionaries starts with the expected message types.
Expected order:
1. reasoning_message
2. assistant_message
"""
assert isinstance(messages, list)
assert messages[0]["message_type"] == "reasoning_message"
assert messages[1]["message_type"] == "assistant_message"
# ------------------------------
# Test Cases
# ------------------------------
# def test_that_ci_workflow_works(
# disable_e2b_api_key: Any,
# client: Letta,
# agent_state: AgentState,
# llm_config: LLMConfig,
# json_metadata: pytest.FixtureRequest,
# ) -> None:
# """
# Tests that the CI workflow works.
# """
# json_metadata["test_type"] = "debug"
def test_greeting_with_assistant_message(
disable_e2b_api_key: Any,
client: Letta,
agent_state: AgentState,
llm_config: LLMConfig,
json_metadata: pytest.FixtureRequest,
) -> None:
"""
Tests sending a message with a synchronous client.
Verifies that the response messages follow the expected order.
"""
json_metadata["llm_config"] = dict(llm_config)
last_message = client.agents.messages.list(agent_id=agent_state.id, limit=1)
agent_state = client.agents.modify(agent_id=agent_state.id, llm_config=llm_config)
response = client.agents.messages.create(
agent_id=agent_state.id,
messages=USER_MESSAGE_FORCE_REPLY,
)
assert_greeting_with_assistant_message_response(response.messages)
messages_from_db = client.agents.messages.list(agent_id=agent_state.id, after=last_message[0].id)
assert_greeting_with_assistant_message_response(messages_from_db, from_db=True)
def test_greeting_without_assistant_message(
disable_e2b_api_key: Any,
client: Letta,
llm_config: LLMConfig,
agent_state: AgentState,
json_metadata: pytest.FixtureRequest,
) -> None:
"""
Tests sending a message with a synchronous client.
Verifies that the response messages follow the expected order.
"""
json_metadata["llm_config"] = dict(llm_config)
last_message = client.agents.messages.list(agent_id=agent_state.id, limit=1)
agent_state = client.agents.modify(agent_id=agent_state.id, llm_config=llm_config)
response = client.agents.messages.create(
agent_id=agent_state.id,
messages=USER_MESSAGE_FORCE_REPLY,
use_assistant_message=False,
)
assert_greeting_without_assistant_message_response(response.messages)
messages_from_db = client.agents.messages.list(agent_id=agent_state.id, after=last_message[0].id, use_assistant_message=False)
assert_greeting_without_assistant_message_response(messages_from_db, from_db=True)
def test_tool_call(
disable_e2b_api_key: Any,
client: Letta,
llm_config: LLMConfig,
agent_state: AgentState,
json_metadata: pytest.FixtureRequest,
) -> None:
"""
Tests sending a message that triggers a tool call with a synchronous client.
Verifies that the response messages follow the expected order.
"""
json_metadata["llm_config"] = dict(llm_config)
dice_tool = client.tools.upsert_from_function(func=roll_dice)
client.agents.tools.attach(agent_id=agent_state.id, tool_id=dice_tool.id)
last_message = client.agents.messages.list(agent_id=agent_state.id, limit=1)
agent_state = client.agents.modify(agent_id=agent_state.id, llm_config=llm_config)
response = client.agents.messages.create(
agent_id=agent_state.id,
messages=USER_MESSAGE_ROLL_DICE,
)
assert_tool_call_response(response.messages)
messages_from_db = client.agents.messages.list(agent_id=agent_state.id, after=last_message[0].id)
assert_tool_call_response(messages_from_db, from_db=True)
def test_url_image_input(
disable_e2b_api_key: Any,
client: Letta,
llm_config: LLMConfig,
agent_state: AgentState,
json_metadata: pytest.FixtureRequest,
) -> None:
"""
Tests sending a message with a URL image input using a synchronous client.
Verifies that the response messages follow the expected order.
"""
json_metadata["llm_config"] = dict(llm_config)
last_message = client.agents.messages.list(agent_id=agent_state.id, limit=1)
agent_state = client.agents.modify(agent_id=agent_state.id, llm_config=llm_config)
response = client.agents.messages.create(
agent_id=agent_state.id,
messages=USER_MESSAGE_URL_IMAGE,
)
assert_image_input_response(response.messages)
messages_from_db = client.agents.messages.list(agent_id=agent_state.id, after=last_message[0].id)
assert_image_input_response(messages_from_db, from_db=True)
def test_base64_image_input(
disable_e2b_api_key: Any,
client: Letta,
llm_config: LLMConfig,
agent_state: AgentState,
json_metadata: pytest.FixtureRequest,
) -> None:
"""
Tests sending a message with a base64 image input using a synchronous client.
Verifies that the response messages follow the expected order.
"""
json_metadata["llm_config"] = dict(llm_config)
last_message = client.agents.messages.list(agent_id=agent_state.id, limit=1)
agent_state = client.agents.modify(agent_id=agent_state.id, llm_config=llm_config)
response = client.agents.messages.create(
agent_id=agent_state.id,
messages=USER_MESSAGE_BASE64_IMAGE,
)
assert_image_input_response(response.messages)
messages_from_db = client.agents.messages.list(agent_id=agent_state.id, after=last_message[0].id)
assert_image_input_response(messages_from_db, from_db=True)
def test_agent_loop_error(
disable_e2b_api_key: Any,
client: Letta,
llm_config: LLMConfig,
agent_state: AgentState,
json_metadata: pytest.FixtureRequest,
) -> None:
"""
Tests sending a message with a synchronous client.
Verifies that no new messages are persisted on error.
"""
json_metadata["llm_config"] = dict(llm_config)
last_message = client.agents.messages.list(agent_id=agent_state.id, limit=1)
tools = agent_state.tools
agent_state = client.agents.modify(agent_id=agent_state.id, llm_config=llm_config, tool_ids=[])
with pytest.raises(ApiError):
client.agents.messages.create(
agent_id=agent_state.id,
messages=USER_MESSAGE_FORCE_REPLY,
)
messages_from_db = client.agents.messages.list(agent_id=agent_state.id, after=last_message[0].id)
assert len(messages_from_db) == 0
client.agents.modify(agent_id=agent_state.id, tool_ids=[t.id for t in tools])
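# test_agent_loop_error restores the agent's tools on its last line, so a failed
# assertion leaves the shared agent without tools. A minimal sketch of a try/finally
# wrapper that restores them either way. Assumptions: `tools_detached` and
# `set_tool_ids` are hypothetical names; in the test, `set_tool_ids` would wrap
# client.agents.modify(agent_id=..., tool_ids=...).

```python
from contextlib import contextmanager
from typing import Callable, Iterator, List


@contextmanager
def tools_detached(set_tool_ids: Callable[[List[str]], None], original_ids: List[str]) -> Iterator[None]:
    """Detach all tools for the duration of the block, then restore the original ids."""
    set_tool_ids([])
    try:
        yield
    finally:
        # Runs on success, on assertion failure, and on any other exception.
        set_tool_ids(list(original_ids))
```

# Usage would look like `with tools_detached(setter, [t.id for t in tools]): ...`,
# with the pytest.raises block and the database assertions inside the `with` body.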
def test_step_streaming_greeting_with_assistant_message(
disable_e2b_api_key: Any,
client: Letta,
llm_config: LLMConfig,
agent_state: AgentState,
json_metadata: pytest.FixtureRequest,
) -> None:
"""
Tests sending a streaming message with a synchronous client.
Checks that each chunk in the stream has the correct message types.
"""
json_metadata["llm_config"] = dict(llm_config)
last_message = client.agents.messages.list(agent_id=agent_state.id, limit=1)
agent_state = client.agents.modify(agent_id=agent_state.id, llm_config=llm_config)
response = client.agents.messages.create_stream(
agent_id=agent_state.id,
messages=USER_MESSAGE_FORCE_REPLY,
)
chunks = list(response)
messages = accumulate_chunks(chunks)
assert_greeting_with_assistant_message_response(messages, streaming=True)
messages_from_db = client.agents.messages.list(agent_id=agent_state.id, after=last_message[0].id)
assert_greeting_with_assistant_message_response(messages_from_db, from_db=True)
def test_step_streaming_greeting_without_assistant_message(
disable_e2b_api_key: Any,
client: Letta,
llm_config: LLMConfig,
agent_state: AgentState,
json_metadata: pytest.FixtureRequest,
) -> None:
"""
Tests sending a streaming message with a synchronous client.
Checks that each chunk in the stream has the correct message types.
"""
json_metadata["llm_config"] = dict(llm_config)
last_message = client.agents.messages.list(agent_id=agent_state.id, limit=1)
agent_state = client.agents.modify(agent_id=agent_state.id, llm_config=llm_config)
response = client.agents.messages.create_stream(
agent_id=agent_state.id,
messages=USER_MESSAGE_FORCE_REPLY,
use_assistant_message=False,
)
chunks = list(response)
messages = accumulate_chunks(chunks)
assert_greeting_without_assistant_message_response(messages, streaming=True)
messages_from_db = client.agents.messages.list(agent_id=agent_state.id, after=last_message[0].id, use_assistant_message=False)
assert_greeting_without_assistant_message_response(messages_from_db, from_db=True)
def test_step_streaming_tool_call(
disable_e2b_api_key: Any,
client: Letta,
llm_config: LLMConfig,
agent_state: AgentState,
json_metadata: pytest.FixtureRequest,
) -> None:
"""
Tests sending a streaming message with a synchronous client.
Checks that each chunk in the stream has the correct message types.
"""
json_metadata["llm_config"] = dict(llm_config)
dice_tool = client.tools.upsert_from_function(func=roll_dice)
agent_state = client.agents.tools.attach(agent_id=agent_state.id, tool_id=dice_tool.id)
last_message = client.agents.messages.list(agent_id=agent_state.id, limit=1)
agent_state = client.agents.modify(agent_id=agent_state.id, llm_config=llm_config)
response = client.agents.messages.create_stream(
agent_id=agent_state.id,
messages=USER_MESSAGE_ROLL_DICE,
)
chunks = list(response)
messages = accumulate_chunks(chunks)
assert_tool_call_response(messages, streaming=True)
messages_from_db = client.agents.messages.list(agent_id=agent_state.id, after=last_message[0].id)
assert_tool_call_response(messages_from_db, from_db=True)
def test_step_stream_agent_loop_error(
disable_e2b_api_key: Any,
client: Letta,
llm_config: LLMConfig,
agent_state: AgentState,
json_metadata: pytest.FixtureRequest,
) -> None:
"""
Tests sending a streaming message with a synchronous client.
Verifies that no new messages are persisted on error.
"""
json_metadata["llm_config"] = dict(llm_config)
last_message = client.agents.messages.list(agent_id=agent_state.id, limit=1)
tools = agent_state.tools
agent_state = client.agents.modify(agent_id=agent_state.id, llm_config=llm_config, tool_ids=[])
with pytest.raises(ApiError):
response = client.agents.messages.create_stream(
agent_id=agent_state.id,
messages=USER_MESSAGE_FORCE_REPLY,
)
list(response)
messages_from_db = client.agents.messages.list(agent_id=agent_state.id, after=last_message[0].id)
assert len(messages_from_db) == 0
client.agents.modify(agent_id=agent_state.id, tool_ids=[t.id for t in tools])
def test_token_streaming_greeting_with_assistant_message(
disable_e2b_api_key: Any,
client: Letta,
llm_config: LLMConfig,
agent_state: AgentState,
json_metadata: pytest.FixtureRequest,
) -> None:
"""
Tests sending a streaming message with a synchronous client.
Checks that each chunk in the stream has the correct message types.
"""
json_metadata["llm_config"] = dict(llm_config)
last_message = client.agents.messages.list(agent_id=agent_state.id, limit=1)
agent_state = client.agents.modify(agent_id=agent_state.id, llm_config=llm_config)
response = client.agents.messages.create_stream(
agent_id=agent_state.id,
messages=USER_MESSAGE_FORCE_REPLY,
stream_tokens=True,
)
chunks = list(response)
messages = accumulate_chunks(chunks)
assert_greeting_with_assistant_message_response(messages, streaming=True, token_streaming=True)
messages_from_db = client.agents.messages.list(agent_id=agent_state.id, after=last_message[0].id)
assert_greeting_with_assistant_message_response(messages_from_db, from_db=True)
def test_token_streaming_greeting_without_assistant_message(
disable_e2b_api_key: Any,
client: Letta,
llm_config: LLMConfig,
agent_state: AgentState,
json_metadata: pytest.FixtureRequest,
) -> None:
"""
Tests sending a streaming message with a synchronous client.
Checks that each chunk in the stream has the correct message types.
"""
json_metadata["llm_config"] = dict(llm_config)
last_message = client.agents.messages.list(agent_id=agent_state.id, limit=1)
agent_state = client.agents.modify(agent_id=agent_state.id, llm_config=llm_config)
response = client.agents.messages.create_stream(
agent_id=agent_state.id,
messages=USER_MESSAGE_FORCE_REPLY,
use_assistant_message=False,
stream_tokens=True,
)
chunks = list(response)
messages = accumulate_chunks(chunks)
assert_greeting_without_assistant_message_response(messages, streaming=True, token_streaming=True)
messages_from_db = client.agents.messages.list(agent_id=agent_state.id, after=last_message[0].id, use_assistant_message=False)
assert_greeting_without_assistant_message_response(messages_from_db, from_db=True)
def test_token_streaming_tool_call(
disable_e2b_api_key: Any,
client: Letta,
llm_config: LLMConfig,
agent_state: AgentState,
json_metadata: pytest.FixtureRequest,
) -> None:
"""
Tests sending a streaming message with a synchronous client.
Checks that each chunk in the stream has the correct message types.
"""
json_metadata["llm_config"] = dict(llm_config)
dice_tool = client.tools.upsert_from_function(func=roll_dice)
agent_state = client.agents.tools.attach(agent_id=agent_state.id, tool_id=dice_tool.id)
last_message = client.agents.messages.list(agent_id=agent_state.id, limit=1)
agent_state = client.agents.modify(agent_id=agent_state.id, llm_config=llm_config)
response = client.agents.messages.create_stream(
agent_id=agent_state.id,
messages=USER_MESSAGE_ROLL_DICE,
stream_tokens=True,
)
chunks = list(response)
messages = accumulate_chunks(chunks)
assert_tool_call_response(messages, streaming=True)
messages_from_db = client.agents.messages.list(agent_id=agent_state.id, after=last_message[0].id)
assert_tool_call_response(messages_from_db, from_db=True)
def test_token_streaming_agent_loop_error(
disable_e2b_api_key: Any,
client: Letta,
llm_config: LLMConfig,
agent_state: AgentState,
json_metadata: pytest.FixtureRequest,
) -> None:
"""
Tests sending a token-streaming message with a synchronous client.
Verifies that no new messages are persisted on error.
"""
json_metadata["llm_config"] = dict(llm_config)
last_message = client.agents.messages.list(agent_id=agent_state.id, limit=1)
tools = agent_state.tools
agent_state = client.agents.modify(agent_id=agent_state.id, llm_config=llm_config, tool_ids=[])
try:
response = client.agents.messages.create_stream(
agent_id=agent_state.id,
messages=USER_MESSAGE_FORCE_REPLY,
stream_tokens=True,
)
list(response)
except Exception:
pass # only some models throw an error TODO: make this consistent
messages_from_db = client.agents.messages.list(agent_id=agent_state.id, after=last_message[0].id)
assert len(messages_from_db) == 0
client.agents.modify(agent_id=agent_state.id, tool_ids=[t.id for t in tools])
def test_async_greeting_with_assistant_message(
disable_e2b_api_key: Any,
client: Letta,
llm_config: LLMConfig,
agent_state: AgentState,
json_metadata: pytest.FixtureRequest,
) -> None:
"""
Tests sending a message as an asynchronous job using the synchronous client.
Waits for job completion and asserts that the result messages are as expected.
"""
json_metadata["llm_config"] = dict(llm_config)
client.agents.modify(agent_id=agent_state.id, llm_config=llm_config)
run = client.agents.messages.create_async(
agent_id=agent_state.id,
messages=USER_MESSAGE_FORCE_REPLY,
)
run = wait_for_run_completion(client, run.id)
result = run.metadata.get("result")
assert result is not None, "Run metadata missing 'result' key"
messages = result["messages"]
assert_tool_response_dict_messages(messages)
def test_auto_summarize(
disable_e2b_api_key: Any,
client: Letta,
llm_config: LLMConfig,
json_metadata: pytest.FixtureRequest,
) -> None:
"""Test that summarization is automatically triggered."""
json_metadata["llm_config"] = dict(llm_config)
# pydantic prevents us from overriding the context window parameter in the passed LLMConfig
new_llm_config = llm_config.model_dump()
new_llm_config["context_window"] = 3000
pinned_context_window_llm_config = LLMConfig(**new_llm_config)
send_message_tool = client.tools.list(name="send_message")[0]
temp_agent_state = client.agents.create(
include_base_tools=False,
tool_ids=[send_message_tool.id],
llm_config=pinned_context_window_llm_config,
embedding="letta/letta-free",
tags=["supervisor"],
)
philosophical_question = """
You know, sometimes I wonder if the entire structure of our lives is built on a series of unexamined assumptions we just silently agreed to somewhere along the way—like how we all just decided that five days a week of work and two days of “rest” constitutes balance, or how 9-to-5 became the default rhythm of a meaningful life, or even how the idea of “success” got boiled down to job titles and property ownership and productivity metrics on a LinkedIn profile, when maybe none of that is actually what makes a life feel full, or grounded, or real. And then there's the weird paradox of ambition, how we're taught to chase it like a finish line that keeps moving, constantly redefining itself right as you're about to grasp it—because even when you get the job, or the degree, or the validation, there's always something next, something more, like a treadmill with invisible settings you didn't realize were turned up all the way.
And have you noticed how we rarely stop to ask who set those definitions for us? Like was there ever a council that decided, yes, owning a home by thirty-five and retiring by sixty-five is the universal template for fulfillment? Or did it just accumulate like cultural sediment over generations, layered into us so deeply that questioning it feels uncomfortable, even dangerous? And isn't it strange that we spend so much of our lives trying to optimize things—our workflows, our diets, our sleep, our morning routines—as though the point of life is to operate more efficiently rather than to experience it more richly? We build these intricate systems, these rulebooks for being a “high-functioning” human, but where in all of that is the space for feeling lost, for being soft, for wandering without a purpose just because it's a sunny day and your heart is tugging you toward nowhere in particular?
Sometimes I lie awake at night and wonder if all the noise we wrap around ourselves—notifications, updates, performance reviews, even our internal monologues—might be crowding out the questions we were meant to live into slowly, like how to love better, or how to forgive ourselves, or what the hell we're even doing here in the first place. And when you strip it all down—no goals, no KPIs, no curated identity—what's actually left of us? Are we just a sum of the roles we perform, or is there something quieter underneath that we've forgotten how to hear?
And if there is something underneath all of it—something real, something worth listening to—then how do we begin to uncover it, gently, without rushing or reducing it to another task on our to-do list?
"""
MAX_ATTEMPTS = 10
prev_length = None
for attempt in range(MAX_ATTEMPTS):
client.agents.messages.create(
agent_id=temp_agent_state.id,
messages=[MessageCreate(role="user", content=philosophical_question)],
)
temp_agent_state = client.agents.retrieve(agent_id=temp_agent_state.id)
message_ids = temp_agent_state.message_ids
current_length = len(message_ids)
print("LENGTH OF IN_CONTEXT_MESSAGES:", current_length)
if prev_length is not None and current_length <= prev_length:
# TODO: Add more stringent checks here
print(f"Summarization was triggered: current_length {current_length} is at most prev_length {prev_length}.")
break
prev_length = current_length
else:
raise AssertionError(f"Summarization was not triggered after {MAX_ATTEMPTS} messages")
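# test_auto_summarize treats the first turn whose in-context message count does not
# grow as evidence of summarization. A minimal sketch of that detection rule as a
# pure function over the recorded lengths; `first_summarization_turn` is a
# hypothetical name, not part of the test module.

```python
from typing import Optional, Sequence


def first_summarization_turn(lengths: Sequence[int]) -> Optional[int]:
    """Return the index of the first turn whose length did not grow, or None."""
    for turn in range(1, len(lengths)):
        # Each turn appends messages, so a flat or shrinking count means the
        # context was compacted in between.
        if lengths[turn] <= lengths[turn - 1]:
            return turn
    return None
```

# Separating the rule from the client calls makes the stricter checks mentioned in
# the TODO easier to add and to unit test.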

File diff suppressed because it is too large


@@ -1,113 +0,0 @@
name: Alembic Migration Validation
on:
pull_request:
branches: [ main ]
pull_request_target:
branches: [ main ]
types: [labeled]
concurrency:
group: ${{ github.workflow }}-${{ github.event.pull_request.number || github.ref }}
cancel-in-progress: ${{ github.ref != 'refs/heads/main' }}
jobs:
changed-files:
# Run on pull_request OR on pull_request_target only when labeled "safe to test"
if: github.event_name == 'pull_request' || (github.event_name == 'pull_request_target' && contains(github.event.pull_request.labels.*.name, 'safe to test'))
runs-on: ubuntu-latest
name: changed-files
outputs:
all_changed_files: ${{ steps.changed-files.outputs.all_changed_files }}
any_changed: ${{ steps.changed-files.outputs.any_changed }}
steps:
- uses: actions/checkout@v4
with:
fetch-depth: 0
- name: Get changed files
id: changed-files
uses: tj-actions/changed-files@v44
with:
files: |
apps/core/alembic/**
.github/workflows/alembic-validation.yml
test-sqlite:
needs: [ changed-files ]
if: ${{ needs.changed-files.outputs.any_changed == 'true' }}
runs-on: [self-hosted, medium]
timeout-minutes: 15
steps:
- name: Checkout
uses: actions/checkout@v4
- name: Install dependencies
shell: bash
working-directory: apps/core
run: uv sync --no-install-project ${{ inputs.install-args || '--extra sqlite --extra external-tools --extra dev --extra cloud-tool-sandbox' }}
- name: Test alembic migration
working-directory: apps/core
run: |
uv run alembic upgrade head
# `alembic check` is skipped here; it probably does not matter for SQLite (the Postgres job runs it)
# uv run alembic check
- name: Cleanup persistent data
if: ${{ always() }}
working-directory: apps/core
run: |
echo "Cleaning up persistent data..."
sudo rm -rf ~/.letta || true
test-postgres:
needs: [ changed-files ]
if: ${{ needs.changed-files.outputs.any_changed == 'true' }}
runs-on: [self-hosted, medium]
timeout-minutes: 15
services:
postgres:
image: pgvector/pgvector:pg17
ports:
- 5432:5432
env:
POSTGRES_HOST_AUTH_METHOD: trust
POSTGRES_DB: postgres
POSTGRES_USER: postgres
options: >-
--health-cmd pg_isready
--health-interval 10s
--health-timeout 5s
--health-retries 5
steps:
- name: Checkout
uses: actions/checkout@v4
- name: Install dependencies
shell: bash
working-directory: apps/core
run: uv sync --no-install-project ${{ inputs.install-args || '--extra postgres --extra external-tools --extra dev --extra cloud-tool-sandbox' }}
- name: Test alembic migration
working-directory: apps/core
env:
LETTA_PG_PORT: 5432
LETTA_PG_USER: postgres
LETTA_PG_PASSWORD: postgres
LETTA_PG_DB: postgres
LETTA_PG_HOST: localhost
run: |
psql -h localhost -U postgres -d postgres -c 'CREATE EXTENSION IF NOT EXISTS vector;'
uv run alembic upgrade head
uv run alembic check
- name: Print docker logs if tests fail
if: ${{ failure() || cancelled() }}
run: |
echo "Printing Docker Logs..."
docker logs $(docker ps -aq --filter "ancestor=pgvector/pgvector:pg17") || true
- name: Cleanup containers and volumes
if: ${{ always() }}
run: |
echo "Cleaning up containers and volumes..."
docker stop $(docker ps -aq --filter "ancestor=pgvector/pgvector:pg17") || true
docker rm $(docker ps -aq --filter "ancestor=pgvector/pgvector:pg17") || true
docker volume prune -f || true
docker system prune -f || true
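# The `if:` guard on changed-files decides whether the workflow runs at all. A minimal
# sketch of that boolean, written in Python only to make the two accepted cases
# explicit; `should_run` is a hypothetical name and this is not how GitHub Actions
# evaluates expressions internally.

```python
from typing import Iterable


def should_run(event_name: str, labels: Iterable[str]) -> bool:
    """Mirror the guard: any pull_request, or a pull_request_target labeled 'safe to test'."""
    if event_name == "pull_request":
        return True
    # pull_request_target runs with access to secrets, so it needs the explicit label.
    return event_name == "pull_request_target" and "safe to test" in set(labels)
```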


@@ -1,22 +0,0 @@
name: Close inactive issues
on:
schedule:
- cron: "30 1 * * *"
jobs:
close-issues:
runs-on: ubuntu-latest
permissions:
issues: write
pull-requests: write
steps:
- uses: actions/stale@v5
with:
days-before-issue-stale: 30
days-before-issue-close: 14
stale-issue-label: "stale"
stale-issue-message: "This issue is stale because it has been open for 30 days with no activity."
close-issue-message: "This issue was closed because it has been inactive for 14 days since being marked as stale."
days-before-pr-stale: -1
days-before-pr-close: -1
repo-token: ${{ secrets.GITHUB_TOKEN }}
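# The stale settings above describe two consecutive waiting periods. A minimal sketch
# of the date arithmetic only, assuming no activity occurs in between;
# `stale_schedule` is a hypothetical helper, and the real action acts on its next
# scheduled run, so actual dates can be later.

```python
from datetime import date, timedelta
from typing import Tuple


def stale_schedule(last_activity: date, days_before_stale: int = 30, days_before_close: int = 14) -> Tuple[date, date]:
    """Return (date the issue becomes stale, date it becomes eligible to close)."""
    stale_on = last_activity + timedelta(days=days_before_stale)
    # The close countdown starts when the stale label is applied.
    close_on = stale_on + timedelta(days=days_before_close)
    return stale_on, close_on
```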


@@ -1,59 +0,0 @@
name: Code Style Checks
on:
push:
branches: [ main ]
pull_request:
branches: [ main ]
jobs:
style-checks:
runs-on: ubuntu-latest
strategy:
matrix:
python-version: ["3.11"] # Only the minimum supported version is tested; extend the matrix if needed
steps:
- name: Checkout code
uses: actions/checkout@v4
with:
ref: ${{ github.head_ref }} # Checkout the PR branch
fetch-depth: 0 # Fetch all history for all branches and tags
- name: Set up python
id: setup-python
uses: actions/setup-python@v5
with:
python-version: ${{ matrix.python-version }}
- name: Install uv
uses: astral-sh/setup-uv@v6
with:
enable-cache: true
activate-environment: true
- name: Install Dependencies
run: |
uv sync --extra dev --extra postgres --extra external-tools
- name: Validate PR Title
if: github.event_name == 'pull_request'
uses: amannn/action-semantic-pull-request@v5
env:
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
- name: Run Pyright
uses: jakebailey/pyright-action@v2
with:
python-version: ${{ matrix.python-version }}
level: "error"
continue-on-error: true
- name: Run isort
run: uv run isort --profile black --check-only --diff .
- name: Run Black
run: uv run black --check .
- name: Run Autoflake
run: uv run autoflake --remove-all-unused-imports --remove-unused-variables --check --recursive --ignore-init-module-imports .


@@ -1,51 +0,0 @@
name: 🐍🧪 [Core] Integration Tests
on:
pull_request:
branches:
- main
pull_request_target:
branches:
- main
types: [labeled]
concurrency:
group: ${{ github.workflow }}-${{ github.event.pull_request.number || github.ref }}
cancel-in-progress: ${{ github.ref != 'refs/heads/main' }}
jobs:
integration-tests:
# Run on pull_request OR on pull_request_target only when labeled "safe to test"
if: github.event_name == 'pull_request' || (github.event_name == 'pull_request_target' && contains(github.event.pull_request.labels.*.name, 'safe to test'))
uses: ./.github/workflows/reusable-test-workflow.yml
with:
test-type: 'integration'
use-redis: true
changed-files-pattern: |
apps/core/**
.github/workflows/reusable-test-workflow.yml
.github/workflows/core-integration-tests.yml
install-args: '--extra postgres --extra external-tools --extra dev --extra cloud-tool-sandbox'
timeout-minutes: 15
ref: ${{ github.event.pull_request.head.sha || github.sha }}
matrix-strategy: |
{
"fail-fast": false,
"matrix": {
"test_suite": [
"integration_test_summarizer.py",
"integration_test_async_tool_sandbox.py",
"integration_test_sleeptime_agent.py",
"integration_test_agent_tool_graph.py",
"integration_test_composio.py",
"integration_test_chat_completions.py",
"integration_test_multi_agent.py",
"integration_test_batch_api_cron_jobs.py",
"integration_test_batch_sdk.py",
"integration_test_builtin_tools.py",
"integration_test_turbopuffer.py",
"integration_test_human_in_the_loop.py"
]
}
}
secrets: inherit
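# matrix-strategy is passed to the reusable workflow as a JSON string. A minimal
# sketch of how such a string expands into one job per test suite, handling both the
# plain list shape used here and the "include" shape used by the unit test workflows.
# Assumptions: `expand_matrix` is a hypothetical helper; the reusable workflow itself
# is not shown in this diff.

```python
import json
from typing import Any, Dict, List


def expand_matrix(strategy_json: str) -> List[Dict[str, Any]]:
    """Return one dict of matrix variables per job described by the strategy."""
    matrix = json.loads(strategy_json)["matrix"]
    if "include" in matrix:
        # Each include entry is already a complete set of variables for one job.
        return [dict(entry) for entry in matrix["include"]]
    # A plain list yields one job per value of the single variable.
    return [{"test_suite": suite} for suite in matrix["test_suite"]]
```

# Parsing the string locally like this is also a cheap way to catch a stray comma
# before the workflow fails at dispatch time.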


@@ -1,63 +0,0 @@
name: 🐍🧹 [Core] Lint and Test
on:
pull_request:
branches: [ main ]
concurrency:
group: ${{ github.workflow }}-${{ github.event.pull_request.number || github.ref }}
cancel-in-progress: ${{ github.ref != 'refs/heads/main' }}
jobs:
changed-files:
runs-on: ubuntu-latest
name: changed-files
outputs:
all_changed_files: ${{ steps.changed-files.outputs.all_changed_files }}
any_changed: ${{ steps.changed-files.outputs.any_changed }}
steps:
- uses: actions/checkout@v4
with:
fetch-depth: 0
- name: Get changed files
id: changed-files
uses: tj-actions/changed-files@v44
with:
files: |
apps/core/**
.github/workflows/core-lint.yml
main:
needs: [ changed-files ]
if: ${{ needs.changed-files.outputs.any_changed == 'true' }}
runs-on: [self-hosted, medium]
strategy:
matrix:
python-version: ["3.12"] # Adjust Python version matrix if needed
steps:
- name: Checkout
uses: actions/checkout@v4
- name: Install dependencies
shell: bash
working-directory: apps/core
run: uv sync --no-install-project ${{ inputs.install-args || '--extra postgres --extra external-tools --extra dev --extra cloud-tool-sandbox' }}
- name: Validate PR Title
if: github.event_name == 'pull_request'
uses: amannn/action-semantic-pull-request@v5
env:
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
- name: Run Pyright
uses: jakebailey/pyright-action@v2
with:
python-version: ${{ matrix.python-version }}
level: "error"
continue-on-error: true
- name: Run Ruff Check
working-directory: apps/core
run: uv run ruff check --config pyproject.toml --diff .
- name: Run Ruff Format
working-directory: apps/core
run: uv run ruff format --config pyproject.toml --check --diff .


@@ -1,60 +0,0 @@
name: 🐍👨‍🔬 [Core] Unit Tests (SQLite)
on:
pull_request:
branches:
- main
pull_request_target:
branches:
- main
types: [labeled]
concurrency:
group: ${{ github.workflow }}-${{ github.event.pull_request.number || github.ref }}
cancel-in-progress: ${{ github.ref != 'refs/heads/main' }}
jobs:
unit-tests:
# Run on pull_request OR on pull_request_target only when labeled "safe to test"
if: github.event_name == 'pull_request' || (github.event_name == 'pull_request_target' && contains(github.event.pull_request.labels.*.name, 'safe to test'))
uses: ./.github/workflows/reusable-test-workflow.yml
with:
test-type: 'sqlite'
use-redis: true
changed-files-pattern: |
apps/core/**
.github/workflows/reusable-test-workflow.yml
.github/workflows/core-unit-sqlite-test.yml
install-args: '--extra postgres --extra external-tools --extra dev --extra cloud-tool-sandbox --extra google --extra sqlite'
timeout-minutes: 15
ref: ${{ github.event.pull_request.head.sha || github.sha }}
matrix-strategy: |
{
"fail-fast": false,
"matrix": {
"include": [
{"test_suite": "test_client.py"},
{"test_suite": "test_sdk_client.py"},
{"test_suite": "test_server.py"},
{"test_suite": "test_tool_schema_parsing.py"},
{"test_suite": "test_tool_rule_solver.py"},
{"test_suite": "test_memory.py"},
{"test_suite": "test_utils.py"},
{"test_suite": "test_stream_buffer_readers.py"},
{"test_suite": "test_agent_serialization.py"},
{"test_suite": "test_optimistic_json_parser.py"},
{"test_suite": "test_llm_clients.py"},
{"test_suite": "test_letta_agent_batch.py"},
{"test_suite": "test_providers.py"},
{"test_suite": "test_sources.py"},
{"test_suite": "test_managers.py"},
{"test_suite": "sdk/"},
{"test_suite": "mcp_tests/", "use_experimental": true},
{"test_suite": "test_timezone_formatting.py"},
{"test_suite": "test_plugins.py"},
{"test_suite": "test_embeddings.py"}
]
}
}
secrets: inherit


@@ -1,60 +0,0 @@
name: 🐍👨‍🔬 [Core] Unit Tests
on:
pull_request:
branches:
- main
pull_request_target:
branches:
- main
types: [labeled]
concurrency:
group: ${{ github.workflow }}-${{ github.event.pull_request.number || github.ref }}
cancel-in-progress: ${{ github.ref != 'refs/heads/main' }}
jobs:
unit-tests:
# Run on pull_request OR on pull_request_target only when labeled "safe to test"
if: github.event_name == 'pull_request' || (github.event_name == 'pull_request_target' && contains(github.event.pull_request.labels.*.name, 'safe to test'))
uses: ./.github/workflows/reusable-test-workflow.yml
with:
test-type: 'unit'
use-redis: true
changed-files-pattern: |
apps/core/**
.github/workflows/reusable-test-workflow.yml
.github/workflows/core-unit-test.yml
install-args: '--extra postgres --extra external-tools --extra dev --extra cloud-tool-sandbox --extra google'
timeout-minutes: 15
ref: ${{ github.event.pull_request.head.sha || github.sha }}
matrix-strategy: |
{
"fail-fast": false,
"matrix": {
"include": [
{"test_suite": "test_client.py"},
{"test_suite": "test_sdk_client.py"},
{"test_suite": "test_server.py"},
{"test_suite": "test_managers.py"},
{"test_suite": "test_tool_schema_parsing.py"},
{"test_suite": "test_tool_rule_solver.py"},
{"test_suite": "test_memory.py"},
{"test_suite": "test_utils.py"},
{"test_suite": "test_stream_buffer_readers.py"},
{"test_suite": "test_agent_serialization.py"},
{"test_suite": "test_agent_serialization_v2.py"},
{"test_suite": "test_optimistic_json_parser.py"},
{"test_suite": "test_llm_clients.py"},
{"test_suite": "test_letta_agent_batch.py"},
{"test_suite": "test_providers.py"},
{"test_suite": "test_sources.py"},
{"test_suite": "sdk/"},
{"test_suite": "mcp_tests/", "use_experimental": true},
{"test_suite": "test_timezone_formatting.py"},
{"test_suite": "test_plugins.py"},
{"test_suite": "test_embeddings.py"}
]
}
}
secrets: inherit


@@ -1,40 +0,0 @@
name: Docker Image CI
on:
release:
types: [published]
workflow_dispatch:
jobs:
build:
runs-on: ubuntu-latest
steps:
- name: Login to Docker Hub
uses: docker/login-action@v3
with:
username: ${{ secrets.DOCKERHUB_USERNAME }}
password: ${{ secrets.DOCKERHUB_TOKEN }}
- uses: actions/checkout@v3
- name: Set up QEMU
uses: docker/setup-qemu-action@v3
- name: Set up Docker Buildx
uses: docker/setup-buildx-action@v3
- name: Extract version number
id: extract_version
run: echo "CURRENT_VERSION=$(awk -F '\"' '/version =/ { print $2 }' pyproject.toml | head -n 1)" >> $GITHUB_ENV
- name: Build and push
uses: docker/build-push-action@v6
with:
platforms: linux/amd64,linux/arm64
push: true
tags: |
letta/letta:${{ env.CURRENT_VERSION }}
letta/letta:latest
memgpt/letta:${{ env.CURRENT_VERSION }}
memgpt/letta:latest


@@ -1,33 +0,0 @@
name: Run Docker integration tests
on:
pull_request:
branches:
- main
pull_request_target:
branches:
- main
types: [labeled]
concurrency:
group: docker-tests-${{ github.ref }}
cancel-in-progress: true
jobs:
docker-tests:
# Run on pull_request OR on pull_request_target only when labeled "safe to test"
if: github.event_name == 'pull_request' || (github.event_name == 'pull_request_target' && contains(github.event.pull_request.labels.*.name, 'safe to test'))
uses: ./.github/workflows/reusable-test-workflow.yml
with:
test-type: 'docker'
install-args: '--extra dev --extra postgres --extra sqlite'
timeout-minutes: 15
use-docker: true
runner: '["self-hosted", "medium"]'
ref: ${{ github.event.pull_request.head.sha || github.sha }}
changed-files-pattern: |
apps/core/**
libs/config-core-deploy/**
.github/workflows/reusable-test-workflow.yml
.github/workflows/docker-integration-tests.yaml
secrets: inherit


@@ -1,20 +0,0 @@
name: 🌿 Fern Check
on:
pull_request:
branches: [ main ]
concurrency:
group: ${{ github.workflow }}-${{ github.event.pull_request.number || github.ref }}
cancel-in-progress: ${{ github.ref != 'refs/heads/main' }}
jobs:
run:
runs-on: [self-hosted, small]
steps:
- name: Checkout repository
uses: actions/checkout@v4
- name: Check API is valid
working-directory: apps
run: fern check


@@ -1,19 +0,0 @@
name: Sync Code
on:
push:
branches:
- main
jobs:
notify:
runs-on: ubuntu-latest
if: ${{ !contains(github.event.head_commit.message, '[sync-skip]') }}
steps:
- name: Trigger repository_dispatch
run: |
curl -X POST \
-H "Authorization: token ${{ secrets.SYNC_PAT }}" \
-H "Accept: application/vnd.github.v3+json" \
https://api.github.com/repos/letta-ai/letta-cloud/dispatches \
-d '{"event_type":"oss-update"}'


@@ -1,161 +0,0 @@
name: Lint Command
on:
issue_comment:
types: [created]
workflow_dispatch:
inputs:
pr_number:
description: 'PR number to run lint on'
required: true
permissions:
contents: write
pull-requests: write
issues: write
jobs:
lint-command:
name: Handle /lint command
runs-on: ubuntu-latest
if: |
(github.event_name == 'workflow_dispatch' && github.event.inputs.pr_number) ||
(github.event_name == 'issue_comment' &&
github.event.issue.pull_request &&
contains(github.event.comment.body, '/lint') &&
startsWith(github.event.comment.body, '/lint'))
steps:
- name: Add acknowledgment reaction
if: github.event_name == 'issue_comment'
uses: peter-evans/create-or-update-comment@v4
with:
comment-id: ${{ github.event.comment.id }}
reactions: eyes
- name: Check permissions
if: github.event_name == 'issue_comment'
uses: actions/github-script@v7
with:
script: |
const { data: collaborator } = await github.rest.repos.getCollaboratorPermissionLevel({
owner: context.repo.owner,
repo: context.repo.repo,
username: context.actor
});
if (!['admin', 'write'].includes(collaborator.permission)) {
await github.rest.issues.createComment({
owner: context.repo.owner,
repo: context.repo.repo,
issue_number: context.issue.number,
body: '❌ You need write permissions to run lint commands.'
});
core.setFailed('Insufficient permissions');
}
- name: Get PR information
id: pr
uses: actions/github-script@v7
with:
script: |
const pr_number = context.eventName === 'issue_comment'
? context.issue.number
: ${{ github.event.inputs.pr_number || 'null' }};
const { data: pr } = await github.rest.pulls.get({
owner: context.repo.owner,
repo: context.repo.repo,
pull_number: pr_number
});
core.setOutput('branch', pr.head.ref);
core.setOutput('repo', pr.head.repo.full_name);
core.setOutput('sha', pr.head.sha);
core.setOutput('number', pr_number);
- name: Checkout PR branch
uses: actions/checkout@v4
with:
ref: ${{ steps.pr.outputs.branch }}
token: ${{ secrets.GITHUB_TOKEN }}
fetch-depth: 0
- name: Set up python 3.12
id: setup-python
uses: actions/setup-python@v5
with:
python-version: 3.12
- name: Install uv
uses: astral-sh/setup-uv@v6
with:
enable-cache: false
activate-environment: true
- name: Install dependencies
run: uv sync --extra dev --extra postgres --extra external-tools
working-directory: ./apps/core
# - name: Run ruff check with fixes
# run: uv run ruff check --fix .
#
# - name: Run ruff format
# run: uv run ruff format .
- name: Run isort, black, autoflake
run: uv run isort . --profile black && uv run black . && uv run autoflake --remove-all-unused-imports --remove-unused-variables --in-place --recursive --ignore-init-module-imports .
working-directory: ./apps/core
- name: Check for changes
id: changes
run: |
if [[ -n $(git status --porcelain) ]]; then
echo "changes=true" >> $GITHUB_OUTPUT
else
echo "changes=false" >> $GITHUB_OUTPUT
fi
- name: Commit and push changes
if: steps.changes.outputs.changes == 'true'
run: |
git config --global user.name "github-actions[bot]"
git config --global user.email "github-actions[bot]@users.noreply.github.com"
git add .
git commit -m "style: lint / fmt
Triggered by /lint command from @${{ github.actor }}"
git push
- name: Comment on success
if: steps.changes.outputs.changes == 'true'
uses: peter-evans/create-or-update-comment@v4
with:
issue-number: ${{ steps.pr.outputs.number }}
body: |
✅ **Lint fixes applied successfully!**
isort, black, and autoflake have automatically fixed linting issues and formatted the code.
Changes have been committed to the PR branch.
- name: Comment on no changes
if: steps.changes.outputs.changes == 'false'
uses: peter-evans/create-or-update-comment@v4
with:
issue-number: ${{ steps.pr.outputs.number }}
body: |
✅ **No lint issues found!**
The code is already properly formatted and passes all linting checks.
- name: Comment on failure
if: failure()
uses: peter-evans/create-or-update-comment@v4
with:
issue-number: ${{ steps.pr.outputs.number }}
body: |
❌ **Lint command failed!**
There was an error while running the lint fixes. Please check the [workflow run](${{ github.server_url }}/${{ github.repository }}/actions/runs/${{ github.run_id }}) for details.


@@ -1,25 +0,0 @@
name: Clear Old Issues
on:
workflow_dispatch:
jobs:
cleanup-old-issues:
runs-on: ubuntu-latest
permissions:
issues: write
pull-requests: write
steps:
- uses: actions/stale@v5
with:
days-before-issue-stale: 60
days-before-issue-close: 0
stale-issue-label: "auto-closed"
stale-issue-message: ""
close-issue-message: "This issue has been automatically closed due to 60 days of inactivity."
days-before-pr-stale: -1
days-before-pr-close: -1
exempt-issue-labels: ""
only-issue-labels: ""
remove-stale-when-updated: true
operations-per-run: 1000
repo-token: ${{ secrets.GITHUB_TOKEN }}


@@ -1,54 +0,0 @@
name: Alembic Migration Tester
on:
pull_request:
paths:
- '**.py'
workflow_dispatch:
jobs:
test:
runs-on: ubuntu-latest
timeout-minutes: 15
services:
postgres:
image: pgvector/pgvector:pg17
ports:
- 5432:5432
env:
POSTGRES_HOST_AUTH_METHOD: trust
POSTGRES_DB: postgres
POSTGRES_USER: postgres
options: >-
--health-cmd pg_isready
--health-interval 10s
--health-timeout 5s
--health-retries 5
steps:
- name: Checkout
uses: actions/checkout@v4
- run: psql -h localhost -U postgres -d postgres -c 'CREATE EXTENSION vector'
- name: Set up python 3.11
id: setup-python
uses: actions/setup-python@v5
with:
python-version: 3.11
- name: Install uv
uses: astral-sh/setup-uv@v6
with:
enable-cache: true
- name: Install Dependencies
run: |
uv sync --all-extras
- name: Test alembic migration
env:
LETTA_PG_PORT: 5432
LETTA_PG_USER: postgres
LETTA_PG_PASSWORD: postgres
LETTA_PG_DB: postgres
LETTA_PG_HOST: localhost
run: |
uv run alembic upgrade head
uv run alembic check


@@ -1,144 +0,0 @@
name: Model Sweep
on:
workflow_dispatch:
inputs:
branch-name:
required: true
type: string
jobs:
model-sweep:
runs-on: [self-hosted, medium]
services:
qdrant:
image: qdrant/qdrant
ports:
- 6333:6333
postgres:
image: pgvector/pgvector:pg17
ports:
- 5432:5432
env:
POSTGRES_HOST_AUTH_METHOD: trust
POSTGRES_DB: postgres
POSTGRES_USER: postgres
options: >-
--health-cmd pg_isready
--health-interval 10s
--health-timeout 5s
--health-retries 5
steps:
- name: Check if gh is installed
run: |
if ! command -v gh >/dev/null 2>&1
then
echo "gh could not be found, installing now..."
# install gh cli
(type -p wget >/dev/null || (sudo apt update && sudo apt-get install wget -y)) \
&& sudo mkdir -p -m 755 /etc/apt/keyrings \
&& out=$(mktemp) && wget -nv -O$out https://cli.github.com/packages/githubcli-archive-keyring.gpg \
&& cat $out | sudo tee /etc/apt/keyrings/githubcli-archive-keyring.gpg > /dev/null \
&& sudo chmod go+r /etc/apt/keyrings/githubcli-archive-keyring.gpg \
&& echo "deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/githubcli-archive-keyring.gpg] https://cli.github.com/packages stable main" | sudo tee /etc/apt/sources.list.d/github-cli.list > /dev/null \
&& sudo apt update \
&& sudo apt install gh -y
fi
- name: Checkout
uses: actions/checkout@v4
- name: Inject env vars into environment
run: |
# Get secrets and mask them before adding to environment
while IFS= read -r line || [[ -n "$line" ]]; do
if [[ -n "$line" ]]; then
value=$(echo "$line" | cut -d= -f2-)
echo "::add-mask::$value"
echo "$line" >> $GITHUB_ENV
fi
done < <(letta_secrets_helper --env dev --service ci)
- name: Install dependencies
shell: bash
run: uv sync --extra dev --extra postgres --extra external-tools --extra cloud-tool-sandbox --extra google
- name: Migrate database
env:
LETTA_PG_PORT: 5432
LETTA_PG_USER: postgres
LETTA_PG_PASSWORD: postgres
LETTA_PG_DB: postgres
LETTA_PG_HOST: localhost
run: |
psql -h localhost -U postgres -d postgres -c 'CREATE EXTENSION vector'
uv run alembic upgrade head
- name: Run integration tests
# if any of the 1000+ test cases fail, pytest reports exit code 1 and won't process/upload the results
continue-on-error: true
env:
LETTA_PG_PORT: 5432
LETTA_PG_USER: postgres
LETTA_PG_PASSWORD: postgres
LETTA_PG_DB: postgres
LETTA_PG_HOST: localhost
LETTA_SERVER_PASS: test_server_token
OPENAI_API_KEY: ${{ env.OPENAI_API_KEY }}
ANTHROPIC_API_KEY: ${{ env.ANTHROPIC_API_KEY }}
AZURE_API_KEY: ${{ env.AZURE_API_KEY }}
AZURE_BASE_URL: ${{ secrets.AZURE_BASE_URL }}
GEMINI_API_KEY: ${{ env.GEMINI_API_KEY }}
COMPOSIO_API_KEY: ${{ env.COMPOSIO_API_KEY }}
GOOGLE_CLOUD_PROJECT: ${{ secrets.GOOGLE_CLOUD_PROJECT }}
GOOGLE_CLOUD_LOCATION: ${{ secrets.GOOGLE_CLOUD_LOCATION }}
DEEPSEEK_API_KEY: ${{ env.DEEPSEEK_API_KEY }}
LETTA_USE_EXPERIMENTAL: 1
run: |
uv run pytest \
-s -vv \
.github/scripts/model-sweep/model_sweep.py \
--json-report --json-report-file=.github/scripts/model-sweep/model_sweep_report.json --json-report-indent=4
- name: Convert report to markdown
continue-on-error: true
# file path args to generate_model_sweep_markdown.py are relative to the script
run: |
uv run python \
.github/scripts/model-sweep/generate_model_sweep_markdown.py \
.github/scripts/model-sweep/model_sweep_report.json \
.github/scripts/model-sweep/supported-models.mdx
echo "Model sweep report saved to .github/scripts/model-sweep/supported-models.mdx"
- id: date
run: echo "date=$(date +%Y-%m-%d)" >> $GITHUB_OUTPUT
- name: commit and open pull request
env:
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
run: |
BRANCH_NAME=model-sweep/${{ inputs.branch-name || format('{0}', steps.date.outputs.date) }}
gh auth setup-git
git config --global user.name "github-actions[bot]"
git config --global user.email "github-actions[bot]@users.noreply.github.com"
git checkout -b $BRANCH_NAME
git add .github/scripts/model-sweep/supported-models.mdx
git commit -m "Update model sweep report"
# only push if changes were made
if git diff main --quiet; then
echo "No changes detected, skipping push"
exit 0
else
git push origin $BRANCH_NAME
gh pr create \
--base main \
--head $BRANCH_NAME \
--title "chore: update model sweep report" \
--body "Automated PR to update model sweep report"
fi
- name: Upload model sweep report
if: always()
uses: actions/upload-artifact@v4
with:
name: model-sweep-report
path: .github/scripts/model-sweep/model_sweep_report.json
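
The "Inject env vars into environment" step in this workflow reads KEY=VALUE lines, masks each value, and appends the line to `$GITHUB_ENV`. A minimal shell sketch of that loop follows, assuming the input arrives on stdin. The name `mask_and_export` and the sample input are hypothetical; they stand in for the output of `letta_secrets_helper`, whose real format is not shown in this diff.

```shell
# mask_and_export <env_file>
# Hypothetical helper: reads KEY=VALUE lines from stdin, prints a
# ::add-mask:: workflow command for each value, and appends the
# original line to <env_file>.
mask_and_export() {
  env_file=$1
  # The extra [ -n "$line" ] test keeps a final line that has no trailing newline.
  while IFS= read -r line || [ -n "$line" ]; do
    if [ -n "$line" ]; then
      # Everything after the first "=" is treated as the value.
      value=$(printf '%s\n' "$line" | cut -d= -f2-)
      printf '::add-mask::%s\n' "$value"
      printf '%s\n' "$line" >> "$env_file"
    fi
  done
}

# Demo with made-up input; in a workflow the target file would be "$GITHUB_ENV".
demo_env=$(mktemp)
printf 'API_KEY=abc123\n\nDSN=user=me;host=db' | mask_and_export "$demo_env"
cat "$demo_env"
rm -f "$demo_env"
```

Splitting with `cut -d= -f2-` keeps any `=` characters inside the value, so the whole value is masked rather than only the part before a second `=`.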


@@ -1,19 +0,0 @@
name: Notify Letta Cloud
on:
push:
branches:
- main
jobs:
notify:
runs-on: ubuntu-latest
if: ${{ !contains(github.event.head_commit.message, '[sync-skip]') }}
steps:
- name: Trigger repository_dispatch
run: |
curl -X POST \
-H "Authorization: token ${{ secrets.SYNC_PAT }}" \
-H "Accept: application/vnd.github.v3+json" \
https://api.github.com/repos/letta-ai/letta-cloud/dispatches \
-d '{"event_type":"oss-update"}'


@@ -1,65 +0,0 @@
name: uv-publish-nightly
on:
schedule:
- cron: '35 10 * * *' # 10:35am UTC, 2:35am PST, 5:35am EST
release:
types: [published]
workflow_dispatch:
jobs:
# nightly release check from https://stackoverflow.com/a/67527144
check-date:
runs-on: ubuntu-latest
outputs:
should_run: ${{ steps.should_run.outputs.should_run }}
steps:
- uses: actions/checkout@v4
- name: print latest_commit
run: echo ${{ github.sha }}
- id: should_run
continue-on-error: true
name: check latest commit is less than a day
if: ${{ github.event_name == 'schedule' }}
run: test -z "$(git rev-list --after="24 hours" ${{ github.sha }})" && echo "should_run=false" >> "$GITHUB_OUTPUT"
build-and-publish-nightly:
name: Build and Publish to PyPI (nightly)
if: github.repository == 'letta-ai/letta' # TODO: if the repo org ever changes, this must be updated
runs-on: ubuntu-latest
needs: check-date
steps:
- name: Check out the repository
uses: actions/checkout@v4
- name: Set up python 3.12
id: setup-python
uses: actions/setup-python@v5
with:
python-version: 3.12
- name: Install uv
uses: astral-sh/setup-uv@v6
with:
enable-cache: true
activate-environment: true
- name: Set release version
run: |
# Extract the version number from pyproject.toml using awk
CURRENT_VERSION=$(awk -F '"' '/version =/ { print $2 }' pyproject.toml | head -n 1)
# Export the CURRENT_VERSION with the .dev and current date suffix
NIGHTLY_VERSION="${CURRENT_VERSION}.dev$(date +%Y%m%d%H%M%S)"
# Overwrite pyproject.toml with nightly config
sed -i "0,/version = \"${CURRENT_VERSION}\"/s//version = \"${NIGHTLY_VERSION}\"/" pyproject.toml
sed -i 's/name = "letta"/name = "letta-nightly"/g' pyproject.toml
sed -i "s/__version__ = '.*'/__version__ = '${NIGHTLY_VERSION}'/g" letta/__init__.py
cat pyproject.toml
cat letta/__init__.py
- name: Build the Python package
run: uv build
- name: Publish the package to PyPI
env:
UV_PUBLISH_TOKEN: ${{ secrets.PYPI_TOKEN }}
run: uv publish
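
The "Set release version" step in this workflow derives the nightly version from `pyproject.toml` with `awk` and a date suffix. A small shell sketch of that derivation follows, assuming a throwaway file with made-up contents. `extract_version` and `nightly_version` are hypothetical helper names and are not part of the workflow.

```shell
# Hypothetical helper: print the first double-quoted value found on a
# line containing "version =" in the given file.
extract_version() {
  awk -F '"' '/version =/ { print $2 }' "$1" | head -n 1
}

# Hypothetical helper: append a .dev<timestamp> suffix to a version string.
nightly_version() {
  printf '%s.dev%s\n' "$1" "$2"
}

# Demo against a temporary pyproject.toml (contents invented for illustration).
demo_dir=$(mktemp -d)
cat > "$demo_dir/pyproject.toml" <<'EOF'
[project]
name = "letta"
version = "0.0.1"

[tool.example]
target-version = "py311"
EOF

current=$(extract_version "$demo_dir/pyproject.toml")
echo "current=$current"
echo "nightly=$(nightly_version "$current" "$(date +%Y%m%d%H%M%S)")"
rm -rf "$demo_dir"
```

The pattern `/version =/` also matches keys such as `target-version =`, which is why the result is cut down to the first match with `head -n 1`.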


@@ -1,35 +0,0 @@
name: uv-publish
on:
release:
types: [published]
workflow_dispatch:
jobs:
build-and-publish:
name: Build and Publish to PyPI
if: github.repository == 'letta-ai/letta' # TODO: if the repo org ever changes, this must be updated
runs-on: ubuntu-latest
steps:
- name: Check out the repository
uses: actions/checkout@v4
- name: Set up python 3.12
id: setup-python
uses: actions/setup-python@v5
with:
python-version: 3.12
- name: Install uv
uses: astral-sh/setup-uv@v6
with:
enable-cache: true
activate-environment: true
cache-dependency-glob: "uv.lock"
- name: Build the Python package
run: uv build
- name: Publish the package to PyPI
env:
UV_PUBLISH_TOKEN: ${{ secrets.PYPI_TOKEN }}
run: uv publish


@@ -1,474 +0,0 @@
name: Reusable Test Workflow
on:
workflow_call:
inputs:
test-type:
description: 'Type of tests to run (unit, integration, docker, send-message, sqlite)'
required: true
type: string
core-directory:
description: 'Working directory for commands. Auto-detects between apps/core (cloud) and . (OSS). Can be overridden.'
required: false
type: string
default: 'auto'
install-args:
description: 'uv sync arguments'
required: true
type: string
test-command:
description: 'Command to run tests'
required: false
type: string
default: 'uv run --frozen pytest -svv'
test-path-prefix:
description: 'Prefix for test path (e.g., tests/)'
required: false
type: string
default: 'tests/'
timeout-minutes:
description: 'Timeout in minutes'
required: false
type: number
default: 15
runner:
description: 'Runner to use'
required: false
type: string
default: '["self-hosted", "small"]'
matrix-strategy:
description: 'JSON string for matrix strategy'
required: false
type: string
default: '{}'
changed-files-pattern:
description: 'Pattern for changed files detection'
required: false
type: string
default: |
apps/core/**
.github/workflows/reusable-test-workflow.yml
skip-fern-generation:
description: 'Skip Fern SDK generation'
required: false
type: boolean
default: false
use-docker:
description: 'Use Docker for tests'
required: false
type: boolean
default: false
ref:
description: 'Git ref to wait for checks on'
required: false
type: string
default: ${{ github.sha }}
use-redis:
description: 'Use Redis for tests'
required: false
type: boolean
default: false
jobs:
changed-files:
runs-on: ${{ fromJSON(inputs.runner) }}
name: changed-files
outputs:
all_changed_files: ${{ steps.changed-files.outputs.all_changed_files }}
any_changed: ${{ steps.changed-files.outputs.any_changed }}
steps:
- uses: actions/checkout@v4
with:
fetch-depth: 0
- name: Get changed files
id: changed-files
uses: tj-actions/changed-files@v46.0.4
with:
files: ${{ inputs.changed-files-pattern }}
cache-check:
needs: [changed-files]
runs-on: ${{ fromJSON(inputs.runner) }}
name: Check cache key
outputs:
cache_key: ${{ steps.cache-key.outputs.key }}
cache_hit: ${{ steps.cache.outputs.cache-hit }}
steps:
- name: Checkout
uses: actions/checkout@v4
- name: Generate cache key
if: inputs.skip-fern-generation != true || (!contains(needs.changed-files.outputs.all_changed_files, 'apps/fern/openapi.json') && !contains(needs.changed-files.outputs.all_changed_files, 'apps/fern/openapi-overrides.yml'))
id: cache-key
run: |
echo "key=sdk-${{ github.ref_name }}-${{ hashFiles('apps/fern/*', 'apps/core/pyproject.toml') }}" >> $GITHUB_OUTPUT
- name: Restore SDK cache
# skip if "skip-fern-generation" is true or if the upstream workflow would've generated an sdk preview (changes to openapi files)
if: inputs.skip-fern-generation != true || (!contains(needs.changed-files.outputs.all_changed_files, 'apps/fern/openapi.json') && !contains(needs.changed-files.outputs.all_changed_files, 'apps/fern/openapi-overrides.yml'))
id: cache
uses: actions/cache/restore@v4
with:
path: |
apps/fern/.preview/fern-python-sdk/
key: ${{ steps.cache-key.outputs.key }}
fail-on-cache-miss: false
block-until-sdk-preview-finishes:
needs: [changed-files, cache-check]
if: |
needs.cache-check.outputs.cache_hit != 'true'
timeout-minutes: ${{ inputs.timeout-minutes }}
runs-on: ${{ fromJSON(inputs.runner) }}
name: block-until-sdk-preview-finishes
steps:
- name: Debug ref information
run: |
echo "Input ref: ${{ inputs.ref }}"
echo "GitHub SHA: ${{ github.sha }}"
echo "GitHub ref: ${{ github.ref }}"
echo "PR head SHA: ${{ github.event.pull_request.head.sha }}"
echo "Event name: ${{ github.event_name }}"
- name: Wait for Preview SDK workflow
if: inputs.skip-fern-generation != true || (!contains(needs.changed-files.outputs.all_changed_files, 'apps/fern/openapi.json') && !contains(needs.changed-files.outputs.all_changed_files, 'apps/fern/openapi-overrides.yml'))
env:
GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
run: |
echo "Waiting for 'preview-python-sdk' check to complete on ref: ${{ inputs.ref }}"
# Wait for the check to complete with timeout
timeout_seconds=1800
interval_seconds=60
elapsed=0
while [ $elapsed -lt $timeout_seconds ]; do
echo "Checking status... (elapsed: ${elapsed}s)"
# Get check runs using pr checks syntax with branch name or PR number
if [ "${{ github.event_name }}" = "pull_request" ]; then
pr_identifier="${{ github.event.pull_request.number }}"
else
pr_identifier="${{ github.ref_name }}"
fi
check_info=$(gh pr checks "$pr_identifier" -R ${{ github.repository }} --json name,state,startedAt \
| jq -r '.[] | select(.name == "preview-python-sdk") | [.startedAt, .state] | @tsv' | sort -r | head -1 | cut -f2)
if [ -n "$check_info" ]; then
echo "Check state: $check_info"
if [ "$check_info" = "SUCCESS" ] || [ "$check_info" = "SKIPPED" ]; then
echo "Check completed with state: $check_info"
exit 0
elif [ "$check_info" = "FAILURE" ] || [ "$check_info" = "CANCELLED" ]; then
echo "❌ Preview Python SDK build failed with state: $check_info"
echo "🚫 Blocking dependent test jobs to prevent extraneous failures"
echo "📋 To fix: Check the 'preview-python-sdk' job logs for build errors"
exit 1
fi
else
echo "Check 'preview-python-sdk' not found yet"
fi
sleep $interval_seconds
elapsed=$((elapsed + interval_seconds))
done
echo "Timeout waiting for check to complete"
exit 1
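
The wait step above is a poll-with-timeout loop around `gh pr checks`. The same control flow can be sketched in plain shell, assuming the state string comes from an arbitrary command instead of the GitHub CLI. `poll_until` and `fake_check` are hypothetical names, and the return codes are a choice made for this sketch only.

```shell
# poll_until <timeout_seconds> <interval_seconds> <command...>
# Hypothetical helper: runs <command> repeatedly and inspects the state it prints.
# Returns 0 on SUCCESS/SKIPPED, 1 on FAILURE/CANCELLED, 2 on timeout.
poll_until() {
  timeout_seconds=$1
  interval_seconds=$2
  shift 2
  elapsed=0
  while [ "$elapsed" -lt "$timeout_seconds" ]; do
    # An error from the command is treated like "no state reported yet".
    state=$("$@") || state=""
    case "$state" in
      SUCCESS|SKIPPED) return 0 ;;
      FAILURE|CANCELLED) return 1 ;;
    esac
    sleep "$interval_seconds"
    elapsed=$((elapsed + interval_seconds))
  done
  return 2
}

# Demo with a stand-in for the real status query.
fake_check() { echo SUCCESS; }
poll_until 5 1 fake_check && echo "check passed"
```

Keeping the timeout as a distinct return code lets a caller tell "the check failed" apart from "the check never reported", which the original step reports only through its log messages.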
test-run:
needs: [changed-files, block-until-sdk-preview-finishes]
if: |
always() &&
needs.changed-files.outputs.any_changed == 'true' &&
(needs.block-until-sdk-preview-finishes.result == 'success' ||
needs.block-until-sdk-preview-finishes.result == 'skipped')
runs-on: ${{ fromJSON(inputs.runner) }}
timeout-minutes: ${{ inputs.timeout-minutes }}
strategy: ${{ fromJSON(inputs.matrix-strategy) }}
services:
postgres:
image: pgvector/pgvector:pg17
ports:
# avoids conflict with docker postgres
- ${{ inputs.use-docker && '9999:5432' || '5432:5432' }}
env:
POSTGRES_HOST_AUTH_METHOD: trust
POSTGRES_DB: postgres
POSTGRES_USER: postgres
options: >-
--health-cmd pg_isready
--health-interval 10s
--health-timeout 5s
--health-retries 5
redis:
image: ${{ inputs.use-redis && 'redis:8-alpine' || '' }}
options: >-
--health-cmd "redis-cli ping"
--health-interval 10s
--health-timeout 5s
--health-retries 5
ports:
- 6379:6379
steps:
- name: Checkout
uses: actions/checkout@v4
- name: Install uv
uses: astral-sh/setup-uv@v6
with:
enable-cache: true
- name: Detect core directory
id: detect-core-dir
run: |
if [ "${{ inputs.core-directory }}" = "auto" ]; then
if [ -d "apps/core" ]; then
echo "dir=apps/core" >> $GITHUB_OUTPUT
echo "detected=cloud" >> $GITHUB_OUTPUT
else
echo "dir=." >> $GITHUB_OUTPUT
echo "detected=oss" >> $GITHUB_OUTPUT
fi
else
echo "dir=${{ inputs.core-directory }}" >> $GITHUB_OUTPUT
echo "detected=manual" >> $GITHUB_OUTPUT
fi
echo "Using core directory: $(cat $GITHUB_OUTPUT | grep '^dir=' | cut -d'=' -f2)"
- name: Generate cache key
if: inputs.skip-fern-generation != true || (!contains(needs.changed-files.outputs.all_changed_files, 'apps/fern/openapi.json') && !contains(needs.changed-files.outputs.all_changed_files, 'apps/fern/openapi-overrides.yml'))
id: cache-key
run: |
echo "key=sdk-${{ github.ref_name }}-${{ hashFiles('apps/fern/*', 'apps/core/pyproject.toml') }}" >> $GITHUB_OUTPUT
- name: Restore SDK cache
# skip if "skip-fern-generation" is true or if the upstream workflow would've generated an sdk preview (changes to openapi files)
if: inputs.skip-fern-generation != true || (!contains(needs.changed-files.outputs.all_changed_files, 'apps/fern/openapi.json') && !contains(needs.changed-files.outputs.all_changed_files, 'apps/fern/openapi-overrides.yml'))
id: restore-sdk-cache
uses: actions/cache/restore@v4
with:
path: |
apps/fern/.preview/fern-python-sdk/
key: ${{ steps.cache-key.outputs.key }}
fail-on-cache-miss: false
- name: Check SDK cache availability
if: (inputs.skip-fern-generation != true || (!contains(needs.changed-files.outputs.all_changed_files, 'apps/fern/openapi.json') && !contains(needs.changed-files.outputs.all_changed_files, 'apps/fern/openapi-overrides.yml'))) && steps.restore-sdk-cache.outputs.cache-hit != 'true'
run: |
echo "❌ Preview Python SDK cache expired or missing!"
echo "📦 Cache key: ${{ steps.cache-key.outputs.key }}"
echo "🔄 To fix: Re-run the 'preview-python-sdk' workflow job to regenerate the SDK"
echo "💡 This can happen when:"
echo " - The cache entry has expired"
echo " - Dependencies in apps/fern/* or apps/core/pyproject.toml have changed"
echo " - The preview-python-sdk job hasn't run successfully for this branch/commit"
exit 1
- name: Install dependencies
shell: bash
working-directory: ${{ steps.detect-core-dir.outputs.dir }}
run: |
uv sync --no-install-project ${{ inputs.install-args }}
- name: Install custom SDK
if: inputs.skip-fern-generation != true
working-directory: ${{ steps.detect-core-dir.outputs.dir }}
run: |
echo "Fixing Fern SDK pyproject.toml for uv compatibility..."
SDK_PYPROJECT="../fern/.preview/fern-python-sdk/pyproject.toml"
VERSION=$(grep -A 10 '^\[tool\.poetry\]' "$SDK_PYPROJECT" | grep '^version' | head -1 | cut -d'"' -f2)
head -n 2 < ../fern/.preview/fern-python-sdk/pyproject.toml > ../fern/.preview/fern-python-sdk/pyproject.toml.tmp
echo "version = \"$VERSION\"" >> ../fern/.preview/fern-python-sdk/pyproject.toml.tmp
tail -n +3 ../fern/.preview/fern-python-sdk/pyproject.toml >> ../fern/.preview/fern-python-sdk/pyproject.toml.tmp
mv ../fern/.preview/fern-python-sdk/pyproject.toml.tmp ../fern/.preview/fern-python-sdk/pyproject.toml
uv pip install -e ../fern/.preview/fern-python-sdk/.
- name: Migrate database
if: inputs.use-docker != true && inputs.test-type != 'sqlite'
working-directory: ${{ steps.detect-core-dir.outputs.dir }}
env:
LETTA_PG_PORT: 5432
LETTA_PG_USER: postgres
LETTA_PG_PASSWORD: postgres
LETTA_PG_DB: postgres
LETTA_PG_HOST: localhost
run: |
psql -h localhost -U postgres -d postgres -c 'CREATE EXTENSION vector'
uv run alembic upgrade head
- name: Inject env vars into environment
working-directory: ${{ steps.detect-core-dir.outputs.dir }}
run: |
# Get secrets and mask them before adding to environment
while IFS= read -r line || [[ -n "$line" ]]; do
if [[ -n "$line" ]]; then
value=$(echo "$line" | cut -d= -f2-)
echo "::add-mask::$value"
echo "$line" >> $GITHUB_ENV
fi
done < <(letta_secrets_helper --env dev --service ci)
- name: Docker setup for Docker tests
if: inputs.use-docker
run: |
mkdir -p /home/ci-runner/.letta/logs
sudo chown -R $USER:$USER /home/ci-runner/.letta/logs
chmod -R 755 /home/ci-runner/.letta/logs
- name: Build and run docker dev server
if: inputs.use-docker
env:
LETTA_PG_DB: letta
LETTA_PG_USER: letta
LETTA_PG_PASSWORD: letta
LETTA_PG_PORT: 5432
OPENAI_API_KEY: ${{ env.OPENAI_API_KEY }}
run: |
cd libs/config-core-deploy
docker compose -f compose.yaml up --build -d
- name: Wait for Docker service
if: inputs.use-docker
working-directory: ${{ steps.detect-core-dir.outputs.dir }}
run: |
bash scripts/wait_for_service.sh localhost:8083 -- echo "Service is ready"
- name: Run tests
working-directory: ${{ steps.detect-core-dir.outputs.dir }}
env:
# Database configuration (shared, but values depend on Docker usage)
LETTA_PG_PORT: 5432
LETTA_PG_USER: ${{ inputs.use-docker && 'letta' || 'postgres' }}
LETTA_PG_PASSWORD: ${{ inputs.use-docker && 'letta' || 'postgres' }}
LETTA_PG_DB: ${{ inputs.use-docker && 'letta' || 'postgres' }}
LETTA_PG_HOST: localhost
# Server configuration (conditional)
LETTA_SERVER_PASS: test_server_token
# LLM Provider API Keys (shared across all test types)
OPENAI_API_KEY: ${{ env.OPENAI_API_KEY }}
ANTHROPIC_API_KEY: ${{ env.ANTHROPIC_API_KEY }}
GEMINI_API_KEY: ${{ env.GEMINI_API_KEY }}
GROQ_API_KEY: ${{ env.GROQ_API_KEY }}
AZURE_API_KEY: ${{ env.AZURE_API_KEY }}
AZURE_BASE_URL: ${{ secrets.AZURE_BASE_URL }}
DEEPSEEK_API_KEY: ${{ env.DEEPSEEK_API_KEY }}
LETTA_MISTRAL_API_KEY: ${{ secrets.LETTA_MISTRAL_API_KEY }}
# External service API Keys (shared across all test types)
COMPOSIO_API_KEY: ${{ env.COMPOSIO_API_KEY }}
E2B_API_KEY: ${{ env.E2B_API_KEY }}
E2B_SANDBOX_TEMPLATE_ID: ${{ env.E2B_SANDBOX_TEMPLATE_ID }}
TAVILY_API_KEY: ${{ secrets.TAVILY_API_KEY }}
PINECONE_API_KEY: ${{ secrets.PINECONE_API_KEY }}
PINECONE_INDEX_HOST: ${{ secrets.PINECONE_INDEX_HOST }}
PINECONE_NAMESPACE: ${{ secrets.PINECONE_NAMESPACE }}
# Turbopuffer flags
LETTA_USE_TPUF: true
LETTA_TPUF_API_KEY: ${{ env.LETTA_TPUF_API_KEY }}
# Encryption key
LETTA_ENCRYPTION_KEY: ${{ env.LETTA_ENCRYPTION_KEY }}
# Google Cloud (shared across all test types)
GOOGLE_CLOUD_PROJECT: ${{ secrets.GOOGLE_CLOUD_PROJECT }}
GOOGLE_CLOUD_LOCATION: ${{ secrets.GOOGLE_CLOUD_LOCATION }}
# Feature flags (shared across all test types)
LETTA_ENABLE_BATCH_JOB_POLLING: true
LETTA_GEMINI_FORCE_MINIMUM_THINKING_BUDGET: true
LETTA_GEMINI_MAX_RETRIES: 10
# Pinecone flags
LETTA_PINECONE_API_KEY: ${{ secrets.LETTA_PINECONE_API_KEY }}
LETTA_ENABLE_PINECONE: ${{ secrets.LETTA_ENABLE_PINECONE }}
EXA_API_KEY: ${{ env.EXA_API_KEY }}
# Docker-specific environment variables
PYTHONPATH: ${{ inputs.use-docker && format('{0}:{1}', github.workspace, env.PYTHONPATH) || '' }}
LETTA_REDIS_HOST: localhost
run: |
set -o xtrace
# Set LETTA_SERVER_URL only for Docker tests
if [[ "${{ inputs.use-docker }}" == "true" ]]; then
export LETTA_SERVER_URL="http://localhost:8083"
fi
# Set LLM_CONFIG_FILE only for send-message tests
if [[ "${{ inputs.test-type }}" == "send-message" ]]; then
export LLM_CONFIG_FILE="${{ matrix.config_file }}"
fi
# Set Ollama base URL only for Ollama tests
if [[ "${{ inputs.test-type }}" == "integration" && "${{ inputs.runner }}" == *"ollama"* ]]; then
export LLM_CONFIG_FILE="ollama.json"
export OLLAMA_BASE_URL="http://localhost:11434"
fi
# Set LMStudio base URL only for LMStudio tests
if [[ "${{ inputs.test-type }}" == "integration" && "${{ inputs.runner }}" == *"lmstudio"* ]]; then
export LLM_CONFIG_FILE="lmstudio.json"
export LMSTUDIO_BASE_URL="http://localhost:1234"
fi
# Set VLLM base URL only for VLLM tests
if [[ "${{ inputs.test-type }}" == "integration" && "${{ inputs.runner }}" == *"vllm"* ]]; then
export LLM_CONFIG_FILE="vllm.json"
export VLLM_BASE_URL="http://localhost:8000"
fi
uv pip install pytest-github-actions-annotate-failures
# Handle different matrix variable names and test commands based on test type
if [[ "${{ inputs.test-type }}" == "integration" ]]; then
uv pip install letta
uv pip show letta
uv pip show letta-client
uv run --frozen pytest -svv ${{ inputs.test-path-prefix }}${{ matrix.test_suite }}
elif [[ "${{ inputs.test-type }}" == "unit" ]]; then
uv pip show letta-client
uv run --frozen pytest -svv ${{ inputs.test-path-prefix }}${{ matrix.test_suite }}
elif [[ "${{ inputs.test-type }}" == "send-message" ]]; then
uv run --frozen pytest -s -vv tests/integration_test_send_message.py --maxfail=1 --durations=10
elif [[ "${{ inputs.test-type }}" == "docker" ]]; then
uv run --frozen pytest -s tests/test_client.py
elif [[ "${{ inputs.test-type }}" == "sqlite" ]]; then
# force sqlite
unset LETTA_PG_USER
unset LETTA_PG_PASSWORD
unset LETTA_PG_DB
unset LETTA_PG_HOST
uv pip show letta-client
uv run alembic upgrade head
uv run --frozen pytest -svv ${{ inputs.test-path-prefix }}${{ matrix.test_suite }}
else
${{ inputs.test-command }}
fi
- name: Remove sqlite db
if: ${{ always() && inputs.test-type == 'sqlite' }}
run: sudo rm -rf ~/.letta || true
- name: Print docker logs if tests fail
if: ${{ (failure() || cancelled()) && inputs.use-docker }}
working-directory: libs/config-core-deploy
run: |
echo "Printing Docker Logs..."
docker compose -f compose.yaml logs
- name: Stop docker
if: ${{ always() && inputs.use-docker }}
working-directory: libs/config-core-deploy
run: |
docker compose -f compose.yaml down --volumes
sudo rm -rf .persist

@@ -1,157 +0,0 @@
name: Send Message SDK Tests
on:
pull_request_target:
branches: [main] # TODO: uncomment before merge
types: [labeled]
paths:
- 'letta/**'
jobs:
send-messages:
# Only run when the "safe to test" label is applied
if: contains(github.event.pull_request.labels.*.name, 'safe to test')
runs-on: ubuntu-latest
timeout-minutes: 15
strategy:
fail-fast: false
matrix:
config_file:
- "openai-gpt-4o-mini.json"
- "azure-gpt-4o-mini.json"
- "claude-3-5-sonnet.json"
- "claude-4-sonnet-extended.json"
- "claude-3-7-sonnet-extended.json"
- "gemini-pro.json"
- "gemini-vertex.json"
services:
qdrant:
image: qdrant/qdrant
ports:
- 6333:6333
postgres:
image: pgvector/pgvector:pg17
ports:
- 5432:5432
env:
POSTGRES_HOST_AUTH_METHOD: trust
POSTGRES_DB: postgres
POSTGRES_USER: postgres
options: >-
--health-cmd pg_isready
--health-interval 10s
--health-timeout 5s
--health-retries 5
redis:
image: redis:7
ports:
- 6379:6379
options: >-
--health-cmd "redis-cli ping"
--health-interval 5s
--health-timeout 5s
--health-retries 10
steps:
# Ensure secrets don't leak
- name: Configure git to hide secrets
run: |
git config --global core.logAllRefUpdates false
git config --global log.hideCredentials true
- name: Set up secret masking
run: |
# Automatically mask any environment variable ending with _KEY
for var in $(env | grep '_KEY=' | cut -d= -f1); do
value="${!var}"
if [[ -n "$value" ]]; then
# Mask the full value
echo "::add-mask::$value"
# Also mask partial values (first and last several characters)
# This helps when only parts of keys appear in logs
if [[ ${#value} -gt 8 ]]; then
echo "::add-mask::${value:0:8}"
echo "::add-mask::${value:(-8)}"
fi
# Also mask with common formatting changes
# Some logs might add quotes or other characters
echo "::add-mask::\"$value\""
echo "::add-mask::$value\""
echo "::add-mask::\"$value"
echo "Masked secret: $var (length: ${#value})"
fi
done
# Check out base repository code, not the PR's code (for security)
- name: Checkout base repository
uses: actions/checkout@v4 # No ref specified means it uses base branch
# Only extract relevant files from the PR (for security, specifically prevent modification of workflow files)
- name: Extract PR schema files
run: |
# Fetch PR without checking it out
git fetch origin pull/${{ github.event.pull_request.number }}/head:pr-${{ github.event.pull_request.number }}
# Extract ONLY the letta/ directory from the PR (workflow files stay at the base revision)
git checkout pr-${{ github.event.pull_request.number }} -- letta/
- name: Set up python 3.12
id: setup-python
uses: actions/setup-python@v5
with:
python-version: 3.12
- name: Install uv
uses: astral-sh/setup-uv@v4
with:
version: "latest"
- name: Load cached venv
id: cached-uv-dependencies
uses: actions/cache@v4
with:
path: .venv
key: venv-${{ runner.os }}-${{ steps.setup-python.outputs.python-version }}-${{ hashFiles('**/uv.lock') }}
restore-keys: |
venv-${{ runner.os }}-${{ steps.setup-python.outputs.python-version }}-
- name: Install dependencies
if: steps.cached-uv-dependencies.outputs.cache-hit != 'true'
shell: bash
run: uv sync --extra dev --extra postgres --extra external-tools --extra cloud-tool-sandbox --extra google
- name: Install letta packages
run: |
uv run pip install --upgrade letta-client letta
- name: Migrate database
env:
LETTA_PG_PORT: 5432
LETTA_PG_USER: postgres
LETTA_PG_PASSWORD: postgres
LETTA_PG_DB: postgres
LETTA_PG_HOST: localhost
run: |
psql -h localhost -U postgres -d postgres -c 'CREATE EXTENSION vector'
uv run alembic upgrade head
- name: Run integration tests for ${{ matrix.config_file }}
env:
LLM_CONFIG_FILE: ${{ matrix.config_file }}
LETTA_PG_PORT: 5432
LETTA_PG_USER: postgres
LETTA_PG_PASSWORD: postgres
LETTA_PG_DB: postgres
LETTA_PG_HOST: localhost
LETTA_REDIS_HOST: localhost
LETTA_REDIS_PORT: 6379
LETTA_SERVER_PASS: test_server_token
OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}
ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}
AZURE_API_KEY: ${{ secrets.AZURE_API_KEY }}
AZURE_BASE_URL: ${{ secrets.AZURE_BASE_URL }}
GEMINI_API_KEY: ${{ secrets.GEMINI_API_KEY }}
COMPOSIO_API_KEY: ${{ secrets.COMPOSIO_API_KEY }}
DEEPSEEK_API_KEY: ${{ secrets.DEEPSEEK_API_KEY }}
GOOGLE_CLOUD_PROJECT: ${{ secrets.GOOGLE_CLOUD_PROJECT }}
GOOGLE_CLOUD_LOCATION: ${{ secrets.GOOGLE_CLOUD_LOCATION }}
LETTA_GEMINI_FORCE_MINIMUM_THINKING_BUDGET: true
run: |
uv run pytest \
-s -vv \
tests/integration_test_send_message.py \
--maxfail=1 --durations=10

@@ -1,48 +0,0 @@
name: 🐍🧪 [Core] Send Message SDK Tests
on:
pull_request:
branches:
- main
pull_request_target:
branches:
- main
types: [labeled]
concurrency:
group: ${{ github.workflow }}-${{ github.event.pull_request.number || github.ref }}
cancel-in-progress: ${{ github.ref != 'refs/heads/main' }}
jobs:
send-message-tests:
# Run on pull_request OR on pull_request_target only when labeled "safe to test"
if: github.event_name == 'pull_request' || (github.event_name == 'pull_request_target' && contains(github.event.pull_request.labels.*.name, 'safe to test'))
uses: ./.github/workflows/reusable-test-workflow.yml
with:
test-type: 'send-message'
changed-files-pattern: |
apps/core/**
.github/workflows/reusable-test-workflow.yml
.github/workflows/send-message-integration-tests.yml
install-args: '--extra dev --extra postgres --extra external-tools --extra cloud-tool-sandbox --extra google --extra redis'
timeout-minutes: 15
runner: '["self-hosted", "medium"]'
ref: ${{ github.event.pull_request.head.sha || github.sha }}
use-redis: true
# TODO: "azure-gpt-4o-mini.json" add back later, getting content violation
matrix-strategy: |
{
"fail-fast": false,
"matrix": {
"config_file": [
"openai-gpt-4o-mini.json",
"claude-4-sonnet-extended.json",
"claude-3-5-sonnet.json",
"claude-3-7-sonnet-extended.json",
"gemini-1.5-pro.json",
"gemini-2.5-pro.json",
"gemini-2.5-flash.json"
]
}
}
secrets: inherit

@@ -1,47 +0,0 @@
name: Self-Hosted Provider Integration - LMStudio
on:
workflow_dispatch:
# inputs:
# ref:
# description: 'Git ref to test'
# required: false
# type: string
# default: ${{ github.sha || github.ref || github.event.pull_request.head.sha }}
pull_request:
paths:
- 'apps/core/**'
- '.github/workflows/test-lmstudio.yml'
- '.github/workflows/reusable-test-workflow.yml'
pull_request_target:
types: [labeled]
paths:
- 'apps/core/**'
- '.github/workflows/test-lmstudio.yml'
- '.github/workflows/reusable-test-workflow.yml'
concurrency:
group: ${{ github.workflow }}-${{ github.event.pull_request.number || github.ref }}
cancel-in-progress: ${{ github.ref != 'refs/heads/main' }}
jobs:
test-lmstudio:
# Run on workflow_dispatch or pull_request, OR on pull_request_target only when labeled "safe to test"
if: github.event_name == 'workflow_dispatch' || github.event_name == 'pull_request' || (github.event_name == 'pull_request_target' && contains(github.event.pull_request.labels.*.name, 'safe to test'))
uses: ./.github/workflows/reusable-test-workflow.yml
with:
test-type: "integration"
install-args: "--extra postgres --extra external-tools --extra dev --extra cloud-tool-sandbox --extra google"
test-command: "uv run pytest -svv tests/"
timeout-minutes: 60
runner: '["self-hosted", "gpu", "lmstudio"]'
matrix-strategy: |
{
"fail-fast": false,
"matrix": {
"test_suite": [
"integration_test_send_message.py"
]
}
}
secrets: inherit

@@ -1,48 +0,0 @@
name: Self-Hosted Provider Integration - Ollama
on:
workflow_dispatch:
# inputs:
# ref:
# description: 'Git ref to test'
# required: false
# type: string
# default: ${{ github.sha || github.ref || github.event.pull_request.head.sha }}
pull_request:
paths:
- 'apps/core/**'
- '.github/workflows/test-ollama.yml'
- '.github/workflows/reusable-test-workflow.yml'
pull_request_target:
types: [labeled]
paths:
- 'apps/core/**'
- '.github/workflows/test-ollama.yml'
- '.github/workflows/reusable-test-workflow.yml'
concurrency:
group: ${{ github.workflow }}-${{ github.event.pull_request.number || github.ref }}
cancel-in-progress: ${{ github.ref != 'refs/heads/main' }}
jobs:
test-ollama:
# Run on workflow_dispatch or pull_request, OR on pull_request_target only when labeled "safe to test"
if: github.event_name == 'workflow_dispatch' || github.event_name == 'pull_request' || (github.event_name == 'pull_request_target' && contains(github.event.pull_request.labels.*.name, 'safe to test'))
uses: ./.github/workflows/reusable-test-workflow.yml
with:
test-type: "integration"
install-args: "--extra postgres --extra external-tools --extra dev --extra cloud-tool-sandbox --extra google"
test-command: "uv run --frozen pytest -svv tests/"
timeout-minutes: 60
runner: '["self-hosted", "gpu", "ollama"]'
matrix-strategy: |
{
"fail-fast": false,
"matrix": {
"test_suite": [
"test_providers.py::test_ollama",
"integration_test_send_message.py"
]
}
}
secrets: inherit

@@ -1,23 +0,0 @@
name: Test Package Installation
on: [push, pull_request, workflow_dispatch]
jobs:
test-install:
runs-on: ubuntu-latest
strategy:
matrix:
python-version: ["3.11", "3.12", "3.13"] # Adjust Python versions as needed
steps:
- uses: actions/checkout@v2
- name: Set up Python ${{ matrix.python-version }}
uses: actions/setup-python@v2
with:
python-version: ${{ matrix.python-version }}
- name: Install package with extras
run: pip install '.[external-tools,postgres,dev,server,ollama]' # Adjust the extras list as needed
- name: Check package installation
run: pip list # Or any other command to verify successful installation

@@ -1,44 +0,0 @@
name: Self-Hosted Provider Integration - vLLM
on:
workflow_dispatch:
# inputs:
# ref:
# description: 'Git ref to test'
# required: false
# type: string
# default: ${{ github.sha || github.ref || github.event.pull_request.head.sha }}
pull_request:
paths:
- 'apps/core/**'
- '.github/workflows/test-vllm.yml'
- '.github/workflows/reusable-test-workflow.yml'
pull_request_target:
types: [labeled]
paths:
- 'apps/core/**'
- '.github/workflows/test-vllm.yml'
- '.github/workflows/reusable-test-workflow.yml'
jobs:
test-vllm:
# Run on workflow_dispatch or pull_request, OR on pull_request_target only when labeled "safe to test"
if: github.event_name == 'workflow_dispatch' || github.event_name == 'pull_request' || (github.event_name == 'pull_request_target' && contains(github.event.pull_request.labels.*.name, 'safe to test'))
uses: ./.github/workflows/reusable-test-workflow.yml
with:
test-type: "integration"
install-args: "--extra postgres --extra external-tools --extra dev --extra cloud-tool-sandbox --extra google"
test-command: "uv run --frozen pytest -svv tests/"
timeout-minutes: 60
runner: '["self-hosted", "gpu", "vllm"]'
matrix-strategy: |
{
"fail-fast": false,
"matrix": {
"test_suite": [
"test_providers.py::test_vllm",
"integration_test_send_message.py"
]
}
}
secrets: inherit

@@ -1,63 +0,0 @@
name: Check uv Dependencies Changes
on:
pull_request:
paths:
- 'uv.lock'
- 'pyproject.toml'
jobs:
check-uv-changes:
runs-on: ubuntu-latest
permissions:
pull-requests: write
steps:
- uses: actions/checkout@v4
with:
fetch-depth: 0
- name: Check for uv.lock changes
id: check-uv-lock
run: |
if git diff --name-only ${{ github.event.pull_request.base.sha }} ${{ github.event.pull_request.head.sha }} | grep -q "uv.lock"; then
echo "uv_lock_changed=true" >> $GITHUB_OUTPUT
else
echo "uv_lock_changed=false" >> $GITHUB_OUTPUT
fi
- name: Check for pyproject.toml changes
id: check-pyproject
run: |
if git diff --name-only ${{ github.event.pull_request.base.sha }} ${{ github.event.pull_request.head.sha }} | grep -q "pyproject.toml"; then
echo "pyproject_changed=true" >> $GITHUB_OUTPUT
else
echo "pyproject_changed=false" >> $GITHUB_OUTPUT
fi
- name: Create PR comment
if: steps.check-uv-lock.outputs.uv_lock_changed == 'true' || steps.check-pyproject.outputs.pyproject_changed == 'true'
uses: actions/github-script@v7
with:
script: |
const uvLockChanged = ${{ steps.check-uv-lock.outputs.uv_lock_changed }};
const pyprojectChanged = ${{ steps.check-pyproject.outputs.pyproject_changed }};
let message = '📦 Dependencies Alert:\n\n';
if (uvLockChanged && pyprojectChanged) {
message += '- Both `uv.lock` and `pyproject.toml` have been modified\n';
} else if (uvLockChanged) {
message += '- `uv.lock` has been modified\n';
} else if (pyprojectChanged) {
message += '- `pyproject.toml` has been modified\n';
}
message += '\nPlease review these changes carefully to ensure they are intended (cc @sarahwooders @cpacker).';
// Await the request so a failed API call fails the step instead of being silently dropped
await github.rest.issues.createComment({
issue_number: context.issue.number,
owner: context.repo.owner,
repo: context.repo.repo,
body: message
});