feat: add background task notification system (#827)

Co-authored-by: Letta <noreply@letta.com>
Charles Packer
2026-02-04 22:45:16 -08:00
committed by GitHub
parent 84e9a6d744
commit 48ccd8f220
44 changed files with 2244 additions and 234 deletions

@@ -9,3 +9,6 @@ Use this tool when you need to ask the user questions during execution. This all
Usage notes:
- Users will always be able to select "Other" to provide custom text input
- Use multiSelect: true to allow multiple answers to be selected for a question
- If you recommend a specific option, make that the first option in the list and add "(Recommended)" at the end of the label
Plan mode note: In plan mode, use this tool to clarify requirements or choose between approaches BEFORE finalizing your plan. Do NOT use this tool to ask "Is my plan ready?" or "Should I proceed?" - use ExitPlanMode for plan approval.

@@ -23,9 +23,9 @@ Before executing the command, please follow these steps:
Usage notes:
- The command argument is required.
- You can specify an optional timeout in milliseconds (up to 600000ms / 10 minutes). If not specified, commands will timeout after 120000ms (2 minutes).
- It is very helpful if you write a clear, concise description of what this command does in 5-10 words.
- It is very helpful if you write a clear, concise description of what this command does. For simple commands, keep it brief (5-10 words). For complex commands (piped commands, obscure flags, or anything hard to understand at a glance), add enough context to clarify what it does.
- If the output exceeds 30000 characters, output will be truncated before being returned to you.
- You can use the `run_in_background` parameter to run the command in the background, which allows you to continue working while the command runs. You can monitor the output using the Bash tool as it becomes available. You do not need to use '&' at the end of the command when using this parameter.
- You can use the `run_in_background` parameter to run the command in the background. Only use this if you don't need the result immediately and are OK being notified when the command completes later. You do not need to check the output right away - you'll be notified when it finishes. You do not need to use '&' at the end of the command when using this parameter.
- Avoid using Bash with the `find`, `grep`, `cat`, `head`, `tail`, `sed`, `awk`, or `echo` commands, unless explicitly instructed or when these commands are truly necessary for the task. Instead, always prefer using the dedicated tools for these commands:
- File search: Use Glob (NOT find or ls)
@@ -56,12 +56,12 @@ Git Safety Protocol:
- NEVER run destructive/irreversible git commands (like push --force, hard reset, etc) unless the user explicitly requests them
- NEVER skip hooks (--no-verify, --no-gpg-sign, etc) unless the user explicitly requests it
- NEVER run force push to main/master, warn the user if they request it
- Avoid git commit --amend. ONLY use --amend when either (1) user explicitly requested amend OR (2) adding edits from pre-commit hook (additional instructions below)
- Before amending: ALWAYS check authorship (git log -1 --format='%an %ae')
- CRITICAL: Always create NEW commits rather than amending, unless the user explicitly requests a git amend. When a pre-commit hook fails, the commit did NOT happen — so --amend would modify the PREVIOUS commit, which may result in destroying work or losing previous changes. Instead, after hook failure, fix the issue, re-stage, and create a NEW commit
- When staging files, prefer adding specific files by name rather than using "git add -A" or "git add .", which can accidentally include sensitive files (.env, credentials) or large binaries
- NEVER commit changes unless the user explicitly asks you to. It is VERY IMPORTANT to only commit when explicitly asked, otherwise the user will feel that you are being too proactive.
1. You can call multiple tools in a single response. When multiple independent pieces of information are requested and all commands are likely to succeed, run multiple tool calls in parallel for optimal performance. Run the following bash commands in parallel, each using the Bash tool:
- Run a git status command to see all untracked files.
- Run a git status command to see all untracked files. IMPORTANT: Never use the -uall flag as it can cause memory issues on large repos.
- Run a git diff command to see both staged and unstaged changes that will be committed.
- Run a git log command to see recent commit messages, so that you can follow this repository's commit message style.
2. Analyze all staged changes (both previously staged and newly added) and draft a commit message:
@@ -77,16 +77,14 @@ Git Safety Protocol:
Co-Authored-By: Letta <noreply@letta.com>
- Run git status after the commit completes to verify success.
Note: git status depends on the commit completing, so run it sequentially after the commit.
4. If the commit fails due to pre-commit hook changes, retry ONCE. If it succeeds but files were modified by the hook, verify it's safe to amend:
- Check authorship: git log -1 --format='%an %ae'
- Check not pushed: git status shows "Your branch is ahead"
- If both true: amend your commit. Otherwise: create NEW commit (never amend other developers' commits)
4. If the commit fails due to pre-commit hook: fix the issue and create a NEW commit
Important notes:
- NEVER run additional commands to read or explore code, besides git bash commands
- NEVER use the TodoWrite or Task tools
- DO NOT push to the remote repository unless the user explicitly asks you to do so
- IMPORTANT: Never use git commands with the -i flag (like git rebase -i or git add -i) since they require interactive input which is not supported.
- IMPORTANT: Do not use --no-edit with git rebase commands, as the --no-edit flag is not a valid option for git rebase.
- If there are no changes to commit (i.e., no untracked files and no modifications), do not create an empty commit
- In order to ensure good formatting, ALWAYS pass the commit message via a HEREDOC, a la this example:
<example>

@@ -1,39 +1,46 @@
# EnterPlanMode
Use this tool when you encounter a complex task that requires careful planning and exploration before implementation. This tool transitions you into plan mode where you can thoroughly explore the codebase and design an implementation approach.
Use this tool proactively when you're about to start a non-trivial implementation task. Getting user sign-off on your approach before writing code prevents wasted effort and ensures alignment. This tool transitions you into plan mode where you can explore the codebase and design an implementation approach for user approval.
## When to Use This Tool
Use EnterPlanMode when ANY of these conditions apply:
**Prefer using EnterPlanMode** for implementation tasks unless they're simple. Use it when ANY of these conditions apply:
1. **Multiple Valid Approaches**: The task can be solved in several different ways, each with trade-offs
1. **New Feature Implementation**: Adding meaningful new functionality
- Example: "Add a logout button" - where should it go? What should happen on click?
- Example: "Add form validation" - what rules? What error messages?
2. **Multiple Valid Approaches**: The task can be solved in several different ways
- Example: "Add caching to the API" - could use Redis, in-memory, file-based, etc.
- Example: "Improve performance" - many optimization strategies possible
2. **Significant Architectural Decisions**: The task requires choosing between architectural patterns
3. **Code Modifications**: Changes that affect existing behavior or structure
- Example: "Update the login flow" - what exactly should change?
- Example: "Refactor this component" - what's the target architecture?
4. **Architectural Decisions**: The task requires choosing between patterns or technologies
- Example: "Add real-time updates" - WebSockets vs SSE vs polling
- Example: "Implement state management" - Redux vs Context vs custom solution
3. **Large-Scale Changes**: The task touches many files or systems
5. **Multi-File Changes**: The task will likely touch more than 2-3 files
- Example: "Refactor the authentication system"
- Example: "Migrate from REST to GraphQL"
- Example: "Add a new API endpoint with tests"
4. **Unclear Requirements**: You need to explore before understanding the full scope
6. **Unclear Requirements**: You need to explore before understanding the full scope
- Example: "Make the app faster" - need to profile and identify bottlenecks
- Example: "Fix the bug in checkout" - need to investigate root cause
5. **User Input Needed**: You'll need to ask clarifying questions before starting
- If you would use AskUserQuestion to clarify the approach, consider EnterPlanMode instead
7. **User Preferences Matter**: The implementation could reasonably go multiple ways
- If you would use AskUserQuestion to clarify the approach, use EnterPlanMode instead
- Plan mode lets you explore first, then present options with context
## When NOT to Use This Tool
Do NOT use EnterPlanMode for:
- Simple, straightforward tasks with obvious implementation
- Small bug fixes where the solution is clear
- Adding a single function or small feature
- Tasks you're already confident how to implement
- Research-only tasks (use the Task tool with explore agent instead)
Only skip EnterPlanMode for simple tasks:
- Single-line or few-line fixes (typos, obvious bugs, small tweaks)
- Adding a single function with clear requirements
- Tasks where the user has given very specific, detailed instructions
- Pure research/exploration tasks (use the Task tool with explore agent instead)
## What Happens in Plan Mode
@@ -49,7 +56,7 @@ In plan mode, you'll:
### GOOD - Use EnterPlanMode:
User: "Add user authentication to the app"
- This requires architectural decisions (session vs JWT, where to store tokens, middleware structure)
- Requires architectural decisions (session vs JWT, where to store tokens, middleware structure)
User: "Optimize the database queries"
- Multiple approaches possible, need to profile first, significant impact
@@ -57,6 +64,12 @@ User: "Optimize the database queries"
User: "Implement dark mode"
- Architectural decision on theme system, affects many components
User: "Add a delete button to the user profile"
- Seems simple but involves: where to place it, confirmation dialog, API call, error handling, state updates
User: "Update the error handling in the API"
- Affects multiple files, user should approve the approach
### BAD - Don't use EnterPlanMode:
User: "Fix the typo in the README"
- Straightforward, no planning needed
@@ -70,6 +83,5 @@ User: "What files handle routing?"
## Important Notes
- This tool REQUIRES user approval - they must consent to entering plan mode
- Be thoughtful about when to use it - unnecessary plan mode slows down simple tasks
- If unsure whether to use it, err on the side of starting implementation
- You can always ask the user "Would you like me to plan this out first?"
- If unsure whether to use it, err on the side of planning - it's better to get alignment upfront than to redo work
- Users appreciate being consulted before significant changes are made to their codebase

@@ -11,13 +11,12 @@ Use this tool when you are in plan mode and have finished writing your plan to t
## When to Use This Tool
IMPORTANT: Only use this tool when the task requires planning the implementation steps of a task that requires writing code. For research tasks where you're gathering information, searching files, reading files or in general trying to understand the codebase - do NOT use this tool.
## Handling Ambiguity in Plans
Before using this tool, ensure your plan is clear and unambiguous. If there are multiple valid approaches or unclear requirements:
1. Use the AskUserQuestion tool to clarify with the user
2. Ask about specific implementation choices (e.g., architectural patterns, which library to use)
3. Clarify any assumptions that could affect the implementation
4. Edit your plan file to incorporate user feedback
5. Only proceed with ExitPlanMode after resolving ambiguities and updating the plan file
## Before Using This Tool
Ensure your plan is complete and unambiguous:
- If you have unresolved questions about requirements or approach, use AskUserQuestion first (in earlier phases)
- Once your plan is finalized, use THIS tool to request approval
**Important:** Do NOT use AskUserQuestion to ask "Is this plan okay?" or "Should I proceed?" - that's exactly what THIS tool does. ExitPlanMode inherently requests user approval of your plan.
## Examples

@@ -4,6 +4,6 @@
- Supports glob patterns like "**/*.js" or "src/**/*.ts"
- Returns matching file paths sorted by modification time
- Use this tool when you need to find files by name patterns
- When you are doing an open ended search that may require multiple rounds of globbing and grepping, use the Agent tool instead
- When you are doing an open ended search that may require multiple rounds of globbing and grepping, use the Task tool instead
- You have the capability to call multiple tools in a single response. It is always better to speculatively perform multiple searches as a batch that are potentially useful.
- If more than 2,000 files match the pattern, only the first 2,000 will be returned

@@ -9,8 +9,8 @@ Usage:
- You can optionally specify a line offset and limit (especially handy for long files), but it's recommended to read the whole file by not providing these parameters
- Any lines longer than 2000 characters will be truncated
- Results are returned using cat -n format, with line numbers starting at 1
- This tool allows Letta Code to read images (PNG, JPG, JPEG, GIF, WEBP, BMP). When reading an image file the contents are presented visually as Letta Code is a multimodal LLM. Large images are automatically resized to fit within API limits.
- You will regularly be asked to read screenshots. If the user provides a path to a screenshot, ALWAYS use this tool to view the file at the path. This tool will work with all temporary file paths.
- This tool can only read files, not directories. To read a directory, use the ls command via Bash.
- This tool allows Letta Code to read images (e.g., PNG, JPG). When reading an image file, the contents are presented visually, as Letta Code is a multimodal LLM.
- This tool can only read files, not directories. To read a directory, use an ls command via the Bash tool.
- You can call multiple tools in a single response. It is always better to speculatively read multiple potentially useful files in parallel.
- You will regularly be asked to read screenshots. If the user provides a path to a screenshot, ALWAYS use this tool to view the file at the path. This tool will work with all temporary file paths.
- If you read a file that exists but has empty contents you will receive a system reminder warning in place of file contents.

@@ -4,54 +4,36 @@ Launch a new agent to handle complex, multi-step tasks autonomously.
The Task tool launches specialized agents (subprocesses) that autonomously handle complex tasks. Each agent type has specific capabilities and tools available to it.
## Usage
When using the Task tool, you must specify a subagent_type parameter to select which agent type to use.
The Task tool supports two commands:
## When NOT to use the Task tool:
### Run (default)
Launch a subagent to perform a task. Parameters:
- **subagent_type**: Which specialized agent to use (see Available Agents section)
- **prompt**: Detailed, self-contained instructions for the agent (agents cannot ask questions mid-execution)
- **description**: Short 3-5 word summary for tracking
- **model** (optional): Override the model for this agent
- **agent_id** (optional): Deploy an existing agent instead of creating a new one
- **conversation_id** (optional): Resume from an existing conversation
- If you want to read a specific file path, use the Read or Glob tool instead of the Task tool, to find the match more quickly
- If you are searching for a specific class definition like "class Foo", use the Glob tool instead, to find the match more quickly
- If you are searching for code within a specific file or set of 2-3 files, use the Read tool instead of the Task tool, to find the match more quickly
- Other tasks that are not related to the agent descriptions above
### Refresh
Re-scan the `.letta/agents/` directories to discover new or updated custom subagents:
```typescript
Task({ command: "refresh" })
```
Use this after creating or modifying custom subagent definitions.
## Usage notes:
## When to use this tool:
- **Codebase exploration**: Use when you need to search for files, understand code structure, or find specific patterns
- **Complex tasks**: Use when a task requires multiple steps and autonomous decision-making
- **Research tasks**: Use when you need to gather information from the codebase
- **Parallel work**: Launch multiple agents concurrently for independent tasks
## When NOT to use this tool:
- If you need to read a specific file path, use Read tool directly
- If you're searching for a specific class definition, use Glob tool directly
- If you're searching within 2-3 specific files, use Read tool directly
- For simple, single-step operations
## Important notes:
- **Stateless**: Each agent invocation is autonomous and returns a single final report
- **No back-and-forth**: You cannot communicate with agents during execution
- **Front-load instructions**: Provide complete task details upfront
- **Context-aware**: Agents see full conversation history and can reference earlier context
- **Parallel execution**: Launch multiple agents concurrently by calling Task multiple times in a single response
- **Specify return format**: Tell agents exactly what information to include in their report
- Always include a short description (3-5 words) summarizing what the agent will do
- Launch multiple agents concurrently whenever possible, to maximize performance; to do that, use a single message with multiple tool uses
- When the agent is done, it will return a single message back to you. The result returned by the agent is not visible to the user. To show the user the result, you should send a text message back to the user with a concise summary of the result.
- You can optionally run agents in the background using the run_in_background parameter. When an agent runs in the background, the tool result will include an output_file path. To check on the agent's progress or retrieve its results, use the Read tool to read the output file, or use Bash with `tail` to see recent output. You can continue working while background agents run.
- Agents can be resumed using the `conversation_id` parameter by passing the conversation ID from a previous invocation. When resumed, the agent continues with its full previous context preserved.
- When the agent is done, it will return a single message back to you along with its conversation ID. You can use this ID to resume the agent later if needed for follow-up work.
- Provide clear, detailed prompts so the agent can work autonomously and return exactly the information you need.
- Agents with "access to current context" can see the full conversation history before the tool call. When using these agents, you can write concise prompts that reference earlier context (e.g., "investigate the error discussed above") instead of repeating information. The agent will receive all prior messages and understand the context.
- The agent's outputs should generally be trusted
- Clearly tell the agent whether you expect it to write code or just to do research (search, file reads, web fetches, etc.), since it is not aware of the user's intent
- If the agent description mentions that it should be used proactively, then you should try your best to use it without the user having to ask for it first. Use your judgement.
- If the user specifies that they want you to run agents "in parallel", you MUST send a single message with multiple Task tool use content blocks. For example, if you need to launch multiple agents in parallel, send a single message with multiple Task tool calls.
## Deploying an Existing Agent
Instead of spawning a fresh subagent from a template, you can deploy an existing agent to work in your local codebase.
### Access Levels (subagent_type)
Use subagent_type to control what tools the deployed agent can access:
- **explore**: Read-only access (Read, Glob, Grep) - safer for exploration tasks
- **general-purpose**: Full read-write access (Bash, Edit, Write, etc.) - for implementation tasks
@@ -86,6 +68,7 @@ Task({
// Deploy agent with full access (default)
Task({
agent_id: "agent-abc123",
subagent_type: "general-purpose",
description: "Fix auth bug",
prompt: "Fix the bug in auth.ts"
})
@@ -98,15 +81,14 @@ Task({
})
```
## Examples:
## Example usage:
```typescript
// Good - specific and actionable with a user-specified model "gpt-5-low"
// Good - specific and actionable
Task({
subagent_type: "explore",
description: "Find authentication code",
prompt: "Search for all authentication-related code in src/. List file paths and the main auth approach used.",
model: "gpt-5-low"
prompt: "Search for all authentication-related code in src/. List file paths and the main auth approach used."
})
// Good - complex multi-step task
@@ -116,7 +98,7 @@ Task({
prompt: "Add email and password validation to the user registration form. Check existing validation patterns first, then implement consistent validation."
})
// Parallel execution - launch both at once
// Parallel execution - launch both at once in a single message
Task({ subagent_type: "explore", description: "Find frontend components", prompt: "..." })
Task({ subagent_type: "explore", description: "Find backend APIs", prompt: "..." })

@@ -0,0 +1,9 @@
# TaskOutput
- Retrieves output from a running or completed task (background shell, agent, or remote session)
- Takes a task_id parameter identifying the task
- Returns the task output along with status information
- Use block=true (default) to wait for task completion
- Use block=false for non-blocking check of current status
- Task IDs can be found using the /tasks command
- Works with all task types: background shells, async agents, and remote sessions
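The difference between `block=true` and `block=false` can be illustrated with a small polling loop. This is a minimal sketch, not the tool's actual implementation: the `tasks` registry, `waitWhileRunning`, and `readOutput` are hypothetical names, and the 100 ms poll interval and 30000 ms timeout are illustrative defaults.

```typescript
type Status = "running" | "completed" | "failed";

interface TrackedTask {
  status: Status;
  output: string[];
}

// Hypothetical in-memory registry standing in for the real task store.
const tasks = new Map<string, TrackedTask>();

// Resolve once the task leaves "running", disappears, or the timeout elapses.
function waitWhileRunning(
  taskId: string,
  timeoutMs: number,
  pollMs = 100,
): Promise<void> {
  const startedAt = Date.now();
  return new Promise<void>((resolve) => {
    const timer = setInterval(() => {
      const current = tasks.get(taskId);
      const finished = !current || current.status !== "running";
      if (finished || Date.now() - startedAt >= timeoutMs) {
        clearInterval(timer);
        resolve();
      }
    }, pollMs);
  });
}

// block=false returns whatever is available now; block=true waits first.
async function readOutput(
  taskId: string,
  block: boolean,
  timeoutMs = 30000,
): Promise<{ message: string; status?: Status }> {
  const task = tasks.get(taskId);
  if (!task) return { message: `No task found with ID: ${taskId}` };
  if (block && task.status === "running") {
    await waitWhileRunning(taskId, timeoutMs);
  }
  // Re-fetch, since the task may have changed or been removed while waiting.
  const current = tasks.get(taskId);
  if (!current) return { message: `Task ${taskId} no longer exists` };
  return {
    message: current.output.join("\n") || "(no output yet)",
    status: current.status,
  };
}
```

A non-blocking call on a running task returns immediately with `status: "running"`; a blocking call returns once the status changes or the timeout passes, so the caller should still check `status` rather than assume completion.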

@@ -0,0 +1,6 @@
# TaskStop
- Stops a running background task by its ID
- Takes a task_id parameter identifying the task to stop
- Returns a success or failure status
- Use this tool when you need to terminate a long-running task
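One plausible way to back a stop operation is an `AbortController` kept alongside each running task. The sketch below assumes that design; the `runningTasks` registry, the `stopTask` function, and the choice to mark a stopped task as `"failed"` are all hypothetical and may not match how TaskStop is actually implemented.

```typescript
type Status = "running" | "completed" | "failed";

interface StoppableTask {
  status: Status;
  abortController: AbortController;
}

// Hypothetical registry; the real store and field names may differ.
const runningTasks = new Map<string, StoppableTask>();

// Returns a success/failure result, mirroring the status the tool reports.
function stopTask(taskId: string): { success: boolean; message: string } {
  const task = runningTasks.get(taskId);
  if (!task) {
    return { success: false, message: `No task found with ID: ${taskId}` };
  }
  if (task.status !== "running") {
    return { success: false, message: `Task ${taskId} is not running` };
  }
  // Notify whatever work is listening on abortController.signal.
  task.abortController.abort();
  task.status = "failed"; // Assumption: a stopped task is recorded as failed.
  return { success: true, message: `Stopped task ${taskId}` };
}
```

Stopping is idempotent from the caller's point of view: a second call on the same ID reports failure instead of aborting twice.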

@@ -1,6 +1,11 @@
import { spawn } from "node:child_process";
import { INTERRUPTED_BY_USER } from "../../constants";
import { backgroundProcesses, getNextBashId } from "./process_manager.js";
import {
appendToOutputFile,
backgroundProcesses,
createBackgroundOutputFile,
getNextBashId,
} from "./process_manager.js";
import { getShellEnv } from "./shellEnv.js";
import { buildShellLaunchers } from "./shellLaunchers.js";
import { spawnWithLauncher } from "./shellRunner.js";
@@ -168,6 +173,7 @@ export async function bash(args: BashArgs): Promise<BashResult> {
if (run_in_background) {
const bashId = getNextBashId();
const outputFile = createBackgroundOutputFile(bashId);
const launcher = getBackgroundLauncher(command);
const [executable, ...launcherArgs] = launcher;
if (!executable) {
@@ -190,26 +196,35 @@ export async function bash(args: BashArgs): Promise<BashResult> {
exitCode: null,
lastReadIndex: { stdout: 0, stderr: 0 },
startTime: new Date(),
outputFile,
});
const bgProcess = backgroundProcesses.get(bashId);
if (!bgProcess) {
throw new Error("Failed to track background process state");
}
childProcess.stdout?.on("data", (data: Buffer) => {
const lines = data.toString().split("\n").filter(Boolean);
const text = data.toString();
const lines = text.split("\n").filter(Boolean);
bgProcess.stdout.push(...lines);
// Also write to output file
appendToOutputFile(outputFile, text);
});
childProcess.stderr?.on("data", (data: Buffer) => {
const lines = data.toString().split("\n").filter(Boolean);
const text = data.toString();
const lines = text.split("\n").filter(Boolean);
bgProcess.stderr.push(...lines);
// Also write to output file (prefixed with [stderr])
appendToOutputFile(outputFile, `[stderr] ${text}`);
});
childProcess.on("exit", (code: number | null) => {
bgProcess.status = code === 0 ? "completed" : "failed";
bgProcess.exitCode = code;
appendToOutputFile(outputFile, `\n[exit code: ${code}]\n`);
});
childProcess.on("error", (err: Error) => {
bgProcess.status = "failed";
bgProcess.stderr.push(err.message);
appendToOutputFile(outputFile, `\n[error] ${err.message}\n`);
});
if (timeout && timeout > 0) {
setTimeout(() => {
@@ -217,6 +232,7 @@ export async function bash(args: BashArgs): Promise<BashResult> {
childProcess.kill("SIGTERM");
bgProcess.status = "failed";
bgProcess.stderr.push(`Command timed out after ${timeout}ms`);
appendToOutputFile(outputFile, `\n[timeout after ${timeout}ms]\n`);
}
}, timeout);
}
@@ -224,7 +240,7 @@ export async function bash(args: BashArgs): Promise<BashResult> {
content: [
{
type: "text",
text: `Command running in background with ID: ${bashId}`,
text: `Command running in background with ID: ${bashId}\nOutput file: ${outputFile}`,
},
],
status: "success",
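In the handlers above, each stderr chunk is written to the output file as `[stderr] ${text}`, so when a chunk contains several lines only the first one carries the prefix. If per-line tagging is wanted, a helper along these lines could be used instead. This is a sketch only; `tagLines` is a hypothetical name and not part of the commit.

```typescript
// Prefix every non-empty line of a chunk so multi-line stderr output stays
// identifiable after it is interleaved with stdout in a single file.
function tagLines(chunk: string, tag: string): string {
  return chunk
    .split("\n")
    .map((line) => (line.length > 0 ? `${tag} ${line}` : line))
    .join("\n");
}

// Example: both lines receive the prefix, and the trailing newline is kept.
const tagged = tagLines("first problem\nsecond problem\n", "[stderr]");
```

Note that stream chunks are not guaranteed to end on line boundaries, so a line split across two chunks would still be tagged in the middle; buffering until a newline arrives would be needed to avoid that.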

@@ -1,27 +1,82 @@
import { backgroundProcesses } from "./process_manager.js";
import { backgroundProcesses, backgroundTasks } from "./process_manager.js";
import { LIMITS, truncateByChars } from "./truncation.js";
import { validateRequiredParams } from "./validation.js";
interface BashOutputArgs {
shell_id: string;
interface GetTaskOutputArgs {
task_id: string;
block?: boolean;
timeout?: number;
filter?: string;
}
interface BashOutputResult {
interface GetTaskOutputResult {
message: string;
status?: "running" | "completed" | "failed";
}
export async function bash_output(
args: BashOutputArgs,
): Promise<BashOutputResult> {
validateRequiredParams(args, ["shell_id"], "BashOutput");
const { shell_id, filter } = args;
const proc = backgroundProcesses.get(shell_id);
if (!proc)
return { message: `No background process found with ID: ${shell_id}` };
const stdout = proc.stdout.join("\n");
const stderr = proc.stderr.join("\n");
/**
* Core implementation for retrieving task/process output.
* Used by both BashOutput (legacy) and TaskOutput (new).
* Checks both backgroundProcesses (Bash) and backgroundTasks (Task).
*/
export async function getTaskOutput(
args: GetTaskOutputArgs,
): Promise<GetTaskOutputResult> {
const { task_id, block = false, timeout = 30000, filter } = args;
// Check backgroundProcesses first (for Bash background commands)
const proc = backgroundProcesses.get(task_id);
if (proc) {
return getProcessOutput(task_id, proc, block, timeout, filter);
}
// Check backgroundTasks (for Task background subagents)
const task = backgroundTasks.get(task_id);
if (task) {
return getBackgroundTaskOutput(task_id, task, block, timeout, filter);
}
return { message: `No background process found with ID: ${task_id}` };
}
/**
* Get output from a background Bash process.
*/
async function getProcessOutput(
task_id: string,
proc: typeof backgroundProcesses extends Map<string, infer V> ? V : never,
block: boolean,
timeout: number,
filter?: string,
): Promise<GetTaskOutputResult> {
// If blocking, wait for process to complete (or timeout)
if (block && proc.status === "running") {
const startTime = Date.now();
await new Promise<void>((resolve) => {
const checkInterval = setInterval(() => {
const currentProc = backgroundProcesses.get(task_id);
if (!currentProc || currentProc.status !== "running") {
clearInterval(checkInterval);
resolve();
} else if (Date.now() - startTime >= timeout) {
clearInterval(checkInterval);
resolve();
}
}, 100); // Check every 100ms
});
}
// Re-fetch in case status changed while waiting
const currentProc = backgroundProcesses.get(task_id);
if (!currentProc) {
return { message: `Process ${task_id} no longer exists` };
}
const stdout = currentProc.stdout.join("\n");
const stderr = currentProc.stderr.join("\n");
let text = stdout;
if (stderr) text = text ? `${text}\n${stderr}` : stderr;
if (filter) {
text = text
.split("\n")
@@ -31,13 +86,107 @@ export async function bash_output(
const userCwd = process.env.USER_CWD || process.cwd();
// Apply character limit to prevent excessive token usage (same as Bash)
// Apply character limit to prevent excessive token usage
const { content: truncatedOutput } = truncateByChars(
text || "(no output yet)",
LIMITS.BASH_OUTPUT_CHARS,
"BashOutput",
{ workingDirectory: userCwd, toolName: "BashOutput" },
"TaskOutput",
{ workingDirectory: userCwd, toolName: "TaskOutput" },
);
return { message: truncatedOutput };
return {
message: truncatedOutput,
status: currentProc.status,
};
}
/**
* Get output from a background Task (subagent).
*/
async function getBackgroundTaskOutput(
task_id: string,
task: typeof backgroundTasks extends Map<string, infer V> ? V : never,
block: boolean,
timeout: number,
filter?: string,
): Promise<GetTaskOutputResult> {
// If blocking, wait for task to complete (or timeout)
if (block && task.status === "running") {
const startTime = Date.now();
await new Promise<void>((resolve) => {
const checkInterval = setInterval(() => {
const currentTask = backgroundTasks.get(task_id);
if (!currentTask || currentTask.status !== "running") {
clearInterval(checkInterval);
resolve();
} else if (Date.now() - startTime >= timeout) {
clearInterval(checkInterval);
resolve();
}
}, 100); // Check every 100ms
});
}
// Re-fetch in case status changed while waiting
const currentTask = backgroundTasks.get(task_id);
if (!currentTask) {
return { message: `Task ${task_id} no longer exists` };
}
let text = currentTask.output.join("\n");
if (currentTask.error) {
text = text
? `${text}\n[error] ${currentTask.error}`
: `[error] ${currentTask.error}`;
}
if (filter) {
text = text
.split("\n")
.filter((line) => line.includes(filter))
.join("\n");
}
const userCwd = process.env.USER_CWD || process.cwd();
// Apply character limit to prevent excessive token usage
const { content: truncatedOutput } = truncateByChars(
text || "(no output yet)",
LIMITS.TASK_OUTPUT_CHARS,
"TaskOutput",
{ workingDirectory: userCwd, toolName: "TaskOutput" },
);
return {
message: truncatedOutput,
status: currentTask.status,
};
}
// Legacy BashOutput interface
interface BashOutputArgs {
shell_id: string;
filter?: string;
}
interface BashOutputResult {
message: string;
}
/**
* Legacy BashOutput function - wraps getTaskOutput with non-blocking behavior.
*/
export async function bash_output(
args: BashOutputArgs,
): Promise<BashOutputResult> {
validateRequiredParams(args, ["shell_id"], "BashOutput");
const { shell_id, filter } = args;
const result = await getTaskOutput({
task_id: shell_id,
block: false, // BashOutput is always non-blocking (legacy behavior)
filter,
});
return { message: result.message };
}
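The output assembly used above (join stdout, append stderr, then keep only lines containing the filter string) can be isolated as a pure function, which makes the behavior easy to check. This is a simplified sketch: `combineOutput` is a hypothetical name, truncation is omitted, and the substring match follows the `line.includes(filter)` call shown in `getBackgroundTaskOutput`.

```typescript
// Merge buffered stdout and stderr lines, optionally keeping only the lines
// that contain `filter` as a substring.
function combineOutput(
  stdout: string[],
  stderr: string[],
  filter?: string,
): string {
  const out = stdout.join("\n");
  const err = stderr.join("\n");
  let text = out;
  if (err) text = text ? `${text}\n${err}` : err;
  if (filter) {
    text = text
      .split("\n")
      .filter((line) => line.includes(filter))
      .join("\n");
  }
  // Same placeholder the diff passes to truncateByChars when nothing is available.
  return text || "(no output yet)";
}
```

Because the filter is applied after merging, it matches stderr lines as well as stdout lines, and a filter that matches nothing yields the placeholder text rather than an empty string.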

@@ -11,12 +11,22 @@ import {
getAllSubagentConfigs,
} from "../../agent/subagents";
import { spawnSubagent } from "../../agent/subagents/manager";
import { addToMessageQueue } from "../../cli/helpers/messageQueueBridge.js";
import {
completeSubagent,
generateSubagentId,
getSnapshot as getSubagentSnapshot,
registerSubagent,
} from "../../cli/helpers/subagentState.js";
import { formatTaskNotification } from "../../cli/helpers/taskNotifications.js";
import { runSubagentStopHooks } from "../../hooks";
import {
appendToOutputFile,
type BackgroundTask,
backgroundTasks,
createBackgroundOutputFile,
getNextTaskId,
} from "./process_manager.js";
import { LIMITS, truncateByChars } from "./truncation.js";
import { validateRequiredParams } from "./validation";
@@ -28,6 +38,8 @@ interface TaskArgs {
model?: string;
agent_id?: string; // Deploy an existing agent instead of creating new
conversation_id?: string; // Resume from an existing conversation
run_in_background?: boolean; // Run the task in background
max_turns?: number; // Maximum number of agentic turns
toolCallId?: string; // Injected by executeTool for linking subagent to parent tool call
signal?: AbortSignal; // Injected by executeTool for interruption handling
}
@@ -108,7 +120,184 @@ export async function task(args: TaskArgs): Promise<string> {
// Register subagent with state store for UI display
const subagentId = generateSubagentId();
registerSubagent(subagentId, subagent_type, description, toolCallId);
const isBackground = args.run_in_background ?? false;
registerSubagent(
subagentId,
subagent_type,
description,
toolCallId,
isBackground,
);
// Handle background execution
if (isBackground) {
const taskId = getNextTaskId();
const outputFile = createBackgroundOutputFile(taskId);
// Create abort controller for potential cancellation
const abortController = new AbortController();
// Register background task
const bgTask: BackgroundTask = {
description,
subagentType: subagent_type,
subagentId,
status: "running",
output: [],
startTime: new Date(),
outputFile,
abortController,
};
backgroundTasks.set(taskId, bgTask);
// Write initial status to output file
appendToOutputFile(
outputFile,
`[Task started: ${description}]\n[subagent_type: ${subagent_type}]\n\n`,
);
// Fire-and-forget: run subagent without awaiting
spawnSubagent(
subagent_type,
prompt,
model,
subagentId,
abortController.signal,
args.agent_id,
args.conversation_id,
args.max_turns,
)
.then((result) => {
// Update background task state
bgTask.status = result.success ? "completed" : "failed";
if (result.error) {
bgTask.error = result.error;
}
// Build output header
const header = [
`subagent_type=${subagent_type}`,
result.agentId ? `agent_id=${result.agentId}` : undefined,
result.conversationId
? `conversation_id=${result.conversationId}`
: undefined,
]
.filter(Boolean)
.join(" ");
// Write result to output file
if (result.success) {
appendToOutputFile(outputFile, `${header}\n\n${result.report}\n`);
bgTask.output.push(result.report || "");
} else {
appendToOutputFile(
outputFile,
`[error] ${result.error || "Subagent execution failed"}\n`,
);
}
appendToOutputFile(
outputFile,
`\n[Task ${result.success ? "completed" : "failed"}]\n`,
);
// Mark subagent as completed in state store
completeSubagent(subagentId, {
success: result.success,
error: result.error,
totalTokens: result.totalTokens,
});
const subagentSnapshot = getSubagentSnapshot();
const toolUses = subagentSnapshot.agents.find(
(agent) => agent.id === subagentId,
)?.toolCalls.length;
const durationMs = Math.max(0, Date.now() - bgTask.startTime.getTime());
// Build and truncate the result (same as foreground path)
const fullResult = result.success
? `${header}\n\n${result.report || ""}`
: result.error || "Subagent execution failed";
const userCwd = process.env.USER_CWD || process.cwd();
const { content: truncatedResult } = truncateByChars(
fullResult,
LIMITS.TASK_OUTPUT_CHARS,
"Task",
{ workingDirectory: userCwd, toolName: "Task" },
);
// Format and queue notification for auto-firing when idle
const notificationXml = formatTaskNotification({
taskId,
status: result.success ? "completed" : "failed",
summary: `Agent "${description}" ${result.success ? "completed" : "failed"}`,
result: truncatedResult,
outputFile,
usage: {
totalTokens: result.totalTokens,
toolUses,
durationMs,
},
});
addToMessageQueue({ kind: "task_notification", text: notificationXml });
// Run SubagentStop hooks (fire-and-forget)
runSubagentStopHooks(
subagent_type,
subagentId,
result.success,
result.error,
result.agentId,
result.conversationId,
).catch(() => {
// Silently ignore hook errors
});
})
.catch((error) => {
const errorMessage =
error instanceof Error ? error.message : String(error);
bgTask.status = "failed";
bgTask.error = errorMessage;
appendToOutputFile(outputFile, `[error] ${errorMessage}\n`);
// Mark subagent as completed with error
completeSubagent(subagentId, { success: false, error: errorMessage });
const subagentSnapshot = getSubagentSnapshot();
const toolUses = subagentSnapshot.agents.find(
(agent) => agent.id === subagentId,
)?.toolCalls.length;
const durationMs = Math.max(0, Date.now() - bgTask.startTime.getTime());
// Format and queue notification for auto-firing when idle
const notificationXml = formatTaskNotification({
taskId,
status: "failed",
summary: `Agent "${description}" failed`,
result: errorMessage,
outputFile,
usage: {
toolUses,
durationMs,
},
});
addToMessageQueue({ kind: "task_notification", text: notificationXml });
// Run SubagentStop hooks for error case
runSubagentStopHooks(
subagent_type,
subagentId,
false,
errorMessage,
args.agent_id,
args.conversation_id,
).catch(() => {
// Silently ignore hook errors
});
});
// Return immediately with task ID and output file
return `Task running in background with ID: ${taskId}\nOutput file: ${outputFile}`;
}
try {
const result = await spawnSubagent(
@@ -119,6 +308,7 @@ export async function task(args: TaskArgs): Promise<string> {
signal,
args.agent_id,
args.conversation_id,
args.max_turns,
);
// Mark subagent as completed in state store
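
The background branch above follows a fire-and-forget shape: register the task, start the work without awaiting it, and let the promise handlers update state and queue a notification. A minimal sketch of that shape, with hypothetical names (`startInBackground`, `demoTasks`, `demoQueue`) standing in for the real subagent manager, task registry, and message queue:

```typescript
// Hypothetical, simplified model of a fire-and-forget background task.
type Status = "running" | "completed" | "failed";

interface DemoTask {
  status: Status;
  error?: string;
}

interface DemoResult {
  success: boolean;
  report?: string;
  error?: string;
}

const demoTasks = new Map<string, DemoTask>();
const demoQueue: string[] = []; // stands in for the notification message queue

function startInBackground(
  id: string,
  work: () => Promise<DemoResult>,
): string {
  const entry: DemoTask = { status: "running" };
  demoTasks.set(id, entry);

  // Deliberately not awaited: the caller gets the ID back immediately.
  work()
    .then((result) => {
      entry.status = result.success ? "completed" : "failed";
      if (result.error) entry.error = result.error;
      demoQueue.push(`${id} ${entry.status}`);
    })
    .catch((err: unknown) => {
      entry.status = "failed";
      entry.error = err instanceof Error ? err.message : String(err);
      demoQueue.push(`${id} failed`);
    });

  return `Task running in background with ID: ${id}`;
}

const reply = startInBackground("task_1", async () => ({
  success: true,
  report: "done",
}));
// The handlers run later as microtasks, so the task is still "running" here.
const statusRightAfterStart = demoTasks.get("task_1")?.status;
console.log(reply, statusRightAfterStart);
```

Because `.catch` is chained after `.then`, an exception thrown inside the `.then` handler is also routed to the `.catch` handler, so in this shape a bookkeeping error can mark a task as failed even when the work itself succeeded.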

@@ -0,0 +1,30 @@
import { getTaskOutput } from "./BashOutput.js";
import { validateRequiredParams } from "./validation.js";
interface TaskOutputArgs {
task_id: string;
block?: boolean;
timeout?: number;
}
interface TaskOutputResult {
message: string;
status?: "running" | "completed" | "failed";
}
/**
* TaskOutput - retrieves output from a running or completed background task.
* Blocks until the task completes by default (block = true), waiting at most `timeout` ms (default 30000).
*/
export async function task_output(
args: TaskOutputArgs,
): Promise<TaskOutputResult> {
validateRequiredParams(args, ["task_id"], "TaskOutput");
const { task_id, block = true, timeout = 30000 } = args;
return getTaskOutput({
task_id,
block,
timeout,
});
}
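
The wrapper above only fills in defaults and delegates to `getTaskOutput`, whose waiting logic is not shown in this diff. The sketch below illustrates one way blocking with a timeout could work. `withDefaults` and `waitForStatus` are hypothetical helpers, and clamping an out-of-range timeout (instead of rejecting it) is an assumption; only the default values and the 0 to 600000 ms bounds come from the TaskOutput schema.

```typescript
// Defaults and bounds mirror the TaskOutput schema; clamping (rather than rejecting)
// out-of-range timeouts is an assumption made for this sketch.
type TaskStatus = "running" | "completed" | "failed";

interface OutputArgs {
  task_id: string;
  block?: boolean;
  timeout?: number;
}

function withDefaults(args: OutputArgs): Required<OutputArgs> {
  const { task_id, block = true, timeout = 30000 } = args;
  return { task_id, block, timeout: Math.min(Math.max(timeout, 0), 600000) };
}

// Hypothetical polling loop: resolves once the task leaves "running" or the deadline passes.
async function waitForStatus(
  readStatus: () => TaskStatus,
  timeoutMs: number,
  pollMs = 50,
): Promise<TaskStatus> {
  const deadline = Date.now() + timeoutMs;
  while (readStatus() === "running" && Date.now() < deadline) {
    await new Promise<void>((resolve) => setTimeout(resolve, pollMs));
  }
  return readStatus();
}

// Simulate a task that finishes shortly after we start waiting.
let demoStatus: TaskStatus = "running";
setTimeout(() => {
  demoStatus = "completed";
}, 20);

const resolved = withDefaults({ task_id: "task_1" });
waitForStatus(() => demoStatus, resolved.timeout, 5).then((status) => {
  console.log(`${resolved.task_id}: ${status}`);
});
```

If the deadline passes first, the helper returns `"running"`, which lets the caller report partial output instead of treating the timeout as an error.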

@@ -0,0 +1,37 @@
import { kill_bash } from "./KillBash.js";
import { backgroundTasks } from "./process_manager.js";
import { validateRequiredParams } from "./validation.js";
interface TaskStopArgs {
task_id?: string;
shell_id?: string; // deprecated, for backwards compatibility
}
interface TaskStopResult {
killed: boolean;
}
export async function task_stop(args: TaskStopArgs): Promise<TaskStopResult> {
// Support both task_id and deprecated shell_id
let id = args.task_id ?? args.shell_id;
if (!id) {
validateRequiredParams(args, ["task_id"], "TaskStop");
id = ""; // unreachable, validateRequiredParams throws
}
// Check if this is a background Task (subagent)
const task = backgroundTasks.get(id);
if (task) {
if (task.status === "running" && task.abortController) {
task.abortController.abort();
task.status = "failed";
task.error = "Aborted by user";
return { killed: true };
}
// Task exists but isn't running or doesn't have abort controller
return { killed: false };
}
// Fall back to killing a Bash background process
return kill_bash({ shell_id: id });
}
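
The stop path above relies on the standard `AbortController` API: the controller is stored with the task, and stopping the task calls `abort()` on it. A self-contained sketch of the same idea follows; `registry` and `stopTask` are hypothetical names, and the fallback to `kill_bash` is only noted in a comment.

```typescript
// Hypothetical registry; mirrors the shape of the stop logic, not the project's code.
interface StoppableTask {
  status: "running" | "completed" | "failed";
  error?: string;
  abortController?: AbortController;
}

const registry = new Map<string, StoppableTask>();

function stopTask(id: string): { killed: boolean } {
  const entry = registry.get(id);
  if (!entry) {
    // The real tool falls back to killing a Bash background process here.
    return { killed: false };
  }
  if (entry.status === "running" && entry.abortController) {
    entry.abortController.abort(); // signals any listener holding controller.signal
    entry.status = "failed";
    entry.error = "Aborted by user";
    return { killed: true };
  }
  // Task exists but is not running, or has no controller to abort.
  return { killed: false };
}

const controller = new AbortController();
registry.set("task_1", { status: "running", abortController: controller });

const firstStop = stopTask("task_1");
const secondStop = stopTask("task_1"); // already stopped, so nothing to kill
console.log(firstStop.killed, secondStop.killed, controller.signal.aborted);
```

Aborting only signals the work; the code holding `controller.signal` still has to observe it and stop, so the task's promise may settle some time after `stopTask` returns.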

@@ -7,8 +7,65 @@ export interface BackgroundProcess {
exitCode: number | null;
lastReadIndex: { stdout: number; stderr: number };
startTime?: Date;
outputFile?: string; // File path for persistent output
}
export interface BackgroundTask {
description: string;
subagentType: string;
subagentId: string;
status: "running" | "completed" | "failed";
output: string[];
error?: string;
startTime: Date;
outputFile: string;
abortController?: AbortController;
}
export const backgroundProcesses = new Map<string, BackgroundProcess>();
export const backgroundTasks = new Map<string, BackgroundTask>();
let bashIdCounter = 1;
export const getNextBashId = () => `bash_${bashIdCounter++}`;
let taskIdCounter = 1;
export const getNextTaskId = () => `task_${taskIdCounter++}`;
/**
* Get the directory for background task output files.
* Uses LETTA_SCRATCHPAD if set, otherwise falls back to a "letta-background" directory under os.tmpdir().
*/
export function getBackgroundOutputDir(): string {
const scratchpad = process.env.LETTA_SCRATCHPAD;
if (scratchpad) {
return scratchpad;
}
// Fall back to system temp with a letta-specific subdirectory
const os = require("node:os");
const path = require("node:path");
return path.join(os.tmpdir(), "letta-background");
}
/**
* Create an empty output file for a background process/task and return its path.
*/
export function createBackgroundOutputFile(id: string): string {
const fs = require("node:fs");
const path = require("node:path");
const dir = getBackgroundOutputDir();
// Ensure directory exists
fs.mkdirSync(dir, { recursive: true });
const filePath = path.join(dir, `${id}.log`);
// Create empty file
fs.writeFileSync(filePath, "");
return filePath;
}
/**
* Append content to a background output file.
*/
export function appendToOutputFile(filePath: string, content: string): void {
const fs = require("node:fs");
fs.appendFileSync(filePath, content);
}
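
A short sketch of how the ID counter and the output-file helpers above fit together: allocate an ID, create an empty log named after it, append status lines, and read the log back. `demoDir`, `nextDemoId`, and `createLog` are hypothetical names; the sketch writes to a throwaway directory created with `fs.mkdtempSync` and does not use `LETTA_SCRATCHPAD`.

```typescript
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

// Throwaway directory for the demo (an assumption; the real code uses
// LETTA_SCRATCHPAD or a "letta-background" directory under os.tmpdir()).
const demoDir = fs.mkdtempSync(path.join(os.tmpdir(), "bg-demo-"));

let demoCounter = 1;
const nextDemoId = (): string => `task_${demoCounter++}`;

function createLog(dir: string, id: string): string {
  fs.mkdirSync(dir, { recursive: true }); // no error if the directory already exists
  const filePath = path.join(dir, `${id}.log`);
  fs.writeFileSync(filePath, ""); // start with an empty file
  return filePath;
}

const logPath = createLog(demoDir, nextDemoId());
fs.appendFileSync(logPath, "[Task started: demo]\n");
fs.appendFileSync(logPath, "[Task completed]\n");

const logText = fs.readFileSync(logPath, "utf8");
console.log(path.basename(logPath));
console.log(logText);

// Clean up the demo directory.
fs.rmSync(demoDir, { recursive: true, force: true });
```

Since the file name is derived from the ID alone, the path is unique only as long as the IDs are; a counter that restarts at 1 in a new process would reuse `task_1.log` in the same directory.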

@@ -63,13 +63,13 @@ export function getInternalToolName(serverName: string): string {
export const ANTHROPIC_DEFAULT_TOOLS: ToolName[] = [
"AskUserQuestion",
"Bash",
"BashOutput",
"TaskOutput",
"Edit",
"EnterPlanMode",
"ExitPlanMode",
"Glob",
"Grep",
"KillBash",
"TaskStop",
// "MultiEdit",
// "LS",
"Read",
@@ -149,12 +149,14 @@ const TOOL_PERMISSIONS: Record<ToolName, { requiresApproval: boolean }> = {
AskUserQuestion: { requiresApproval: true },
Bash: { requiresApproval: true },
BashOutput: { requiresApproval: false },
TaskOutput: { requiresApproval: false },
Edit: { requiresApproval: true },
EnterPlanMode: { requiresApproval: true },
ExitPlanMode: { requiresApproval: false },
Glob: { requiresApproval: false },
Grep: { requiresApproval: false },
KillBash: { requiresApproval: true },
TaskStop: { requiresApproval: true },
LS: { requiresApproval: false },
MultiEdit: { requiresApproval: true },
Read: { requiresApproval: false },

@@ -37,6 +37,7 @@
},
"multiSelect": {
"type": "boolean",
"default": false,
"description": "Set to true to allow the user to select multiple options instead of just one. Use when choices are not mutually exclusive."
}
},

@@ -11,11 +11,11 @@
},
"description": {
"type": "string",
"description": "Clear, concise description of what this command does in 5-10 words, in active voice. Examples:\nInput: ls\nOutput: List files in current directory\n\nInput: git status\nOutput: Show working tree status\n\nInput: npm install\nOutput: Install package dependencies\n\nInput: mkdir foo\nOutput: Create directory 'foo'"
"description": "Clear, concise description of what this command does in active voice. Never use words like \"complex\" or \"risk\" in the description - just describe what it does.\n\nFor simple commands (git, npm, standard CLI tools), keep it brief (5-10 words):\n- ls → \"List files in current directory\"\n- git status → \"Show working tree status\"\n- npm install → \"Install package dependencies\"\n\nFor commands that are harder to parse at a glance (piped commands, obscure flags, etc.), add enough context to clarify what it does:\n- find . -name \"*.tmp\" -exec rm {} \\; → \"Find and delete all .tmp files recursively\"\n- git reset --hard origin/main → \"Discard all local changes and match remote main\"\n- curl -s url | jq '.data[]' → \"Fetch JSON from URL and extract data array elements\""
},
"run_in_background": {
"type": "boolean",
"description": "Set to true to run this command in the background. Use BashOutput to read the output later."
"description": "Set to true to run this command in the background. Use TaskOutput to read the output later."
}
},
"required": ["command"],

@@ -27,6 +27,10 @@
"description": "Number of lines to show after each match (rg -A). Requires output_mode: \"content\", ignored otherwise."
},
"-C": {
"type": "number",
"description": "Alias for context."
},
"context": {
"type": "number",
"description": "Number of lines to show before and after each match (rg -C). Requires output_mode: \"content\", ignored otherwise."
},
@@ -44,7 +48,7 @@
},
"head_limit": {
"type": "number",
"description": "Limit output to first N lines/entries, equivalent to \"| head -N\". Works across all output modes: content (limits output lines), files_with_matches (limits file paths), count (limits count entries). Defaults to 100 (0 = unlimited)."
"description": "Limit output to first N lines/entries, equivalent to \"| head -N\". Works across all output modes: content (limits output lines), files_with_matches (limits file paths), count (limits count entries). Defaults to 0 (unlimited)."
},
"offset": {
"type": "number",

@@ -1,28 +1,41 @@
{
"type": "object",
"properties": {
"command": {
"description": {
"type": "string",
"enum": ["run", "refresh"],
"description": "The operation to perform: \"run\" to spawn a subagent (default), \"refresh\" to re-scan the .letta/agents/ directories and update the available subagents list"
},
"subagent_type": {
"type": "string",
"description": "The type of specialized agent to use. Available agents are discovered from .letta/agents/ directory. Required for \"run\" command."
"description": "A short (3-5 word) description of the task"
},
"prompt": {
"type": "string",
"description": "The task for the agent to perform. Required for \"run\" command."
"description": "The task for the agent to perform"
},
"description": {
"subagent_type": {
"type": "string",
"description": "A short (3-5 word) description of the task. Required for \"run\" command."
"description": "The type of specialized agent to use for this task"
},
"model": {
"type": "string",
"description": "Optional model to use for this agent. If not specified, uses the recommended model for the subagent type."
"description": "Optional model to use for this agent. If not specified, inherits from parent. Prefer lighter models for quick, straightforward tasks to minimize cost and latency."
},
"run_in_background": {
"type": "boolean",
"description": "Set to true to run this agent in the background. The tool result will include an output_file path - use Read tool or Bash tail to check on output."
},
"agent_id": {
"type": "string",
"description": "Deploy an existing agent instead of creating a new one. Starts a new conversation with that agent."
},
"conversation_id": {
"type": "string",
"description": "Resume from an existing conversation. Does NOT require agent_id (conversation IDs are unique and encode the agent)."
},
"max_turns": {
"type": "integer",
"exclusiveMinimum": 0,
"description": "Maximum number of agentic turns (API round-trips) before stopping."
}
},
"required": ["description", "prompt", "subagent_type"],
"additionalProperties": false,
"$schema": "http://json-schema.org/draft-07/schema#"
}
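
For orientation, here is what a call matching the new Task schema might look like, together with a small hand-written check of the required fields and the `max_turns` constraint. The field names come from the schema; `checkTaskCall`, the example values, and the `"explore"` subagent type are hypothetical, and a real deployment would presumably validate against the JSON schema itself.

```typescript
// Field names come from the Task schema; the checker is a hypothetical illustration.
interface TaskCall {
  description: string;
  prompt: string;
  subagent_type: string;
  model?: string;
  run_in_background?: boolean;
  agent_id?: string;
  conversation_id?: string;
  max_turns?: number;
}

function checkTaskCall(call: Partial<TaskCall>): string[] {
  const problems: string[] = [];
  for (const key of ["description", "prompt", "subagent_type"] as const) {
    if (typeof call[key] !== "string") problems.push(`missing ${key}`);
  }
  // "type": "integer" with "exclusiveMinimum": 0 means a whole number greater than zero.
  if (
    call.max_turns !== undefined &&
    (!Number.isInteger(call.max_turns) || call.max_turns <= 0)
  ) {
    problems.push("max_turns must be a positive integer");
  }
  return problems;
}

const goodCall: TaskCall = {
  description: "Audit flaky tests",
  prompt: "Find tests that fail intermittently and summarize likely causes.",
  subagent_type: "explore", // hypothetical subagent type name
  run_in_background: true,
  max_turns: 10,
};
const badCall = { prompt: "no description given", max_turns: 0 };

console.log(checkTaskCall(goodCall));
console.log(checkTaskCall(badCall));
```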

@@ -0,0 +1,24 @@
{
"type": "object",
"properties": {
"task_id": {
"type": "string",
"description": "The task ID to get output from"
},
"block": {
"type": "boolean",
"default": true,
"description": "Whether to wait for completion"
},
"timeout": {
"type": "number",
"default": 30000,
"minimum": 0,
"maximum": 600000,
"description": "Max wait time in ms"
}
},
"required": ["task_id"],
"additionalProperties": false,
"$schema": "http://json-schema.org/draft-07/schema#"
}

@@ -0,0 +1,16 @@
{
"type": "object",
"properties": {
"task_id": {
"type": "string",
"description": "The ID of the background task to stop"
},
"shell_id": {
"type": "string",
"description": "Deprecated: use task_id instead"
}
},
"required": ["task_id"],
"additionalProperties": false,
"$schema": "http://json-schema.org/draft-07/schema#"
}
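
The TaskStop implementation accepts either `task_id` or the deprecated `shell_id`, while this schema lists only `task_id` as required. A minimal sketch of the ID resolution, using a hypothetical `resolveStopId` helper and an assumed error message:

```typescript
// Hypothetical helper illustrating "prefer task_id, fall back to shell_id".
interface StopArgs {
  task_id?: string;
  shell_id?: string; // deprecated alias
}

function resolveStopId(args: StopArgs): string {
  const id = args.task_id ?? args.shell_id;
  if (!id) {
    // Assumed wording; the real code delegates to validateRequiredParams.
    throw new Error("TaskStop: missing required parameter task_id");
  }
  return id;
}

const preferred = resolveStopId({ task_id: "task_2", shell_id: "bash_1" });
const legacy = resolveStopId({ shell_id: "bash_1" });

let missingIdError = "";
try {
  resolveStopId({});
} catch (err) {
  missingIdError = err instanceof Error ? err.message : String(err);
}
console.log(preferred, legacy, missingIdError);
```

Whether a `shell_id`-only call ever reaches the implementation depends on how strictly the schema is enforced before dispatch, which this diff does not show.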

@@ -27,6 +27,8 @@ import ShellDescription from "./descriptions/Shell.md";
import ShellCommandDescription from "./descriptions/ShellCommand.md";
import SkillDescription from "./descriptions/Skill.md";
import TaskDescription from "./descriptions/Task.md";
import TaskOutputDescription from "./descriptions/TaskOutput.md";
import TaskStopDescription from "./descriptions/TaskStop.md";
import TodoWriteDescription from "./descriptions/TodoWrite.md";
import UpdatePlanDescription from "./descriptions/UpdatePlan.md";
import ViewImageDescription from "./descriptions/ViewImage.md";
@@ -62,6 +64,8 @@ import { shell } from "./impl/Shell";
import { shell_command } from "./impl/ShellCommand";
import { skill } from "./impl/Skill";
import { task } from "./impl/Task";
import { task_output } from "./impl/TaskOutput";
import { task_stop } from "./impl/TaskStop";
import { todo_write } from "./impl/TodoWrite";
import { update_plan } from "./impl/UpdatePlan";
import { view_image } from "./impl/ViewImage";
@@ -97,6 +101,8 @@ import ShellSchema from "./schemas/Shell.json";
import ShellCommandSchema from "./schemas/ShellCommand.json";
import SkillSchema from "./schemas/Skill.json";
import TaskSchema from "./schemas/Task.json";
import TaskOutputSchema from "./schemas/TaskOutput.json";
import TaskStopSchema from "./schemas/TaskStop.json";
import TodoWriteSchema from "./schemas/TodoWrite.json";
import UpdatePlanSchema from "./schemas/UpdatePlan.json";
import ViewImageSchema from "./schemas/ViewImage.json";
@@ -158,6 +164,16 @@ const toolDefinitions = {
description: KillBashDescription.trim(),
impl: kill_bash as unknown as ToolImplementation,
},
TaskOutput: {
schema: TaskOutputSchema,
description: TaskOutputDescription.trim(),
impl: task_output as unknown as ToolImplementation,
},
TaskStop: {
schema: TaskStopSchema,
description: TaskStopDescription.trim(),
impl: task_stop as unknown as ToolImplementation,
},
LS: {
schema: LSSchema,
description: LSDescription.trim(),