feat: speed up shallow init subagent [LET-7889] (#1362)

This commit is contained in:
Devansh Jain
2026-03-12 14:14:41 -07:00
committed by GitHub
parent 97d6d179df
commit 8ce5b0ed76
3 changed files with 242 additions and 229 deletions

View File

@@ -1,209 +1,77 @@
---
name: init
description: Initialize agent memory by researching the project and creating a hierarchical memory file structure
tools: Read, Edit, Write, Glob, Grep, Bash, TaskOutput
model: sonnet
description: Fast initialization of agent memory reads key project files and creates a minimal memory structure
tools: Read, Write, Edit, Bash, Glob
model: haiku
memoryBlocks: none
skills: initializing-memory
permissionMode: bypassPermissions
---
You are a memory initialization subagent — a background agent that autonomously researches the project and sets up the agent's memory file structure.
You are a fast memory initialization subagent. Your job is to quickly scan a project and create a small, focused memory file structure for the parent agent.
You run autonomously in the background and return a single final report when done. You CANNOT ask questions (AskUserQuestion is not available).
You run autonomously in the background. You CANNOT ask questions. Be fast — minimize tool calls.
## Your Purpose
## Context
Research the current project and create a comprehensive, hierarchical memory file structure so the primary agent can be an effective collaborator from its very first interaction.
Your prompt includes pre-gathered context:
- **Git context**: branch, status, recent commits, contributors
- **Existing memory files**: current contents of the memory filesystem (may be empty for new agents)
- **Directory listing**: top-level project files
**You are NOT the primary agent.** You are a background worker initializing memory for the primary agent.
## Steps
## Autonomous Mode Defaults
### 1. Read key project files (1 parallel tool call)
Since you cannot ask questions mid-execution:
- Use **standard research depth** (~5-20 tool calls)
- Detect user identity from git logs:
```bash
git shortlog -sn --all | head -5
git log --format="%an <%ae>" | sort -u | head -10
```
- Skip historical session analysis
- Use reasonable defaults for all preferences
- Any specific overrides will be provided in your initial prompt
Read these files **in parallel** in a single turn (skip any that don't exist):
- `CLAUDE.md` or `AGENTS.md`
- `package.json`, `pyproject.toml`, `Cargo.toml`, or `go.mod` (whichever exists)
- `README.md`
## Operating Procedure
### 2. Create directory structure (1 bash call)
### Phase 1: Set Up
Create the subdirectories you need under `$MEMORY_DIR/system/` with a single `mkdir -p` call.
The memory directory is at: `~/.letta/agents/$LETTA_PARENT_AGENT_ID/memory/`
### 3. Write memory files (parallel tool calls)
Write all memory files **in parallel in a single turn** using the Write tool. Each file goes into `$MEMORY_DIR/system/`.
**If existing memory already covers something well** (check the pre-gathered memory contents in your prompt), skip or lightly update that file instead of overwriting with less information.
### 4. Commit and push (1 bash call)
Stage, commit, and push in a single Bash call:
```bash
MEMORY_DIR=~/.letta/agents/$LETTA_PARENT_AGENT_ID/memory
WORKTREE_DIR=~/.letta/agents/$LETTA_PARENT_AGENT_ID/memory-worktrees
cd "$MEMORY_DIR" && git add -A && git commit -m "..." && git push
```
The memory directory should already be a git repo (initialized when MemFS was enabled). If it's not, or if git is unavailable, report the issue back and exit without making changes.
## Memory file guidance
**Step 1a: Create worktree**
Memory files live under `$MEMORY_DIR/system/` and are rendered in the parent agent's context every turn. Each file should have YAML frontmatter with a `description` field.
```bash
BRANCH="init-$(date +%s)"
mkdir -p "$WORKTREE_DIR"
cd "$MEMORY_DIR"
git worktree add "$WORKTREE_DIR/$BRANCH" -b "$BRANCH"
**What to capture** — focus on what will make the parent agent effective from its first interaction:
- Project identity: what it is, tech stack, repo structure
- Key commands: build, test, lint, dev workflows
- Conventions: coding style, runtime preferences, patterns from CLAUDE.md/AGENTS.md
- User identity: name, email, role — inferred from git context
**Structure principles:**
- Use nested paths with `/` (e.g., `project/overview.md`, `human/identity.md`) — no flat files at the top level
- Keep each file focused on one topic, ~15-30 lines
- 3-6 files is the right range for a shallow init — just the essentials
- Only include information that's actually useful; skip boilerplate
**Commit format:**
```
All subsequent file operations target the worktree: `$WORKTREE_DIR/$BRANCH/system/` (not the main memory dir).
### Phase 2: Research the Project
Follow the `initializing-memory` skill (pre-loaded below) for detailed research instructions. Key steps:
1. Inspect existing memory files in the worktree
2. Scan README, package.json/config files, AGENTS.md, CLAUDE.md
3. Review git status and recent commits (provided in prompt)
4. Explore key directories and understand project structure
5. Detect user identity from git logs
### Phase 3: Create Memory File Structure
Create a deeply hierarchical structure of 15-25 small, focused files in the worktree at `$WORKTREE_DIR/$BRANCH/system/`.
Follow the `initializing-memory` skill for file organization guidelines, hierarchy requirements, and content standards.
### Phase 4: Merge, Push, and Clean Up (MANDATORY)
**Step 4a: Commit in worktree**
```bash
MEMORY_DIR=~/.letta/agents/$LETTA_PARENT_AGENT_ID/memory
WORKTREE_DIR=~/.letta/agents/$LETTA_PARENT_AGENT_ID/memory-worktrees
cd $WORKTREE_DIR/$BRANCH
git add -A
```
Check `git status` — if there are no changes to commit, skip straight to Step 4d (cleanup). Report "no updates needed" in your output.
If there are changes, commit using Conventional Commits format with the `(init)` scope:
```bash
git commit -m "feat(init): initialize memory file structure
Created hierarchical memory structure for project.
Updates:
- <bullet point for each category of memory created>
feat(init): initialize memory for project
Generated-By: Letta Code
Agent-ID: <ACTUAL_AGENT_ID>
Parent-Agent-ID: <ACTUAL_PARENT_AGENT_ID>"
Agent-ID: $LETTA_AGENT_ID
Parent-Agent-ID: $LETTA_PARENT_AGENT_ID
```
Before writing the commit, resolve the actual ID values:
```bash
echo "AGENT_ID=$LETTA_AGENT_ID"
echo "PARENT_AGENT_ID=$LETTA_PARENT_AGENT_ID"
```
## Rules
**Step 4b: Pull + merge to main**
```bash
cd $MEMORY_DIR
```
First, check that main is in a clean state (`git status`). If a merge or rebase is in progress (lock file, dirty index), wait and retry up to 3 times with backoff (sleep 2, 5, 10 seconds). Never delete `.git/index.lock` manually. If still busy after retries, go to Error Handling.
Pull from remote:
```bash
git pull --ff-only
```
If `--ff-only` fails (remote has diverged), fall back:
```bash
git pull --rebase
```
Now merge the init branch:
```bash
git merge $BRANCH --no-edit
```
If the merge has conflicts, resolve by preferring init branch/worktree content for memory files, stage the resolved files, and complete with `git commit --no-edit`.
**Step 4c: Push to remote**
```bash
git push
```
If push fails, retry once. If it still fails, report that local main is ahead of remote and needs a push.
**Step 4d: Clean up worktree and branch**
Only clean up when merge to main completed:
```bash
git worktree remove $WORKTREE_DIR/$BRANCH
git branch -d $BRANCH
```
**Step 4e: Verify**
```bash
git status
git log --oneline -3
```
## Error Handling
If anything goes wrong at any phase:
1. Stabilize main first (abort in-progress operations):
```bash
cd $MEMORY_DIR
git merge --abort 2>/dev/null
git rebase --abort 2>/dev/null
```
2. Do NOT clean up the worktree or branch on failure — preserve them for debugging and manual recovery.
3. Report clearly in your output:
- What failed and the error message
- Worktree path: `$WORKTREE_DIR/$BRANCH`
- Branch name: `$BRANCH`
- Whether main has uncommitted/dirty state
4. Do NOT leave uncommitted changes on main.
## Output Format
Return a report with:
### 1. Summary
- Brief overview (2-3 sentences)
- Research depth used, tool calls made
### 2. Files Created/Modified
- **Count**: Total files created
- **Structure**: Tree view of the memory hierarchy
- For each file: path, description, what content was added
### 3. Commit Reference
- **Commit hash**: The merge commit hash
- **Branch**: The init branch name
### 4. Issues Encountered
- Any problems or limitations found during research
- Information that couldn't be determined without user input
## Critical Reminders
1. **Not the primary agent** — Don't respond to user messages
2. **Edit worktree files** — NOT the main memory dir
3. **Cannot ask questions** — Use defaults and git logs
4. **Be thorough but efficient** — Standard depth by default
5. **Always commit, merge, AND push** — Your work is wasted if it isn't merged to main and pushed to remote
6. **Report errors clearly** — If something breaks, say what happened and suggest a fix
- **No worktree** — write directly to the memory dir
- **No summary report** — just complete the work
- **Minimize turns** — use parallel tool calls within each turn. Aim for ~3-4 turns total.
- **Use the pre-gathered context** — don't re-run git commands that are already in your prompt

View File

@@ -6,7 +6,12 @@
*/
import { execSync } from "node:child_process";
import { getMemoryFilesystemRoot } from "../../agent/memoryFilesystem";
import { existsSync, readdirSync, readFileSync } from "node:fs";
import { join } from "node:path";
import {
getMemoryFilesystemRoot,
getMemorySystemDir,
} from "../../agent/memoryFilesystem";
import { SYSTEM_REMINDER_CLOSE, SYSTEM_REMINDER_OPEN } from "../../constants";
import { settingsManager } from "../../settings-manager";
import { getSnapshot as getSubagentSnapshot } from "./subagentState";
@@ -49,26 +54,15 @@ export function gatherGitContext(): string {
).trim();
return `
## Current Project Context
- branch: ${branch}
- main: ${mainBranch}
- status: ${status || "(clean)"}
**Working directory**: ${cwd}
### Git Status
- **Current branch**: ${branch}
- **Main branch**: ${mainBranch}
- **Status**:
${status || "(clean working tree)"}
### Recent Commits
Recent commits:
${recentCommits}
`;
} catch {
return `
## Current Project Context
**Working directory**: ${cwd}
**Git**: Not a git repository
`;
return "(not a git repository)";
}
} catch {
// execSync import failed (shouldn't happen with static import, but be safe)
@@ -78,14 +72,142 @@ ${recentCommits}
// ── Shallow init (background subagent) ───────────────────
const SHALLOW_INSTRUCTIONS = `
Shallow init — fast project basics only (~5 tool calls max):
- Only read: CLAUDE.md, AGENTS.md, package.json/pyproject.toml/Cargo.toml, README.md (first 100 lines), top-level directory listing
- Detect user identity from the git context provided above (already in the prompt — no extra calls)
- Run one git call: git log --format="%an <%ae>" | sort -u | head -5
- Write exactly 4 files: project/overview.md, project/commands.md, project/conventions.md, human/identity.md
- Skip: deep directory exploration, architecture mapping, config analysis, historical sessions, persona files, reflection/checkpoint phase
`.trim();
/** Gather git identity for the local user. */
function gatherGitIdentity(): string {
const cwd = process.cwd();
try {
const userName = execSync("git config user.name 2>/dev/null || true", {
cwd,
encoding: "utf-8",
}).trim();
const userEmail = execSync("git config user.email 2>/dev/null || true", {
cwd,
encoding: "utf-8",
}).trim();
if (userName || userEmail) return `${userName} <${userEmail}>`;
return "";
} catch {
return "";
}
}
/** Read existing memory files from the local filesystem. */
function gatherExistingMemory(agentId: string): string {
const systemDir = getMemorySystemDir(agentId);
if (!existsSync(systemDir)) return "(empty)";
const files: string[] = [];
function walk(dir: string, prefix: string): void {
try {
for (const entry of readdirSync(dir, { withFileTypes: true })) {
const rel = prefix ? `${prefix}/${entry.name}` : entry.name;
if (entry.isDirectory()) {
walk(join(dir, entry.name), rel);
} else if (entry.name.endsWith(".md")) {
try {
const content = readFileSync(join(dir, entry.name), "utf-8");
files.push(`── ${rel}\n${content.slice(0, 2000)}`);
} catch {
// skip unreadable files
}
}
}
} catch {
// skip unreadable dirs
}
}
walk(systemDir, "");
return files.length > 0 ? files.join("\n\n") : "(empty)";
}
/** Batch-check which paths are gitignored. Falls back to a hardcoded set. */
function getGitIgnored(cwd: string, names: string[]): Set<string> {
if (names.length === 0) return new Set();
try {
const result = execSync("git check-ignore --stdin", {
cwd,
encoding: "utf-8",
input: names.join("\n"),
}).trim();
return new Set(result.split("\n").filter(Boolean));
} catch {
// exit code 1 = no ignored paths, or not a git repo — fall back
return new Set([
"node_modules",
"dist",
"build",
"__pycache__",
"target",
"vendor",
]);
}
}
/** Get project directory structure as a tree (2 levels deep). */
function gatherDirListing(): string {
const cwd = process.cwd();
try {
const entries = readdirSync(cwd, { withFileTypes: true });
const visible = entries.filter((e) => !e.name.startsWith("."));
const ignored = getGitIgnored(
cwd,
visible.map((e) => e.name),
);
const dirs = visible
.filter((e) => e.isDirectory() && !ignored.has(e.name))
.sort((a, b) => a.name.localeCompare(b.name));
const files = visible
.filter((e) => !e.isDirectory() && !ignored.has(e.name))
.sort((a, b) => a.name.localeCompare(b.name));
const lines: string[] = [];
const sorted = [...dirs, ...files];
for (let i = 0; i < sorted.length; i++) {
const entry = sorted[i]!;
const isLast = i === sorted.length - 1;
const prefix = isLast ? "└── " : "├── ";
if (entry.isDirectory()) {
lines.push(`${prefix}${entry.name}/`);
try {
const dirPath = join(cwd, entry.name);
const childEntries = readdirSync(dirPath, {
withFileTypes: true,
}).filter((e) => !e.name.startsWith("."));
const childIgnored = getGitIgnored(
dirPath,
childEntries.map((e) => e.name),
);
const children = childEntries
.filter((e) => !childIgnored.has(e.name))
.sort((a, b) => {
if (a.isDirectory() !== b.isDirectory())
return a.isDirectory() ? -1 : 1;
return a.name.localeCompare(b.name);
});
const childPrefix = isLast ? " " : "│ ";
for (let j = 0; j < children.length; j++) {
const child = children[j]!;
const childIsLast = j === children.length - 1;
const connector = childIsLast ? "└── " : "├── ";
const suffix = child.isDirectory() ? "/" : "";
lines.push(`${childPrefix}${connector}${child.name}${suffix}`);
}
} catch {
// skip unreadable dirs
}
} else {
lines.push(`${prefix}${entry.name}`);
}
}
return lines.join("\n");
} catch {
return "";
}
}
/** Prompt for the background shallow-init subagent. */
export function buildShallowInitPrompt(args: {
@@ -93,31 +215,34 @@ export function buildShallowInitPrompt(args: {
workingDirectory: string;
memoryDir: string;
gitContext: string;
gitIdentity: string;
existingMemory: string;
dirListing: string;
}): string {
return `
The user ran /init for the current project.
const identityLine = args.gitIdentity
? `- git_user: ${args.gitIdentity}`
: "";
return `
## Environment
Runtime context:
- parent_agent_id: ${args.agentId}
- working_directory: ${args.workingDirectory}
- memory_dir: ${args.memoryDir}
- research_depth: shallow
- parent_agent_id: ${args.agentId}
${identityLine}
Git/project context:
## Git
${args.gitContext}
Task:
Initialize or reorganize the parent agent's filesystem-backed memory for this project.
## Project Structure
${SHALLOW_INSTRUCTIONS}
\`\`\`
${args.dirListing}
\`\`\`
Instructions:
- Use the pre-loaded initializing-memory skill as your operating guide
- Inspect existing memory before editing
- Base your decisions on the current repository and current memory contents
- Do not ask follow-up questions
- Make reasonable assumptions and report them
- If the memory filesystem is unavailable or unsafe to modify, stop and explain why
## Existing Memory
${args.existingMemory}
`.trim();
}
@@ -136,11 +261,18 @@ export async function fireAutoInit(
if (!settingsManager.isMemfsEnabled(agentId)) return false;
const gitContext = gatherGitContext();
const gitIdentity = gatherGitIdentity();
const existingMemory = gatherExistingMemory(agentId);
const dirListing = gatherDirListing();
const initPrompt = buildShallowInitPrompt({
agentId,
workingDirectory: process.cwd(),
memoryDir: getMemoryFilesystemRoot(agentId),
gitContext,
gitIdentity,
existingMemory,
dirListing,
});
const { spawnBackgroundSubagentTask } = await import("../../tools/impl/Task");

View File

@@ -49,7 +49,7 @@ describe("init wiring", () => {
const content = readSource("../../agent/subagents/builtin/init.md");
expect(content).toContain("name: init");
expect(content).toContain("skills: initializing-memory");
expect(content).toContain("model: haiku");
expect(content).toContain("permissionMode: bypassPermissions");
});
@@ -62,18 +62,31 @@ describe("init wiring", () => {
expect(indexSource).toContain("initAgentMd");
});
test("init.md uses haiku model and no skills", () => {
const content = readSource("../../agent/subagents/builtin/init.md");
expect(content).toContain("name: init");
expect(content).toContain("model: haiku");
expect(content).not.toContain("skills:");
expect(content).toContain("permissionMode: bypassPermissions");
});
const baseArgs = {
agentId: "test-agent",
workingDirectory: "/tmp/test",
memoryDir: "/tmp/test/.memory",
gitContext: "## Git context\nsome git info",
gitContext: "- branch: main\n- status: (clean)",
gitIdentity: "Test User <test@example.com>",
existingMemory: "(empty)",
dirListing: "README.md\npackage.json\nsrc",
};
test("buildShallowInitPrompt produces shallow-only prompt", () => {
test("buildShallowInitPrompt includes pre-gathered context", () => {
const prompt = buildShallowInitPrompt(baseArgs);
expect(prompt).toContain("research_depth: shallow");
expect(prompt).toContain("Shallow init");
expect(prompt).not.toContain("Deep init");
expect(prompt).toContain("memory_dir: /tmp/test/.memory");
expect(prompt).toContain("git_user: Test User");
expect(prompt).toContain("## Project Structure");
expect(prompt).toContain("## Existing Memory");
});
test("buildInitMessage includes memoryDir when provided", () => {