
# CLI Commands

The **letta-evals** command-line interface lets you run evaluations, validate configurations, and inspect available components.

<Note>
**Quick overview:**
- **`run`** - Execute an evaluation suite (most common)
- **`validate`** - Check suite configuration without running
- **`list-extractors`** - Show available extractors
- **`list-graders`** - Show available grader functions
- **Exit codes** - 0 for pass, 1 for fail or error (suitable for CI/CD)
</Note>

**Typical workflow:**
1. Validate your suite: `letta-evals validate suite.yaml`
2. Run evaluation: `letta-evals run suite.yaml --output results/`
3. Check exit code: `echo $?` (0 = passed, 1 = failed or errored)
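
The workflow above can be chained in a small wrapper. This is a minimal sketch, assuming `letta-evals` is on your `PATH` and that `validate` exits non-zero when validation fails (the exit code of `validate` is not documented on this page). The function name `validate_and_run` is hypothetical.

```bash
# Hypothetical helper: validate a suite, then run it only if validation succeeds.
# Usage: validate_and_run suite.yaml results
validate_and_run() {
  # Step 1: stop early if the configuration is invalid.
  # Assumption: `validate` exits non-zero on failure.
  letta-evals validate "$1" || return 1

  # Step 2: run the evaluation and save results.
  # Step 3: the function's return status is the exit code of `run`.
  letta-evals run "$1" --output "$2"
}
```

Calling `validate_and_run suite.yaml results` and then `echo $?` reports the same pass/fail status as step 3.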

## run

Run an evaluation suite.

```bash
letta-evals run <suite.yaml> [options]
```

### Arguments

- `suite.yaml`: Path to the suite configuration file (required)

### Options

#### --output, -o
Save results to a directory.

```bash
letta-evals run suite.yaml --output results/
```

Creates:
- `results/header.json`: Evaluation metadata
- `results/summary.json`: Aggregate metrics and configuration
- `results/results.jsonl`: Per-sample results (one JSON object per line)
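
After a run, it can be useful to confirm that all three files were written. The following is a sketch only: `check_results_dir` is a hypothetical helper, and it relies on nothing beyond the file names listed above and the one-record-per-line layout of `results.jsonl`.

```bash
# Hypothetical helper: confirm the three files described above exist in an
# output directory, then report how many per-sample records were written.
# Usage: check_results_dir results
check_results_dir() {
  for f in header.json summary.json results.jsonl; do
    if [ ! -f "$1/$f" ]; then
      echo "missing: $1/$f" >&2
      return 1
    fi
  done
  # results.jsonl holds one JSON object per line, so the line count
  # equals the number of per-sample records.
  count=$(wc -l < "$1/results.jsonl" | tr -d ' ')
  echo "samples: $count"
}
```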

#### --quiet, -q
Quiet mode: show only the pass/fail result.

```bash
letta-evals run suite.yaml --quiet
```

Output:
```
✓ PASSED
```

#### --max-concurrent
Maximum concurrent sample evaluations. **Default**: 15

```bash
letta-evals run suite.yaml --max-concurrent 10
```

Higher values speed up the evaluation but consume more resources.

#### --api-key
Letta API key (overrides the `LETTA_API_KEY` environment variable).

```bash
letta-evals run suite.yaml --api-key your-key
```

#### --base-url
Letta server base URL (overrides the suite config and the `LETTA_BASE_URL` environment variable).

```bash
letta-evals run suite.yaml --base-url http://localhost:8283
```

#### --project-id
Letta project ID for cloud deployments.

```bash
letta-evals run suite.yaml --project-id proj_abc123
```

#### --cached, -c
Path to cached results (JSONL) for re-grading trajectories without re-running the agent.

```bash
letta-evals run suite.yaml --cached previous_results.jsonl
```

Use this to test different graders on the same agent trajectories.
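
A possible two-step workflow is sketched below. It assumes that the `results.jsonl` file written by `--output` is in the format `--cached` expects, which this page does not state explicitly. The function name and both suite file names are hypothetical.

```bash
# Hypothetical two-step workflow: run the agent once, then re-grade the saved
# trajectories with a second suite that defines different graders.
# Usage: run_then_regrade suite.yaml suite_new_graders.yaml results
run_then_regrade() {
  # Run the agent and save per-sample results. The exit status is ignored on
  # purpose: a failed gate should not prevent re-grading.
  letta-evals run "$1" --output "$3" || true

  # If the run errored before writing results, there is nothing to re-grade.
  if [ ! -f "$3/results.jsonl" ]; then
    echo "no results found in $3" >&2
    return 1
  fi

  # Assumption: the results.jsonl written by --output is accepted by --cached.
  letta-evals run "$2" --cached "$3/results.jsonl"
}
```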

#### --num-runs
Run the evaluation multiple times to measure consistency. **Default**: 1

```bash
letta-evals run suite.yaml --num-runs 10
```

**Output with multiple runs:**
- Each run creates a separate `run_N/` directory with individual results
- An `aggregate_stats.json` file contains statistics across all runs (mean, standard deviation, pass rate)
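
To inspect that layout from a script, something like the following could work. It is a sketch under a stated assumption: that the `run_N/` directories and `aggregate_stats.json` are written directly inside the directory passed to `--output`. The helper name `summarize_runs` is hypothetical.

```bash
# Hypothetical helper: summarize the layout produced by a multi-run evaluation.
# Assumption: run_N/ directories and aggregate_stats.json sit directly inside
# the directory passed to --output.
# Usage: summarize_runs results
summarize_runs() {
  runs=0
  for d in "$1"/run_*/; do
    # If nothing matches, the glob stays literal, so confirm it is a directory.
    if [ -d "$d" ]; then
      runs=$((runs + 1))
    fi
  done
  echo "runs found: $runs"

  if [ -f "$1/aggregate_stats.json" ]; then
    echo "aggregate stats: $1/aggregate_stats.json"
  else
    echo "aggregate stats: missing" >&2
    return 1
  fi
}
```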

### Examples

Basic run:
```bash
letta-evals run suite.yaml  # Run evaluation, show results in terminal
```

Save results:
```bash
letta-evals run suite.yaml --output evaluation-results/  # Save to directory
```

Letta Cloud:
```bash
letta-evals run suite.yaml \
  --base-url https://api.letta.com \
  --api-key "$LETTA_API_KEY" \
  --project-id proj_abc123
```

Quiet CI mode:
```bash
if letta-evals run suite.yaml --quiet; then
  echo "Evaluation passed"
else
  echo "Evaluation failed"
  exit 1
fi
```

### Exit Codes

- `0`: Evaluation passed (gate criteria met)
- `1`: Evaluation failed (gate criteria not met or error)
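
In scripts that use `set -e`, a failing evaluation aborts the script before any follow-up step runs. One way around that, sketched here with a hypothetical `run_and_report` function, is to capture the exit code and return it at the end.

```bash
# Hypothetical wrapper: capture the exit code instead of letting the script
# abort, so follow-up steps (reporting, archiving) still run.
# Usage: run_and_report suite.yaml results
run_and_report() {
  status=0
  letta-evals run "$1" --quiet --output "$2" || status=$?

  # Follow-up work happens regardless of the outcome.
  echo "evaluation finished with exit code $status; results in $2"

  # Propagate the original result: 0 = gate met, 1 = gate not met or error.
  return "$status"
}
```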

## validate

Validate a suite configuration without running it.

```bash
letta-evals validate <suite.yaml>
```

Checks:
- YAML syntax is valid
- Required fields are present
- Referenced file paths exist (for example, the agent file)
- Configuration is consistent
- Grader/extractor combinations are valid

Output on success:
```
✓ Suite configuration is valid
```

Output on error:
```
✗ Validation failed:
  - Agent file not found: agent.af
  - Grader 'my_metric' references unknown function
```
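
If a repository holds several suites, they can be validated in one pass, for example as a pre-commit or CI check. This is a sketch that assumes `validate` exits non-zero on failure (not documented above). The `validate_all` name and the directory layout are hypothetical.

```bash
# Hypothetical helper: validate every *.yaml file in a directory and report
# how many failed.
# Assumption: `validate` exits non-zero when validation fails.
# Usage: validate_all suites
validate_all() {
  failed=0
  for suite in "$1"/*.yaml; do
    [ -e "$suite" ] || continue   # directory has no .yaml files
    if ! letta-evals validate "$suite"; then
      failed=$((failed + 1))
    fi
  done
  echo "suites failing validation: $failed"
  # Succeed only if every suite validated.
  [ "$failed" -eq 0 ]
}
```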

## list-extractors

List all available extractors.

```bash
letta-evals list-extractors
```

Output:
```
Available extractors:
  last_assistant      - Extract the last assistant message
  first_assistant     - Extract the first assistant message
  all_assistant       - Concatenate all assistant messages
  pattern             - Extract content matching regex
  tool_arguments      - Extract tool call arguments
  tool_output         - Extract tool return value
  after_marker        - Extract content after a marker
  memory_block        - Extract from memory block (requires agent_state)
```

## list-graders

List all available grader functions.

```bash
letta-evals list-graders
```

Output:
```
Available graders:
  exact_match              - Exact string match with ground_truth
  contains                 - Check if contains ground_truth
  regex_match              - Match regex pattern
  ascii_printable_only     - Validate ASCII-only content
```
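
The listing commands can also be used from scripts, for instance to check that a grader or extractor exists before referencing it in a suite. The sketch below assumes the output keeps the `name - description` layout shown in the examples above; `has_component` is a hypothetical helper.

```bash
# Hypothetical helper: succeed if a name appears in the first column of a
# listing command's output.
# Assumption: the output keeps the "  name   - description" layout shown above.
# Usage: has_component list-graders exact_match
has_component() {
  letta-evals "$1" | awk -v name="$2" '$1 == name { found = 1 } END { exit !found }'
}
```

For example, `has_component list-extractors last_assistant && echo "available"` prints a message only if the extractor is listed.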

## help

Show help information.

```bash
letta-evals --help
```

Show help for a specific command:

```bash
letta-evals run --help
letta-evals validate --help
```

## Environment Variables

### LETTA_API_KEY
API key for Letta authentication.

```bash
export LETTA_API_KEY=your-key-here
```

### LETTA_BASE_URL
Letta server base URL.

```bash
export LETTA_BASE_URL=http://localhost:8283
```

### LETTA_PROJECT_ID
Letta project ID (for cloud).

```bash
export LETTA_PROJECT_ID=proj_abc123
```

### OPENAI_API_KEY
OpenAI API key (for rubric graders).

```bash
export OPENAI_API_KEY=your-openai-key
```
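
Before a run, a preflight check can confirm that the variables you depend on are set. The following is a sketch; `require_env` is a hypothetical helper, and which variables are actually required depends on your suite (for example, `OPENAI_API_KEY` only matters for rubric graders).

```bash
# Hypothetical preflight check: report any of the named variables that are
# unset or empty, without printing their values.
# Usage: require_env LETTA_API_KEY OPENAI_API_KEY
require_env() {
  missing=0
  for name in "$@"; do
    # Indirect lookup of the variable called $name (POSIX-compatible).
    # Only pass trusted variable names, since this uses eval.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "missing environment variable: $name" >&2
      missing=1
    fi
  done
  [ "$missing" -eq 0 ]
}
```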

## Configuration Priority

Configuration values are resolved in this order (highest to lowest priority):

1. CLI arguments (`--api-key`, `--base-url`, `--project-id`)
2. Suite YAML configuration
3. Environment variables
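
The ordering above amounts to "first non-empty value wins". The function below is a conceptual illustration only, not the tool's implementation; `resolve_setting` and the example values are hypothetical.

```bash
# Conceptual illustration only, not the tool's implementation:
# return the first non-empty value in priority order.
# Usage: resolve_setting "<cli value>" "<suite YAML value>" "<env value>"
resolve_setting() {
  for candidate in "$@"; do
    if [ -n "$candidate" ]; then
      printf '%s\n' "$candidate"
      return 0
    fi
  done
  return 1   # no source provided a value
}

# No CLI flag given, the suite YAML sets a URL, the environment sets another:
resolve_setting "" "http://localhost:8283" "https://api.letta.com"
# prints http://localhost:8283
```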

## Using in CI/CD

### GitHub Actions

```yaml
name: Run Evals
on: [push]

jobs:
  evaluate:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Install dependencies
        run: pip install letta-evals

      - name: Run evaluation
        env:
          LETTA_API_KEY: ${{ secrets.LETTA_API_KEY }}
          OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}
        run: |
          letta-evals run suite.yaml --quiet --output results/

      - name: Upload results
        if: always()  # upload results even when the evaluation fails
        uses: actions/upload-artifact@v4
        with:
          name: eval-results
          path: results/
```

### GitLab CI

```yaml
evaluate:
  script:
    - pip install letta-evals
    - letta-evals run suite.yaml --quiet --output results/
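  artifacts:
    when: always  # keep results even when the evaluation fails
    paths:
      - results/
  # LETTA_API_KEY and OPENAI_API_KEY are read from the project's CI/CD
  # variables (Settings > CI/CD > Variables); no `variables:` entry is needed.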
```

## Debugging

### Common Issues

<Warning>
**"Agent file not found"**

```bash
# Check that the file exists relative to the suite YAML location
ls -la path/to/agent.af
```
</Warning>

<Warning>
**"Connection refused"**

```bash
# Verify Letta server is running
curl http://localhost:8283/v1/health
```
</Warning>

<Warning>
**"Invalid API key"**

```bash
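# Check that the environment variable is set (without printing the key)
[ -n "$LETTA_API_KEY" ] && echo "LETTA_API_KEY is set" || echo "LETTA_API_KEY is not set"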
```
</Warning>

## Next Steps

- [Understanding Results](/evals/results-metrics/understanding-results) - Interpreting evaluation output
- [Suite YAML Reference](/evals/configuration/suite-yaml-reference) - Complete configuration options
- [Getting Started](/evals/get-started/getting-started) - Complete tutorial with examples