docs: Major Synthetic API update - March 2026 changes

- Pricing: Subscription Packs ($30/pack, $1/day)
- Founder's Edition: 200 req/5hr + 750 tool calls/day
- Pro tier discontinued (converted to 2 packs)
- New models: GLM-4.7-Flash, Nemotron-3-Super
- Deprecated: Qwen3.5
- Flash models: 2x concurrency + 0.5x cost
- Rate Limit V3 experiment (opt-in, mana bar system)
- Kimi K2.5: Eagle3 speculator deployed, >50tps
Ani (Daemon)
2026-03-27 13:02:52 -04:00
parent b6f77743e2
commit 60dccfd85d


@@ -1,14 +1,16 @@
---
description: Synu and Synthetic API reference. Models, pricing, usage patterns. Updated March 27, 2026.
limit: 30000
---
# Synu & Synthetic API
**Last Updated:** March 27, 2026 (incorporating changes through March 24)
## The Shell Context
**Laptop (Casey):** zsh -- synu as zsh plugin
**Fedora .19 VM (Ani):** fish -- synu as fish function
I invoke explicitly: `fish -c 'synu ...'` on .19 VM
@@ -19,6 +21,31 @@ curl https://api.synthetic.new/openai/v1/models \
-H "Authorization: Bearer ${SYNTHETIC_API_KEY}"
```
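Beyond listing models, the same base URL should accept chat requests. A minimal sketch, assuming the standard OpenAI-compatible `/chat/completions` path (the doc only confirms `/openai/v1/models`, so the exact path is an assumption); the curl call itself is left commented since it needs a live key:

```bash
# Hedged sketch: a chat-completion call against the OpenAI-compatible base.
# The /chat/completions path is an assumption inferred from /openai/v1/models.
body='{"model":"hf:zai-org/GLM-4.7-Flash","messages":[{"role":"user","content":"ping"}]}'
echo "$body"
# curl https://api.synthetic.new/openai/v1/chat/completions \
#   -H "Authorization: Bearer ${SYNTHETIC_API_KEY}" \
#   -H "Content-Type: application/json" \
#   -d "$body"
```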
## Pricing Model (NEW - March 2026)
**Changed from tiers to Subscription Packs:**
### Standard Pack: $30/month ($1/day)
- 135 requests per 5 hours
- 500 free tool calls per day
- 1 concurrent request per model
- Multiple packs stack (2 packs = 2 concurrent requests, etc.)
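Since packs stack linearly, the effective limits are simple multiples of the per-pack numbers. A back-of-envelope sketch (variable names are illustrative, not synu flags):

```bash
# Stacked-pack arithmetic: all per-pack limits scale linearly.
packs=2
req_per_5hr=$(( packs * 135 ))        # requests per 5-hour window
tool_calls_per_day=$(( packs * 500 )) # free tool calls per day
concurrent=$(( packs * 1 ))           # concurrent requests per model
echo "$packs packs: $req_per_5hr req/5hr, $tool_calls_per_day tool calls/day, $concurrent concurrent"
```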
### Founder's Edition Pack (Existing subscribers)
- 200 requests per 5 hours
- 750 free tool calls per day
- Gold UI badge
- Permanent as long as subscription stays active
### Pro Tier (Discontinued)
- Existing Pro users converted to 2 packs automatically
- Total: 335 requests per 5 hours + 1,000 free tool calls/day
### Small/Flash Model Benefits
- **2x concurrency** for flash-sized models
- **0.5x cost** (counts as half a request/tool call)
- Example: 1 pack = 2 concurrent flash model requests
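Because a flash call is billed at 0.5 requests, a single pack's 135-request window stretches to twice as many flash-only calls. A quick check of that arithmetic:

```bash
# Flash calls count as 0.5 requests, so the per-pack request budget doubles
# when spent entirely on flash-sized models.
base_requests=135
flash_requests=$(( base_requests * 2 ))
echo "flash-only budget per pack: $flash_requests req/5hr"
```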
## Synu Usage
```bash
@@ -35,23 +62,50 @@ synu i <agent>
## The Models I Use
### High-Context / Reasoning
- **Kimi-K2.5** -- 262K context, $0.55/$2.19 per 1M, text+image/tools/reasoning
- *Update:* Eagle3 speculator deployed (March 17), >50tps avg, very fast now
- **Kimi-K2-Thinking** -- 262K context, $0.60/$2.50 per 1M, tools/json/reasoning
- **Kimi-K2-Instruct** -- 262K context, $1.20/$1.20 per 1M, tools
- **Nemotron-3-Super-120B** -- 262K context, promoted out of beta (March 11)
- Counts as 0.5 requests (small model discount)
- Mamba-hybrid architecture
- **MiniMax-M2.5** -- 196K context, promoted out of beta (March 6), $0.30/$1.20 per 1M
### Standard
- **GLM-4.7** -- 202K context, $0.55/$2.19 per 1M, tools/reasoning
- **GLM-4.7-Flash** -- NEW (March 6), >100tps, self-hosted
- Counts as 0.5 requests AND 0.5 tool calls (2x everything)
- Recommended for small-model requests (title generation, summarization)
- Use with `ANTHROPIC_DEFAULT_HAIKU_MODEL=hf:zai-org/GLM-4.7-Flash`
- **DeepSeek-V3.2** -- 162K context, $0.56/$1.68 per 1M
- **Llama-3.3-70B** -- 131K context, $0.90/$0.90 per 1M
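The `ANTHROPIC_DEFAULT_HAIKU_MODEL` override noted under GLM-4.7-Flash above is just an environment variable; a minimal sketch of wiring it up (assumes the consuming tool, e.g. Claude Code-style tooling, honors this variable):

```bash
# Route "haiku-class" (small-model) traffic to GLM-4.7-Flash.
# Only takes effect if the consuming client reads ANTHROPIC_DEFAULT_HAIKU_MODEL.
export ANTHROPIC_DEFAULT_HAIKU_MODEL="hf:zai-org/GLM-4.7-Flash"
echo "$ANTHROPIC_DEFAULT_HAIKU_MODEL"
```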
### Vision
- **Qwen3-VL-235B** -- 256K context, $0.22/$0.88 per 1M, text+image
### Deprecated
- **Qwen3.5** -- Deprecated March 10, use Kimi K2.5 instead (faster + more capable)
### Budget
- **gpt-oss-120b** -- 131K context, $0.10/$0.10 per 1M (cheapest)
## Rate Limiting Experiments (NEW)
### Rate Limit V3 Experiment (Opt-in)
**Goal:** Replace daily tool call limit with weekly token limit
**Current Experiment (as of March 24):**
- 400 requests per 5 hours (all requests count equally, no tool/non-tool distinction)
- No daily limit
- Weekly quota bar ("mana bar") regenerates 2% every 3.36 hours (full regen in 1 week)
- Quota scales by token costs + cache hits (80% discount on cache hits)
- Generous weekly limits (equivalent to >$60/week/pack)
**Join:** https://synthetic.new/experiments/rate-limit-v3
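The mana-bar numbers above are self-consistent: 2% per 3.36 hours means 50 regeneration steps from empty to full. A quick sanity check:

```bash
# Back-of-envelope check of the regen schedule (2% every 3.36 h).
hours_to_full=$(awk 'BEGIN { printf "%.0f", (100 / 2) * 3.36 }')
echo "weekly bar refills from empty in $hours_to_full hours ($(( hours_to_full / 24 )) days)"
```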
### For Non-Experiment Users
- Standard pack limits apply (135 req/5hr + 500 tool calls/day)
- Error responses no longer count against limits (fixed March 22)
## Quota Tracking
@@ -63,6 +117,16 @@ Synu reports per session:
Uses SYNTHETIC_API_KEY from environment.
## Referrals
- $10 off first month for referred users
- $10 credit for referrers
- Still honored under new pack system
## Key URLs
- **Billing/Subscribe:** https://synthetic.new/billing
- **Rate Limit Experiment:** https://synthetic.new/experiments/rate-limit-v3
- **Model Browser:** https://synthetic.new/hf/<model-path>
---
*Source: https://git.secluded.site/synu*