Nguyen Van Duy Khiem


## Introduction

free-coding-models is a real-time TUI tool that pings 160 coding models from 20 providers simultaneously, helping you find the fastest LLM for your AI coding assistant.


Contributors

vava-nessa · erwinh22 · whit3rabbit · skylaweber · PhucTruong-ctrl

💬 Let’s talk about the project on Discord

By Vanessa Depraute

```plaintext
1. Create a free API key (NVIDIA, OpenRouter, Hugging Face, etc.)
2. npm i -g free-coding-models
3. free-coding-models
```

Find the fastest coding LLMs in seconds

Ping free coding models from 20 providers in real time — pick the best one for OpenCode, OpenClaw, or any AI coding assistant

⚠️ Beta notice
FCM Proxy V2 support for external tools is still in beta. Claude Code, Codex, Gemini, and the other proxy-backed launchers already work in many setups, but auth and startup edge cases can still fail while the integration stabilizes.


## ✨ Features

  • 🎯 Coding-focused — Only LLM models optimized for code generation, not chat or vision
  • 🌐 Multi-provider — Models from NVIDIA NIM, Groq, Cerebras, SambaNova, OpenRouter, Hugging Face Inference, Replicate, DeepInfra, Fireworks AI, Codestral, Hyperbolic, Scaleway, Google AI, SiliconFlow, Together AI, Cloudflare Workers AI, Perplexity API, Alibaba Cloud (DashScope), ZAI, and iFlow
  • ⚙️ Settings screen — Press P to manage provider API keys, enable/disable providers, access FCM Proxy V2 settings, and check/install updates
  • 📡 FCM Proxy V2 — Built-in reverse proxy with multi-key rotation, rate-limit failover, and Anthropic wire format translation for Claude Code. Optional always-on background service keeps the proxy running 24/7
  • 🚀 Parallel pings — All models tested simultaneously via native fetch
  • 📊 Real-time animation — Watch latency appear live in alternate screen buffer
  • 🏆 Smart ranking — Top 3 fastest models highlighted with medals 🥇🥈🥉
  • ⏱ Adaptive monitoring — Starts in a fast 2s cadence for 60s, settles to 10s, slows to 30s after 5 minutes idle, and supports a forced 4s mode
  • 📈 Rolling averages — Avg calculated from ALL successful pings since start
  • 📊 Uptime tracking — Percentage of successful pings shown in real-time
  • 📐 Stability score — Composite 0–100 score measuring consistency (p95, jitter, spikes, uptime)
  • 📊 Token usage tracking — The proxy logs prompt+completion token usage per exact provider/model pair
  • 📜 Request Log Overlay — Press X to inspect recent proxied requests and token usage
  • 📋 Changelog Overlay — Press N to browse all versions in an index
  • 🛠 MODEL_NOT_FOUND Rotation — If a provider returns 404, the TUI rotates through other providers for the same model
  • 🔄 Auto-retry — Models that time out are automatically retried until they respond
  • 🎮 Interactive selection — Navigate with arrow keys, press Enter to act
  • 💻 OpenCode integration — Auto-detects NIM setup, sets model as default, launches OpenCode
  • 🦞 OpenClaw integration — Sets selected model as default provider
  • 🧰 Public tool launchers — 13 tool modes: OpenCode CLI, OpenCode Desktop, OpenClaw, Crush, Goose, Aider, Claude Code, Codex, Gemini, Qwen, OpenHands, Amp, and Pi
  • 🔌 Install Endpoints flow — Press Y to install providers into tools
  • 📝 Feature Request (J key) — Send anonymous feedback
  • 🐛 Bug Report (I key) — Send anonymous bug reports
  • 📶 Status indicators — UP ✅ · No Key 🔑 · Timeout ⏳ · Overloaded 🔥 · Not Found 🚫
  • 🏷 Tier filtering — Filter models by tier letter (S, A, B, C)
  • ⭐ Persistent favorites — Press F to pin/unpin models
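The parallel-ping and medal-ranking features above could be sketched roughly as follows. This is an illustrative reconstruction, not the tool's actual internals — `pingModel`, the request shape, and the model objects are all assumptions:

```javascript
// Sketch: ping every model concurrently, then rank survivors by latency.
// The endpoint/request shape here is an assumption for illustration.
async function pingModel(model, fetchFn = fetch) {
  const start = Date.now();
  try {
    const res = await fetchFn(model.endpoint, { method: "POST" });
    return { model: model.name, ms: Date.now() - start, ok: res.ok };
  } catch {
    return { model: model.name, ms: Infinity, ok: false };
  }
}

// Failed pings are filtered out; the three fastest get medals.
function rankByLatency(results) {
  const medals = ["🥇", "🥈", "🥉"];
  return results
    .filter((r) => r.ok)
    .sort((a, b) => a.ms - b.ms)
    .map((r, i) => ({ ...r, medal: medals[i] ?? "" }));
}

// Promise.allSettled never rejects, so one hung provider can't sink the batch.
async function pingAll(models, fetchFn = fetch) {
  const settled = await Promise.allSettled(models.map((m) => pingModel(m, fetchFn)));
  return rankByLatency(settled.map((s) => s.value));
}
```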

## 📋 Requirements

Before using free-coding-models, make sure you have:

  1. Node.js 18+ — Required for native fetch API
  2. At least one free API key — pick any of the providers below
| Provider | Free Tier | Link |
|---|---|---|
| NVIDIA NIM | 40 req/min | build.nvidia.com |
| Groq | 30-50 RPM | console.groq.com/keys |
| Cerebras | Generous dev tier | cloud.cerebras.ai |
| SambaNova | Generous dev tier | sambanova.ai/developers |
| OpenRouter | 50-1000 req/day on :free | openrouter.ai/keys |
| Hugging Face | Free monthly credits | huggingface.co/settings/tokens |
| Replicate | 6 req/min | replicate.com/account/api-tokens |
| DeepInfra | 200 concurrent | deepinfra.com/login |
| Fireworks AI | $1 free credits | fireworks.ai |
| Mistral Codestral | 30 req/min | codestral.mistral.ai |
| Hyperbolic | $1 free trial | app.hyperbolic.ai/settings |
| Scaleway | 1M free tokens | console.scaleway.com/iam/api-keys |
| Google AI Studio | 14.4K req/day | aistudio.google.com/apikey |
| SiliconFlow | Varies by model | cloud.siliconflow.cn/account/ak |
| Together AI | Credits/promotions | api.together.ai/settings/api-keys |
| Cloudflare Workers AI | 10k neurons/day | dash.cloudflare.com |
| Perplexity API | Tiered by spend | perplexity.ai/settings/api |
| ZAI | Coding Plan subscription | z.ai |
| Alibaba Cloud (DashScope) | 1M free tokens | modelstudio.console.alibabacloud.com |

### OpenRouter Free Tier Details

OpenRouter provides free requests on free models (:free):

  • No credits (or <$10): 50 requests/day (20 req/min)
  • $10 in credits: 1000 requests/day (20 req/min)

Key things to know:

  • Free models (:free) never consume your credits
  • Failed requests still count toward your daily quota
  • Quota resets every day at midnight UTC
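The credit tiers above reduce to one simple rule. A minimal sketch (the function name is illustrative, not part of the tool):

```javascript
// OpenRouter :free quota rule as documented above: lifetime credits of $10
// or more unlock 1000 requests/day; otherwise 50/day. Both tiers share the
// same 20 req/min burst limit, and failed requests still count.
function openrouterFreeQuota(creditsUsd) {
  return { perDay: creditsUsd >= 10 ? 1000 : 50, perMin: 20 };
}
```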

## 📦 Installation

```bash
npm install -g free-coding-models
```

Or use directly with npx/pnpx/bunx:

```bash
npx free-coding-models
pnpx free-coding-models
bunx free-coding-models
```

## 🆕 What’s New

Version 0.3.5 fixes the main Claude Code proxy compatibility bug:

  • Claude Code beta-route requests now work — the proxy accepts Anthropic URLs like /v1/messages?beta=true
  • The fix was validated against the real claude binary

## 🚀 Usage

### Setup Wizard (First Run)

On first run, you’ll be walked through all 20 providers:

```plaintext
🔑 First-time setup — API keys
Enter keys for any provider you want to use. Press Enter to skip.

● NVIDIA NIM
  Free key at: https://build.nvidia.com
  Profile → API Keys → Generate
Enter key (or Enter to skip): nvapi-xxxx

● Groq
  Free key at: https://console.groq.com/keys
  API Keys → Create API Key
Enter key (or Enter to skip): gsk_xxxx
```

### Adding or Changing Keys Later

Press P to open the Settings screen at any time:

  • ↑↓ — navigate providers
  • Enter — edit the selected key
  • Space — toggle provider enabled/disabled
  • T — test the key with a live ping
  • Esc — close settings

Keys are saved to ~/.free-coding-models.json (permissions 0600).

### Environment Variable Overrides

Env vars always take priority over the config file:

```bash
NVIDIA_API_KEY=nvapi-xxx free-coding-models
GROQ_API_KEY=gsk_xxx free-coding-models
OPENROUTER_API_KEY=sk-or-xxx free-coding-models
```
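The precedence rule could look like this — a sketch, with the env map injected so the rule is easy to test:

```javascript
// Sketch: an env var, when set and non-empty, beats the config-file value.
function resolveKey(envName, configValue, env = process.env) {
  const fromEnv = env[envName];
  return fromEnv && fromEnv.trim() !== "" ? fromEnv : configValue;
}
```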

## 📊 TUI Columns

The main table displays one row per model with the following columns:

| Column | Sort Key | Description |
|---|---|---|
| Rank | R | Position based on current sort (medals 🥇🥈🥉 for top 3) |
| Tier | — | SWE-bench tier (S+, S, A+, A, A-, B+, B, C) |
| SWE% | S | SWE-bench Verified score — industry-standard for coding |
| CTX | C | Context window size (e.g. 128k) |
| Model | M | Model display name (favorites show ⭐ prefix) |
| Provider | O | Provider name (NIM, Groq, etc.) |
| Latest Ping | L | Most recent round-trip latency in milliseconds |
| Avg Ping | A | Rolling average of ALL successful pings since launch |
| Health | H | Current status: UP ✅, NO KEY 🔑, Timeout ⏳, Overloaded 🔥 |
| Verdict | V | Health verdict based on avg latency + stability |
| Stability | B | Composite 0–100 consistency score |
| Up% | U | Uptime — percentage of successful pings |
| Used | — | Total tokens consumed for this provider/model pair |
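The Avg Ping column covers every successful ping since launch, which can be maintained without storing any history. A sketch of the incremental mean update (names are illustrative):

```javascript
// Sketch: rolling average over ALL successful pings, updated in place:
// avg_n = avg_{n-1} + (x - avg_{n-1}) / n
function updateAvg(stats, latencyMs) {
  const n = stats.n + 1;
  return { n, avg: stats.avg + (latencyMs - stats.avg) / n };
}
```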

### Verdict Values

| Verdict | Meaning |
|---|---|
| Perfect | Avg < 400ms with stable p95/jitter |
| Normal | Avg < 1000ms, consistent responses |
| Slow | Avg 1000–2000ms |
| Spiky | Good avg but erratic tail latency |
| Very Slow | Avg 2000–5000ms |
| Overloaded | Server returned 429/503 |
| Unstable | Was up but now timing out, or avg > 5000ms |
| Not Active | No successful pings yet |
| Pending | First ping still in flight |
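The latency-only part of that table maps cleanly onto thresholds. A simplified sketch — the stability-driven verdicts (Spiky, Overloaded, timeout-based Unstable, Not Active, Pending) need extra state and are deliberately left out:

```javascript
// Sketch of the average-latency thresholds from the table above.
function latencyVerdict(avgMs) {
  if (avgMs < 400) return "Perfect";
  if (avgMs < 1000) return "Normal";
  if (avgMs < 2000) return "Slow";
  if (avgMs < 5000) return "Very Slow";
  return "Unstable";
}
```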

## 📐 Stability Score

The Stability column shows a composite 0–100 score that answers: “How consistent and predictable is this model?”

Average latency alone is misleading — a model averaging 250ms that randomly spikes to 6 seconds feels slower than a steady 400ms model.

### Formula

```plaintext
Stability = 0.30 × p95_score
          + 0.30 × jitter_score
          + 0.20 × spike_score
          + 0.20 × reliability_score
```

| Component | Weight | What it measures |
|---|---|---|
| p95 latency | 30% | Tail-latency spikes — the worst 5% of responses |
| Jitter (σ) | 30% | Erratic response times — standard deviation |
| Spike rate | 20% | Fraction of pings above 3000ms |
| Reliability | 20% | Uptime — fraction of successful HTTP 200 pings |
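As code, the weighted sum is direct. Each component score is assumed to already be normalized to 0–100 elsewhere (the normalization itself isn't documented here):

```javascript
// Composite stability score from the weights above, rounded to an integer.
function stabilityScore({ p95, jitter, spike, reliability }) {
  return Math.round(0.3 * p95 + 0.3 * jitter + 0.2 * spike + 0.2 * reliability);
}
```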

## 📡 FCM Proxy V2

free-coding-models includes a local reverse proxy that merges all your provider API keys into one endpoint.

Disabled by default — enable in Settings (P) → FCM Proxy V2 settings.

### What the Proxy Does

| Feature | Description |
|---|---|
| Unified endpoint | One URL (http://127.0.0.1:18045/v1) replaces 20+ provider endpoints |
| Key rotation | Automatically swaps to the next API key when one hits rate limits (429) |
| Usage tracking | Tracks token consumption per provider/model pair in real time |
| Anthropic translation | Claude Code sends POST /v1/messages — the proxy translates it to OpenAI format |
| Path normalization | Converts non-standard API paths to standard /v1/ calls |
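The key-rotation behavior could be sketched like this. `sendFn` stands in for the real upstream request and is injected so the failover logic can be tested without a network:

```javascript
// Sketch of rate-limit failover: try keys in order, moving on after a 429.
async function withKeyRotation(keys, sendFn) {
  let last = { status: 429 };
  for (const key of keys) {
    last = await sendFn(key);
    if (last.status !== 429) return last; // success, or a non-rate-limit error
  }
  return { ...last, error: "all keys rate-limited" };
}
```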

### Quick Setup

Via TUI:

  1. Press P to open Settings
  2. Select FCM Proxy V2 settings and press Enter
  3. Enable Proxy mode, then select Install background service

Via CLI:

```bash
free-coding-models daemon install     # Install + start as OS service
free-coding-models daemon status      # Check running status
free-coding-models daemon restart     # Restart after config changes
free-coding-models daemon stop        # Graceful stop
free-coding-models daemon logs        # Show recent service logs
```

### Platform Support

| Platform | Service Type |
|---|---|
| macOS | launchd LaunchAgent |
| Linux | systemd user service |
| Windows | Falls back to in-process proxy |
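The platform dispatch mirrors the table. A minimal sketch (the return strings are labels, not real service identifiers):

```javascript
// Sketch: pick the background-service flavor per platform.
function serviceType(platform = process.platform) {
  if (platform === "darwin") return "launchd LaunchAgent";
  if (platform === "linux") return "systemd user service";
  return "in-process proxy"; // Windows and anything else
}
```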

## 🤖 Coding Models

160 coding models across 20 providers and 8 tiers, ranked by SWE-bench Verified.

### Tier Scale

  • S+/S — Elite frontier coders (≥60% SWE-bench), best for complex real-world tasks
  • A+/A — Great alternatives, strong at most coding tasks
  • A-/B+ — Solid performers, good for targeted programming tasks
  • B/C — Lightweight or older models, good for code completion

### Filtering by Tier

```bash
free-coding-models --tier S     # Only S+ and S (frontier models)
free-coding-models --tier A     # Only A+, A, A- (solid performers)
free-coding-models --tier B     # Only B+, B (lightweight options)
free-coding-models --tier C     # Only C (edge/minimal models)
```
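The tier letter expands into a group of tier labels, matching the comments in the commands above. A sketch with assumed model objects:

```javascript
// Sketch: expand a --tier letter into the tier labels it covers.
const TIER_GROUPS = {
  S: ["S+", "S"],
  A: ["A+", "A", "A-"],
  B: ["B+", "B"],
  C: ["C"],
};

function filterByTier(models, letter) {
  const allowed = TIER_GROUPS[letter] ?? [];
  return models.filter((m) => allowed.includes(m.tier));
}
```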

### Top Models by Provider

Alibaba Cloud (DashScope) — 8 models:

  • S+: Qwen3 Coder Plus (69.6%), Qwen3 Coder 480B (70.6%)
  • S: Qwen3 Coder Max (67.0%), Qwen3 235B (70.0%)
  • A+: Qwen3 32B (50.0%)
  • A: Qwen2.5 Coder 32B (46.0%)

ZAI Coding Plan — 5 models:

  • S+: GLM-5 (77.8%), GLM-4.5 (75.0%), GLM-4.7 (73.8%)

NVIDIA NIM — 44 models:

  • S+: GLM 5 (77.8%), Kimi K2.5 (76.8%), DeepSeek V3.2 (73.1%)
  • S: DeepSeek V3.1 Terminus (68.4%), Llama 4 Maverick (62.0%)
  • A+: Mistral Large 675B (58.0%), QwQ 32B (50.0%)
  • A: Llama 3.1 405B (44.0%), R1 Distill 32B (43.9%)

Groq — 10 models:

  • S: Kimi K2 Instruct (65.8%), Llama 4 Maverick (62.0%)
  • A+: QwQ 32B (50.0%)
  • A: Llama 3.3 70B (39.5%)

## 🔌 OpenCode Integration

The easiest way — let free-coding-models do everything:

  1. Run: free-coding-models --opencode
  2. Wait for models to be pinged (green ✅ status)
  3. Navigate with ↑↓ arrows to your preferred model
  4. Press Enter — the tool automatically:
    • Detects if NVIDIA NIM is configured in OpenCode
    • Sets your selected model as default
    • Launches OpenCode with the model pre-selected

### tmux Sub-agent Panes

When launched from an existing tmux session, free-coding-models auto-adds an OpenCode --port argument so OpenCode can spawn sub-agents in panes.

You can force a specific port:

```bash
OPENCODE_PORT=4098 free-coding-models --opencode
```

### ZAI Provider Proxy

OpenCode doesn’t natively support ZAI’s API path format. When you select a ZAI model, free-coding-models automatically starts a local reverse proxy that translates OpenCode’s standard /v1/* requests to ZAI’s API.


## 🦞 OpenClaw Integration

OpenClaw is an autonomous AI agent daemon. free-coding-models can configure it to use NVIDIA NIM models as its default provider.

### Quick Start

```bash
free-coding-models --openclaw
```

  1. Wait for models to be pinged
  2. Navigate with ↑↓ arrows to your preferred model
  3. Press Enter — the tool automatically:
    • Reads ~/.openclaw/openclaw.json
    • Adds the nvidia provider block if missing
    • Sets agents.defaults.model.primary to nvidia/<model-id>
    • Saves config and prints next steps

### Patching OpenClaw for Full NVIDIA Support

By default, OpenClaw only allows a few specific NVIDIA models. To add ALL 47 NVIDIA models:

```bash
# From the free-coding-models package directory
node patch-openclaw.js
```

This script:

  • Backs up existing config files
  • Adds all 47 NVIDIA models with proper context window and token limits
  • Preserves existing models and configuration

After patching, restart OpenClaw gateway:

```bash
systemctl --user restart openclaw-gateway
```

## ⚙️ How It Works

```plaintext
┌──────────────────────────────────────────────────────────────────┐
│  1. Enter alternate screen buffer (like vim/htop/less)           │
│  2. Ping ALL models in parallel                                  │
│  3. Display real-time table with Latest/Avg/Stability/Up%        │
│  4. Re-ping ALL models at 2s on startup, then 10s steady-state   │
│  5. Update rolling averages + stability scores per model         │
│  6. User can navigate with ↑↓ and select with Enter              │
│  7. On Enter: set model, launch selected tool                    │
│  8. Interface stays open until you select or press Ctrl+C        │
└──────────────────────────────────────────────────────────────────┘
```

Result: Continuous monitoring interface with rolling averages, stability scores, and one-keystroke tool configuration.
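The re-ping cadence in step 4 follows the adaptive schedule described under Features: 2s for the first minute, 10s steady-state, 30s after 5 minutes idle, and a forced 4s mode. A sketch (function name is illustrative):

```javascript
// Sketch of the adaptive ping cadence, in seconds.
function pingIntervalSec({ uptimeSec, idleSec, forced = false }) {
  if (forced) return 4;          // forced fast mode
  if (uptimeSec < 60) return 2;  // fast startup window
  if (idleSec >= 300) return 30; // slow down after 5 minutes idle
  return 10;                     // steady-state
}
```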


## 🔧 Keyboard Shortcuts

### Main TUI

| Key | Action |
|---|---|
| ↑↓ | Navigate models |
| Enter | Select model and launch current target tool |
| R/S/C/M/O/L/A/H/V/B/U | Sort by Rank/SWE/Ctx/Model/Provider/Latest/Avg/Health/Verdict/Stability/Up% |
| F | Toggle favorite on selected model (⭐ pinned at top) |
| T | Cycle tier filter (All → S+ → S → A+ → A → …) |
| D | Cycle provider filter (All → NIM → Groq → …) |
| E | Toggle configured-only mode (persisted across sessions) |
| Z | Cycle target tool (OpenCode → OpenClaw → Crush → …) |
| X | Toggle request logs (recent proxied requests) |
| P | Open Settings (manage API keys, updates, profiles) |
| Y | Open Install Endpoints flow |
| Shift+P | Cycle through saved profiles |
| Shift+S | Save current TUI settings as a named profile |
| Q | Open Smart Recommend overlay |
| N | Open Changelog overlay |
| W | Cycle ping mode (FAST 2s → NORMAL 10s → SLOW 30s → FORCED 4s) |
| J | Open FCM Proxy V2 settings |
| I | Send feedback or bug reports |
| K / Esc | Show help / Close overlay |
| Ctrl+C | Exit |

### Settings Screen (P Key)

| Key | Action |
|---|---|
| ↑↓ | Navigate providers, maintenance, profiles |
| Enter | Edit API key, check/install update, load profile |
| Space | Toggle provider enabled/disabled |
| T | Test current provider’s API key |
| U | Check for updates manually |
| Backspace | Delete the selected profile |
| Esc | Close settings |

## 📋 Config Profiles

Profiles let you save and restore different TUI configurations — useful if you switch between work/personal setups or different tier preferences.

What’s stored:

  • Favorites (starred models)
  • Sort column and direction
  • Tier filter
  • Ping mode
  • Configured-only filter
  • API keys

Saving a profile:

  1. Configure the TUI the way you want
  2. Press Shift+S — an inline prompt appears
  3. Type a name (e.g. work, fast-only) and press Enter

Switching profiles:

  • Shift+P in the main table — cycles through saved profiles
  • --profile <name> — load a specific profile on startup

## 📄 License

MIT © vava


## 📬 Contribute

We welcome contributions! Feel free to open issues, submit pull requests, or get involved in the project.

Q: Can I use this with other providers?
A: Yes, the tool is designed to be extensible.

Q: How accurate are the latency numbers?
A: They represent average round-trip times measured during testing; actual performance may vary based on network conditions.

Q: Do I need to download models locally for OpenClaw?
A: No — free-coding-models configures OpenClaw to use NVIDIA NIM’s remote API. No GPU or local setup required.


## 📧 Support

For questions or issues, open a GitHub issue.

💬 Let’s talk about the project on Discord: https://discord.gg/ZTNFHvvCkU


Built with ☕ and 🌹 by vava

We collect anonymous usage data to improve the tool and fix bugs. No personal information is ever collected.

free-coding-models — Find the Fastest Coding LLM in Seconds
https://astro-pure.js.org/en/blog/free-coding-models
Author: Duy Khiem
Published: March 18, 2026