Lean-Agent-4B

Scout Tier

Lightweight Tool Calling for Self-Hosted Agents

A 4B parameter distilled model built for fast, reliable tool calling and structured JSON output. The smallest model in the Lean lineup — fits comfortably on any modern GPU with under 5GB VRAM.

Key Features

Performance

Benchmarks will be published with the first release.

Metric Lean-Agent-4B Qwen3-4B (baseline)
Tool calling accuracy
Structured output success
Avg latency
VRAM usage (Q8_0) ~4-5 GB ~4-5 GB

Pricing

One-time license: $20

Includes:

Getting Started

  1. Download the GGUF quantized weights
  2. Load with Ollama or llama.cpp
  3. Configure your agent to use lean-agent-4b

Also in the Scout tier: Lean-Agent-8B, Lean-Coder-8B, Lean-Agent-14B (coming soon)