Specifications
| Property | Value |
|---|---|
| Parameters | 8B (1.5B active) |
| Context Length | 32K tokens |
| Architecture | LFM2 (MoE) |
MoE Efficiency
8B quality, 1.5B inference cost
On-Device
Runs on phones and laptops
Tool Calling
Native function calling support
Quick Start
- Transformers
- llama.cpp
- vLLM
Install:Download & Run: