Specifications
| Property | Value |
|---|---|
| Parameters | 24B (2B active) |
| Context Length | 32K tokens |
| Architecture | LFM2 (MoE) |
MoE Efficiency
24B quality, 2B inference cost
Laptop-Ready
Runs on laptops and single GPUs
Tool Calling
Native function calling support
Quick Start
- Transformers
- llama.cpp
- vLLM