# Models
The LFM model collection includes general-purpose language models, vision-language models, task-specific models, and audio models across various parameter sizes.
- These models are built on a new hybrid architecture designed for fast training and inference. Learn more in our blog post.
- All models support a 32k token text context length for extended conversations and document processing.
- Our models are compatible with popular open-source deployment libraries, including Transformers, llama.cpp, vLLM, MLX, and Ollama, as well as our own edge deployment platform, LEAP (see the sketch below).
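For example, loading one of the text models via Transformers looks roughly like the sketch below. This is a minimal illustration, assuming a recent transformers release with LFM2 support; the checkpoint choice and generation settings are placeholders.

```python
# Minimal sketch: running an LFM2 text model with Hugging Face Transformers.
# Assumes a transformers version with LFM2 support; settings are illustrative.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "LiquidAI/LFM2-1.2B"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

# Prompt through the model's built-in chat template.
messages = [{"role": "user", "content": "Summarize what a hybrid model architecture is."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(input_ids, max_new_tokens=128)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```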
## Complete Model Table
| Model | HF | GGUF | MLX | ONNX | Trainable? |
|---|---|---|---|---|---|
| **LFM2 Text Models** | | | | | |
| LFM2-8B-A1B | ✓ | ✓ | ✓ | ✗ | Yes (TRL) |
| LFM2-2.6B | ✓ | ✓ | ✓ | ✓ | Yes (TRL) |
| LFM2-1.2B | ✓ | ✓ | ✓ | ✓ | Yes (TRL) |
| LFM2-700M | ✓ | ✓ | ✓ | ✓ | Yes (TRL) |
| LFM2-350M | ✓ | ✓ | ✓ | ✓ | Yes (TRL) |
| **LFM2-VL Models** | | | | | |
| LFM2-VL-3B | ✓ | ✓ | ✓ | ✗ | Yes (TRL) |
| LFM2-VL-1.6B | ✓ | ✓ | ✓ | ✗ | Yes (TRL) |
| LFM2-VL-450M | ✓ | ✓ | ✓ | ✗ | Yes (TRL) |
| **LFM2-Audio** | | | | | |
| LFM2-Audio-1.5B | ✓ | ✗ | ✗ | ✗ | No |
| **Liquid Nanos** | | | | | |
| LFM2-1.2B-Extract | ✓ | ✓ | ✗ | ✓ | Yes (TRL) |
| LFM2-350M-Extract | ✓ | ✓ | ✗ | ✓ | Yes (TRL) |
| LFM2-350M-ENJP-MT | ✓ | ✓ | ✓ | ✓ | Yes (TRL) |
| LFM2-1.2B-RAG | ✓ | ✓ | ✗ | ✓ | Yes (TRL) |
| LFM2-1.2B-Tool | ✓ | ✓ | ✗ | ✓ | Yes (TRL) |
| LFM2-350M-Math | ✓ | ✓ | ✗ | ✓ | Yes (TRL) |
| LFM2-350M-PII-Extract-JP | ✓ | ✓ | ✗ | ✗ | Yes (TRL) |
| LFM2-ColBERT-350M | ✓ | ✗ | ✗ | ✗ | Yes (PyLate) |
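Most rows above are marked trainable with TRL. As a rough sketch of what a supervised fine-tuning run can look like (the dataset and hyperparameters below are placeholder choices, not an official recipe):

```python
# Hedged sketch of supervised fine-tuning with TRL's SFTTrainer, per the
# "Trainable? Yes (TRL)" column above. Dataset and settings are placeholders.
from datasets import load_dataset
from trl import SFTConfig, SFTTrainer

# Example conversational dataset; substitute your own.
dataset = load_dataset("trl-lib/Capybara", split="train")

trainer = SFTTrainer(
    model="LiquidAI/LFM2-350M",  # any model marked "Yes (TRL)" above
    train_dataset=dataset,
    args=SFTConfig(output_dir="lfm2-350m-sft"),
)
trainer.train()
```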
## 💬 LFM2
LFM2 is a family of general-purpose text-only language models optimized for edge AI and on-device deployment.
| Model | Description |
|---|---|
| LiquidAI/LFM2-8B-A1B | Mixture-of-experts (MoE) model with 8B total parameters and 1.5B active per token for efficient inference. Best performance. |
| LiquidAI/LFM2-2.6B | High-performance model balancing capability and efficiency. |
| LiquidAI/LFM2-1.2B | Compact model for resource-constrained environments. |
| LiquidAI/LFM2-700M | Lightweight model for edge deployment. |
| LiquidAI/LFM2-350M | Tiny model for high-throughput data processing and edge deployment. Fastest inference. |
## 👁️ LFM2-VL
LFM2-VL is a family of Vision-Language Models (VLMs) that take text and images as input and produce text as output. These models are built on the LFM2 text-model backbone with dynamic, user-tunable SigLIP2 NaFlex image encoders (Base 86M and shape-optimized 400M variants).
| Model | Description |
|---|---|
| LiquidAI/LFM2-VL-3B | Highest-capacity multimodal model with enhanced visual understanding and reasoning. |
| LiquidAI/LFM2-VL-1.6B | Fast and capable model for scene understanding and other vision-language tasks. |
| LiquidAI/LFM2-VL-450M | Compact multimodal model for edge deployment and fast inference. |
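A hedged sketch of multimodal inference with one of these checkpoints, assuming a transformers release with LFM2-VL support; the image URL and prompt are illustrative placeholders.

```python
# Sketch: image + text in, text out, via the generic image-text-to-text API.
from transformers import AutoModelForImageTextToText, AutoProcessor
from transformers.image_utils import load_image

model_id = "LiquidAI/LFM2-VL-450M"
processor = AutoProcessor.from_pretrained(model_id)
model = AutoModelForImageTextToText.from_pretrained(model_id, device_map="auto")

image = load_image("https://example.com/photo.jpg")  # placeholder image
conversation = [
    {
        "role": "user",
        "content": [
            {"type": "image", "image": image},
            {"type": "text", "text": "Describe this image briefly."},
        ],
    }
]

inputs = processor.apply_chat_template(
    conversation,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

output = model.generate(**inputs, max_new_tokens=64)
print(processor.batch_decode(output, skip_special_tokens=True)[0])
```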
## 🎵 LFM2-Audio
LFM2-Audio is a family of audio foundation models that support both text and audio as inputs and outputs.
| Model | Description |
|---|---|
| LiquidAI/LFM2-Audio-1.5B | Audio-to-audio model for speech tasks such as spoken chat, ASR, and TTS. |
## 🎯 Liquid Nanos
Liquid Nanos are task-specific models fine-tuned for specialized use cases.
| Model | Description |
|---|---|
| LiquidAI/LFM2-1.2B-Extract | Extract important information from a wide variety of unstructured documents into structured outputs like JSON. |
| LiquidAI/LFM2-350M-Extract | Smaller version of the extraction model. |
| LiquidAI/LFM2-350M-ENJP-MT | Near real-time bidirectional Japanese/English translation of short-to-medium inputs. |
| LiquidAI/LFM2-1.2B-RAG | Answer questions based on provided contextual documents, for use in RAG systems. |
| LiquidAI/LFM2-1.2B-Tool | Efficient model optimized for concise and precise tool calling. See the Tool Use guide for details. |
| LiquidAI/LFM2-350M-Math | Tiny reasoning model designed for tackling tricky math problems. |
| LiquidAI/LFM2-350M-PII-Extract-JP | Extract personally identifiable information (PII) from Japanese text and output it in JSON format. |
| LiquidAI/LFM2-ColBERT-350M | Embed documents and queries for fast retrieval and reranking across many languages. |
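As a rough illustration, the extraction nano can be driven through the standard Transformers chat pipeline. The sample document below is made up, and the model card should be consulted for the exact prompting scheme and output schema.

```python
# Hedged sketch: structured extraction via the text-generation chat pipeline.
# The input document is fabricated for demonstration purposes.
from transformers import pipeline

generator = pipeline("text-generation", model="LiquidAI/LFM2-1.2B-Extract")

document = "Invoice #4521 from Acme Corp, dated 2024-03-15, total $1,250.00."
messages = [{"role": "user", "content": document}]

result = generator(messages, max_new_tokens=256)
# Last turn of the returned conversation holds the structured (e.g., JSON) output.
print(result[0]["generated_text"][-1]["content"])
```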
## GGUF Models
GGUF quantized versions are available for all LFM2 models for efficient inference with llama.cpp, LM Studio, and Ollama. These models offer reduced memory usage and faster CPU inference.
To access our official GGUF models, append -GGUF to any model repository name (e.g., LiquidAI/LFM2-1.2B-GGUF). All models are available in multiple quantization levels (Q4_0, Q4_K_M, Q5_K_M, Q6_K, Q8_0, F16).
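Staying in Python, a hedged sketch using llama-cpp-python to pull an official GGUF repo directly from the Hub; the quantization filename glob is an assumption, so check the repo's file list for exact names.

```python
# Sketch: GGUF inference with llama-cpp-python, downloading from the Hub.
from llama_cpp import Llama

llm = Llama.from_pretrained(
    repo_id="LiquidAI/LFM2-1.2B-GGUF",
    filename="*Q4_K_M.gguf",  # glob for the Q4_K_M quantization (assumed name)
    n_ctx=32768,              # match the 32k context window
)

out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Hello!"}],
    max_tokens=64,
)
print(out["choices"][0]["message"]["content"])
```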
## MLX Models
MLX quantized versions are available for many models in the LFM2 library for efficient inference on Apple Silicon with MLX. These models leverage Apple's unified memory architecture for optimal performance on M-series chips.
Browse all MLX-compatible models in the mlx-community LFM2 collection. All models are available in multiple quantization levels (4-bit, 5-bit, 6-bit, 8-bit, bf16).
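A brief mlx-lm sketch for Apple Silicon follows; the mlx-community repository name is an assumption that follows the usual naming pattern, so verify it on the Hub.

```python
# Sketch: running a 4-bit MLX quantization with mlx-lm on Apple Silicon.
from mlx_lm import generate, load

model, tokenizer = load("mlx-community/LFM2-1.2B-4bit")  # assumed repo name

prompt = tokenizer.apply_chat_template(
    [{"role": "user", "content": "Hello!"}],
    add_generation_prompt=True,
    tokenize=False,
)
print(generate(model, tokenizer, prompt=prompt, max_tokens=64))
```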