# Models
The LFM model collection includes general-purpose language models, vision-language models, task-specific models, and audio models across various parameter sizes.
- These models are built on a new hybrid architecture designed for fast training and inference. Learn more in our blog post.
- All models support a 32k token text context length for extended conversations and document processing.
- Our models are compatible with popular open-source deployment libraries, including Transformers, llama.cpp, vLLM, MLX, and Ollama, as well as our own edge deployment platform, LEAP (see the sketch below).
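For example, loading one of the text models via Transformers looks roughly like the sketch below. This is a minimal illustration, assuming a recent transformers release with LFM2 support; the checkpoint choice and generation settings are placeholders.

```python
# Minimal sketch: running an LFM2 text model with Hugging Face Transformers.
# Assumes a transformers version with LFM2 support; settings are illustrative.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "LiquidAI/LFM2-1.2B"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

# Prompt through the model's built-in chat template.
messages = [{"role": "user", "content": "Summarize what a hybrid model architecture is."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(input_ids, max_new_tokens=128)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```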
## Complete Model Table
| Model | HF | GGUF | MLX | ONNX | Trainable? |
|---|---|---|---|---|---|
| **LFM2 Text Models** | | | | | |
| LFM2-8B-A1B | ✓ | ✓ | ✓ | ✗ | Yes (TRL) |
| LFM2-2.6B | ✓ | ✓ | ✓ | ✓ | Yes (TRL) |
| LFM2-1.2B | ✓ | ✓ | ✓ | ✓ | Yes (TRL) |
| LFM2-700M | ✓ | ✓ | ✓ | ✓ | Yes (TRL) |
| LFM2-350M | ✓ | ✓ | ✓ | ✓ | Yes (TRL) |
| **LFM2-VL Models** | | | | | |
| LFM2-VL-3B | ✓ | ✓ | ✓ | ✗ | Yes (TRL) |
| LFM2-VL-1.6B | ✓ | ✓ | ✓ | ✗ | Yes (TRL) |
| LFM2-VL-450M | ✓ | ✓ | ✓ | ✗ | Yes (TRL) |
| **LFM2-Audio** | | | | | |
| LFM2-Audio-1.5B | ✓ | ✗ | ✗ | ✗ | No |
| **Liquid Nanos** | | | | | |
| LFM2-1.2B-Extract | ✓ | ✓ | ✗ | ✓ | Yes (TRL) |
| LFM2-350M-Extract | ✓ | ✓ | ✗ | ✓ | Yes (TRL) |
| LFM2-350M-ENJP-MT | ✓ | ✓ | ✓ | ✓ | Yes (TRL) |
| LFM2-1.2B-RAG | ✓ | ✓ | ✗ | ✓ | Yes (TRL) |
| LFM2-1.2B-Tool | ✓ | ✓ | ✗ | ✓ | Yes (TRL) |
| LFM2-350M-Math | ✓ | ✓ | ✗ | ✓ | Yes (TRL) |
| LFM2-350M-PII-Extract-JP | ✓ | ✓ | ✗ | ✗ | Yes (TRL) |
| LFM2-ColBERT-350M | ✓ | ✗ | ✗ | ✗ | Yes (PyLate) |
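Most rows above are marked trainable with TRL. As a rough sketch of what a supervised fine-tuning run can look like (the dataset and hyperparameters below are placeholder choices, not an official recipe):

```python
# Hedged sketch of supervised fine-tuning with TRL's SFTTrainer, per the
# "Trainable? Yes (TRL)" column above. Dataset and settings are placeholders.
from datasets import load_dataset
from trl import SFTConfig, SFTTrainer

# Example conversational dataset; substitute your own.
dataset = load_dataset("trl-lib/Capybara", split="train")

trainer = SFTTrainer(
    model="LiquidAI/LFM2-350M",  # any model marked "Yes (TRL)" above
    train_dataset=dataset,
    args=SFTConfig(output_dir="lfm2-350m-sft"),
)
trainer.train()
```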
## 💬 LFM2
LFM2 is a family of general-purpose text-only language models optimized for edge AI and on-device deployment.
| Model | Description |
|---|---|
| LiquidAI/LFM2-8B-A1B | Mixture-of-experts (MoE) model with 8B total parameters and 1.5B active per token for efficient inference. Best performance. |
| LiquidAI/LFM2-2.6B | High-performance model balancing capability and efficiency. |
| LiquidAI/LFM2-1.2B | Compact model for resource-constrained environments. |
| LiquidAI/LFM2-700M | Lightweight model for edge deployment. |
| LiquidAI/LFM2-350M | Tiny model for high-throughput data processing and edge deployment. Fastest inference. |
## 👁️ LFM2-VL
LFM2-VL is a family of Vision-Language Models (VLMs) that take text and images as input and produce text as output. These models are built on the LFM2 text-model backbone with dynamic, user-tunable SigLIP2 NaFlex image encoders (Base 86M and shape-optimized 400M variants).
| Model | Description |
|---|---|
| LiquidAI/LFM2-VL-3B | Highest-capacity multimodal model with enhanced visual understanding and reasoning. |
| LiquidAI/LFM2-VL-1.6B | Fast and capable model for scene understanding and other vision-language tasks. |
| LiquidAI/LFM2-VL-450M | Compact multimodal model for edge deployment and fast inference. |
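A hedged sketch of multimodal inference with one of these checkpoints, assuming a transformers release with LFM2-VL support; the image URL and prompt are illustrative placeholders.

```python
# Sketch: image + text in, text out, via the generic image-text-to-text API.
from transformers import AutoModelForImageTextToText, AutoProcessor
from transformers.image_utils import load_image

model_id = "LiquidAI/LFM2-VL-450M"
processor = AutoProcessor.from_pretrained(model_id)
model = AutoModelForImageTextToText.from_pretrained(model_id, device_map="auto")

image = load_image("https://example.com/photo.jpg")  # placeholder image
conversation = [
    {
        "role": "user",
        "content": [
            {"type": "image", "image": image},
            {"type": "text", "text": "Describe this image briefly."},
        ],
    }
]

inputs = processor.apply_chat_template(
    conversation,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

output = model.generate(**inputs, max_new_tokens=64)
print(processor.batch_decode(output, skip_special_tokens=True)[0])
```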
## 🎵 LFM2-Audio
LFM2-Audio is a family of audio foundation models that support both text and audio as inputs and outputs.
| Model | Description |
|---|---|
| LiquidAI/LFM2-Audio-1.5B | Audio-to-audio model for speech tasks such as spoken chat, ASR, and TTS. |
## 🎯 Liquid Nanos
Liquid Nanos are task-specific models fine-tuned for specialized use cases.
| Model | Description |
|---|---|
| LiquidAI/LFM2-1.2B-Extract | Extract important information from a wide variety of unstructured documents into structured outputs like JSON. |
| LiquidAI/LFM2-350M-Extract | Smaller version of the extraction model. |
| LiquidAI/LFM2-350M-ENJP-MT | Near real-time bidirectional Japanese/English translation of short-to-medium inputs. |
| LiquidAI/LFM2-1.2B-RAG | Answer questions based on provided contextual documents, for use in RAG systems. |
| LiquidAI/LFM2-1.2B-Tool | Efficient model optimized for concise and precise tool calling. See the Tool Use guide for details. |
| LiquidAI/LFM2-350M-Math | Tiny reasoning model designed for tackling tricky math problems. |
| LiquidAI/LFM2-350M-PII-Extract-JP | Extract personally identifiable information (PII) from Japanese text and output it in JSON format. |
| LiquidAI/LFM2-ColBERT-350M | Embed documents and queries for fast retrieval and reranking across many languages. |
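As a rough illustration, the extraction nano can be driven through the standard Transformers chat pipeline. The sample document below is made up, and the model card should be consulted for the exact prompting scheme and output schema.

```python
# Hedged sketch: structured extraction via the text-generation chat pipeline.
# The input document is fabricated for demonstration purposes.
from transformers import pipeline

generator = pipeline("text-generation", model="LiquidAI/LFM2-1.2B-Extract")

document = "Invoice #4521 from Acme Corp, dated 2024-03-15, total $1,250.00."
messages = [{"role": "user", "content": document}]

result = generator(messages, max_new_tokens=256)
# Last turn of the returned conversation holds the structured (e.g., JSON) output.
print(result[0]["generated_text"][-1]["content"])
```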
## GGUF Models
GGUF quantized versions are available for all LFM2 models for efficient inference with llama.cpp, LM Studio, and Ollama. These models offer reduced memory usage and faster CPU inference.
To access our official GGUF models, append -GGUF to any model repository name (e.g., LiquidAI/LFM2-1.2B-GGUF). All models are available in multiple quantization levels (Q4_0, Q4_K_M, Q5_K_M, Q6_K, Q8_0, F16).
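Staying in Python, a hedged sketch using llama-cpp-python to pull an official GGUF repo directly from the Hub; the quantization filename glob is an assumption, so check the repo's file list for exact names.

```python
# Sketch: GGUF inference with llama-cpp-python, downloading from the Hub.
from llama_cpp import Llama

llm = Llama.from_pretrained(
    repo_id="LiquidAI/LFM2-1.2B-GGUF",
    filename="*Q4_K_M.gguf",  # glob for the Q4_K_M quantization (assumed name)
    n_ctx=32768,              # match the 32k context window
)

out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Hello!"}],
    max_tokens=64,
)
print(out["choices"][0]["message"]["content"])
```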
## MLX Models
MLX quantized versions are available for many models in the LFM2 library for efficient inference on Apple Silicon with MLX. These models leverage Apple's unified memory architecture for optimal performance on M-series chips.
Browse all MLX-compatible models in the mlx-community LFM2 collection. All models are available in multiple quantization levels (4-bit, 5-bit, 6-bit, 8-bit, bf16).
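A brief mlx-lm sketch for Apple Silicon follows; the mlx-community repository name is an assumption that follows the usual naming pattern, so verify it on the Hub.

```python
# Sketch: running a 4-bit MLX quantization with mlx-lm on Apple Silicon.
from mlx_lm import generate, load

model, tokenizer = load("mlx-community/LFM2-1.2B-4bit")  # assumed repo name

prompt = tokenizer.apply_chat_template(
    [{"role": "user", "content": "Hello!"}],
    add_generation_prompt=True,
    tokenize=False,
)
print(generate(model, tokenizer, prompt=prompt, max_tokens=64))
```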