LFM2.5-8B-A1B is Liquid AI’s Mixture-of-Experts (MoE) model. It has 8B total parameters, of which only 1.5B are active per forward pass, and it supports a 128K-token context window and chain-of-thought reasoning. The model is built for strong performance in tool calling and agentic tasks while running on-device.

Specifications

| Property | Value |
| --- | --- |
| Parameters | 8B (1.5B active) |
| Context Length | 128K tokens |
| Architecture | LFM2.5 (MoE) |

128K Context

Extended context window for long documents and conversations

MoE Efficiency

8B quality, 1.5B inference cost

Tool Calling

Native function calling for agentic workflows
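
To make the "8B quality, 1.5B inference cost" claim concrete, here is a rough back-of-envelope sketch. It assumes bfloat16 weights (2 bytes per parameter) and the common rule of thumb of about 2 FLOPs per active parameter per generated token. Neither figure comes from this page, and the function names are purely illustrative. Treat the results as estimates, not measurements.

```python
# Back-of-envelope estimates for a Mixture-of-Experts model.
# Parameter counts come from the specifications above; everything else
# is an assumption made for illustration.

TOTAL_PARAMS = 8e9       # total parameters
ACTIVE_PARAMS = 1.5e9    # parameters active per forward pass
BYTES_PER_PARAM = 2      # assumption: bfloat16 weights


def active_fraction(total: float, active: float) -> float:
    """Share of the parameters that take part in one forward pass."""
    return active / total


def weight_memory_gb(total: float, bytes_per_param: int) -> float:
    """Approximate weight memory in GB (10^9 bytes); all experts must be loaded."""
    return total * bytes_per_param / 1e9


def flops_per_token(active: float) -> float:
    """Rule of thumb (assumption): ~2 FLOPs per active parameter per token."""
    return 2 * active


if __name__ == "__main__":
    print(f"Active fraction: {active_fraction(TOTAL_PARAMS, ACTIVE_PARAMS):.2%}")
    print(f"Weight memory:   {weight_memory_gb(TOTAL_PARAMS, BYTES_PER_PARAM):.1f} GB")
    print(f"Compute/token:   {flops_per_token(ACTIVE_PARAMS):.2e} FLOPs")
```

Under these assumptions, the per-token compute saving applies to the 1.5B active parameters, while memory still scales with all 8B parameters, since every expert has to be resident. Quantized weights would lower the memory estimate accordingly.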

Quick Start

The following example runs the model with Hugging Face Transformers (compatible with transformers>=5.0.0):
from transformers import AutoModelForCausalLM, AutoTokenizer, TextStreamer

model_id = "LiquidAI/LFM2.5-8B-A1B"
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",
    dtype="bfloat16",
    # attn_implementation="flash_attention_2",  # uncomment on a compatible GPU
)
tokenizer = AutoTokenizer.from_pretrained(model_id)
streamer = TextStreamer(tokenizer, skip_prompt=True, skip_special_tokens=True)

prompt = "What is C. elegans?"

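# Tokenize the chat prompt. return_dict=True makes the return type explicit:
# a dict-like object holding both input_ids and attention_mask.
inputs = tokenizer.apply_chat_template(
    [{"role": "user", "content": prompt}],
    add_generation_prompt=True,
    return_tensors="pt",
    tokenize=True,
    return_dict=True,
).to(model.device)

# Generate with sampling; the streamer prints tokens as they are produced.
output = model.generate(
    **inputs,
    do_sample=True,
    temperature=0.2,
    top_k=80,
    repetition_penalty=1.05,
    max_new_tokens=8192,
    streamer=streamer,
)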
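
The prompt and the generated tokens share the same context window, so it can help to check the token budget before calling `generate`. The sketch below is one possible way to do that. It assumes "128K" means 131,072 tokens (128 × 1024), which this page does not state. `CONTEXT_WINDOW`, `prompt_token_budget`, and `fits_in_context` are hypothetical names, not part of any library.

```python
# Hypothetical helpers for budgeting tokens against the context window.
CONTEXT_WINDOW = 128 * 1024  # assumption: "128K" = 131,072 tokens


def prompt_token_budget(max_new_tokens: int,
                        context_window: int = CONTEXT_WINDOW) -> int:
    """Return how many prompt tokens fit alongside max_new_tokens."""
    if max_new_tokens < 0:
        raise ValueError("max_new_tokens must be non-negative")
    if max_new_tokens >= context_window:
        raise ValueError("max_new_tokens leaves no room for a prompt")
    return context_window - max_new_tokens


def fits_in_context(prompt_tokens: int, max_new_tokens: int,
                    context_window: int = CONTEXT_WINDOW) -> bool:
    """True if the prompt plus the requested new tokens fit in the window."""
    return prompt_tokens + max_new_tokens <= context_window


if __name__ == "__main__":
    # With max_new_tokens=8192, as in the quick start above:
    print(prompt_token_budget(8192))
    print(fits_in_context(prompt_tokens=120_000, max_new_tokens=8192))
```

In practice, `prompt_tokens` would be the length of the tokenized prompt, and `max_new_tokens` could be lowered when a long document leaves little room for the response.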