LFM2.5-VL-450M-Extract

LFM2.5-VL-450M-Extract is the compact vision extraction Nano for turning images into structured JSON. It uses the same YAML-schema prompting pattern as the larger VL Extract model while keeping the deployment footprint small.

Specifications

Property	Value
Parameters	450M total (350M LM + ~100M vision encoder)
Context Length	128K tokens
Image Input	Single image, dynamic resolution
Task	Vision structured extraction
Output Format	JSON

Edge Extraction

Structured image extraction for small devices.

Visual Tagging

Label image attributes with schema control.

Low Latency

Fast extraction for high-volume workflows.

Prompting Recipe

Use greedy decoding for best results. This model is intended for single-image, single-turn extraction.

Describe the fields to extract as YAML in the system prompt, then provide the image as the user message.

wood_color: The overall coloration of the wood surface
wood_texture: The tactile quality of the wood surface
wood_pattern: The pattern types visible on the wood surface
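The recipe above can be sketched as a small helper that renders a field dictionary into the system prompt. This is a minimal sketch, not an official utility: `build_system_prompt` is a hypothetical name, the surrounding prompt wording is copied from the Quick Start on this page, and it assumes each description is a plain single-line string.

```python
def build_system_prompt(fields: dict[str, str]) -> str:
    """Render field descriptions as `name: description` lines inside the extraction prompt."""
    # One YAML-style line per field, in the order the dictionary provides them.
    fields_yaml = "\n".join(f"{name}: {description}" for name, description in fields.items())
    return (
        "Extract the following from the image:\n\n"
        f"{fields_yaml}\n\n"
        "Respond with only a JSON object. Do not include any text outside the JSON."
    )


fields = {
    "wood_color": "The overall coloration of the wood surface",
    "wood_texture": "The tactile quality of the wood surface",
    "wood_pattern": "The pattern types visible on the wood surface",
}
system_prompt = build_system_prompt(fields)
print(system_prompt)
```

Keeping the schema in a dictionary makes it easy to reuse the same field names later when checking the keys of the returned JSON.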

Quick Start

Transformers

Install:

pip install "transformers>=5.1.0" pillow torch accelerate

Run:

from transformers import AutoModelForImageTextToText, AutoProcessor
from transformers.image_utils import load_image

model_id = "LiquidAI/LFM2.5-VL-450M-Extract"
model = AutoModelForImageTextToText.from_pretrained(
    model_id,
    device_map="auto",
    dtype="bfloat16",
    trust_remote_code=True,
)
processor = AutoProcessor.from_pretrained(model_id, trust_remote_code=True)

image = load_image("https://huggingface.co/LiquidAI/LFM2.5-VL-450M-Extract/resolve/main/sample_image.png")
fields_yaml = """wood_color: The overall coloration of the wood surface
wood_texture: The tactile quality of the wood surface
wood_pattern: The pattern types visible on the wood surface"""

system_prompt = f"""Extract the following from the image:

{fields_yaml}

Respond with only a JSON object. Do not include any text outside the JSON."""

conversation = [
    {"role": "system", "content": system_prompt},
    {"role": "user", "content": [{"type": "image", "image": image}]},
]

inputs = processor.apply_chat_template(
    conversation,
    add_generation_prompt=True,
    return_tensors="pt",
    return_dict=True,
    tokenize=True,
).to(model.device)

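# Greedy decoding (do_sample=False), as recommended for this model.
outputs = model.generate(**inputs, max_new_tokens=512, do_sample=False)
# Decode only the newly generated tokens, skipping the prompt.
response = processor.batch_decode(
    outputs[:, inputs["input_ids"].shape[1]:],
    skip_special_tokens=True,
)[0]
print(response)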
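Since the response is expected to be a JSON object, downstream code will usually want to parse it and confirm that every requested field is present. The following is a minimal sketch of that step, not part of the model's API: `parse_extraction` is a hypothetical helper, and the sample reply is made up for illustration, so real model output will differ.

```python
import json


def parse_extraction(response: str, expected_fields: list[str]) -> dict:
    """Parse the model's reply as JSON and report any schema fields that are missing."""
    text = response.strip()
    # Tolerate stray text around the object by keeping only the outermost {...} span.
    start, end = text.find("{"), text.rfind("}")
    if start == -1 or end < start:
        raise ValueError(f"no JSON object found in response: {response!r}")
    data = json.loads(text[start : end + 1])
    missing = [name for name in expected_fields if name not in data]
    if missing:
        raise KeyError(f"response is missing fields: {missing}")
    return data


# Hypothetical reply, invented for this example; it is not actual model output.
sample_response = '{"wood_color": "brown", "wood_texture": "smooth", "wood_pattern": ["grain"]}'
result = parse_extraction(sample_response, ["wood_color", "wood_texture", "wood_pattern"])
print(result["wood_color"])
```

Raising on missing fields keeps malformed extractions from silently entering a high-volume pipeline. Whether to retry, skip, or log such cases is an application-level choice.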

Run with llama.cpp (GGUF):

llama-server -hf LiquidAI/LFM2.5-VL-450M-Extract-GGUF:Q4_K_M

llama-cli -hf LiquidAI/LFM2.5-VL-450M-Extract-GGUF:F16 \
  -p "wood_color: The overall coloration of the wood surface" \
  --image ./image.jpg
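Once `llama-server` is running, it can be queried over its OpenAI-compatible chat endpoint. The sketch below uses only the Python standard library and rests on several assumptions: the server listens on the default `http://localhost:8080`, the build supports image input for this model, and images are accepted as base64 data URLs in `image_url` content parts. `build_payload` and `extract` are hypothetical helper names.

```python
import base64
import json
import urllib.request

# Assumption: llama-server is running locally on its default port.
SERVER_URL = "http://localhost:8080/v1/chat/completions"


def build_payload(system_prompt: str, image_bytes: bytes, mime_type: str = "image/jpeg") -> dict:
    """Build an OpenAI-style chat request with the image embedded as a base64 data URL."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    data_url = f"data:{mime_type};base64,{encoded}"
    return {
        "messages": [
            # The YAML field list goes in the system prompt, as in the recipe above.
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": [{"type": "image_url", "image_url": {"url": data_url}}]},
        ],
        "temperature": 0,  # temperature 0 selects greedy decoding
        "max_tokens": 512,
    }


def extract(system_prompt: str, image_path: str) -> str:
    """Send one image to the server and return the raw text of the reply."""
    with open(image_path, "rb") as f:
        payload = build_payload(system_prompt, f.read())
    request = urllib.request.Request(
        SERVER_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as reply:
        body = json.load(reply)
    return body["choices"][0]["message"]["content"]
```

Calling `extract(system_prompt, "./image.jpg")` with the same system prompt used in the Transformers example should return the model's JSON text, which can then be parsed with `json.loads`.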
