LFM2-350M

LFM2-350M is Liquid AI’s smallest text model, designed for edge devices with strict memory and compute constraints. It delivers strong performance for its size, which makes it a good fit for low-latency applications.

Available as: HF, GGUF, MLX, ONNX

Specifications

Property	Value
Parameters	350M
Context Length	32K tokens
Architecture	LFM2 (Dense)

Ultra-Light

Minimal memory and compute footprint

Low Latency

Fastest inference in the LFM family

Edge-Ready

Runs on IoT and embedded devices
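To put the footprint claim in rough numbers, the sketch below estimates weight memory from the 350M parameter count in the specifications. It is a back-of-the-envelope calculation, not a measurement: it treats 350M as an exact count, assumes a fixed number of bytes per parameter, and ignores the KV cache, activations, and runtime overhead. The helper name `weight_memory_mb` is made up for this example, and real quantized files will differ because they mix precisions.

```python
# Rough weight-memory estimate for a 350M-parameter model.
# Assumption: every parameter is stored at the same precision.
PARAMS = 350_000_000  # nominal count from the specifications table

BYTES_PER_PARAM = {
    "float32": 4.0,
    "bfloat16": 2.0,  # the dtype used in the Transformers quick start
    "int8": 1.0,
    "int4": 0.5,
}


def weight_memory_mb(num_params: int, dtype: str) -> float:
    """Return the approximate weight size in megabytes (10^6 bytes)."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e6


for dtype in BYTES_PER_PARAM:
    print(f"{dtype:>8}: ~{weight_memory_mb(PARAMS, dtype):,.0f} MB")
```

Under these assumptions the bfloat16 weights come to about 700 MB, and a 4-bit quantization to well under 200 MB.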

Quick Start

The model can be run with Transformers, llama.cpp, or vLLM.

Transformers

Install:

pip install "transformers>=5.0.0" torch accelerate

Download & Run:

from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "LiquidAI/LFM2-350M"
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",
    dtype="bfloat16",
)
tokenizer = AutoTokenizer.from_pretrained(model_id)

inputs = tokenizer.apply_chat_template(
    [{"role": "user", "content": "What is machine learning?"}],
    add_generation_prompt=True,
    return_tensors="pt",
    tokenize=True,
    return_dict=True,
).to(model.device)

output = model.generate(
    **inputs,
    do_sample=True,
    temperature=0.3,
    min_p=0.15,
    repetition_penalty=1.05,
    max_new_tokens=512,
)
# Decode only the newly generated tokens, skipping the prompt
input_length = inputs["input_ids"].shape[1]
response = tokenizer.decode(output[0][input_length:], skip_special_tokens=True)
print(response)

llama.cpp

Install:

brew install llama.cpp

Run:

llama-cli -hf LiquidAI/LFM2-350M-GGUF -c 4096 --color -i \
    --temp 0.3 --min-p 0.15 --repeat-penalty 1.05

The -hf flag downloads the model directly from Hugging Face. For other installation methods and advanced usage, see the llama.cpp guide.

vLLM

Install:

pip install vllm==0.14

Run:

from vllm import LLM, SamplingParams

llm = LLM(model="LiquidAI/LFM2-350M")

sampling_params = SamplingParams(temperature=0.3, min_p=0.15, repetition_penalty=1.05, max_tokens=512)

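# llm.chat expects a list of chat messages rather than a bare string
messages = [{"role": "user", "content": "What is machine learning?"}]
output = llm.chat(messages, sampling_params)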
print(output[0].outputs[0].text)
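The three quick starts use the same sampling settings (temperature 0.3, min_p 0.15, repetition penalty 1.05, up to 512 new tokens), but each runtime names them differently. The sketch below keeps the values in one place and translates them per runtime. The helper names are hypothetical and exist only for this example; the parameter and flag names are the ones used in the commands above.

```python
# Shared sampling settings taken from the quick-start examples.
SAMPLING = {"temperature": 0.3, "min_p": 0.15, "repetition_penalty": 1.05}
MAX_NEW_TOKENS = 512


def transformers_generate_kwargs() -> dict:
    """Keyword arguments for model.generate(...) in Transformers."""
    return {"do_sample": True, **SAMPLING, "max_new_tokens": MAX_NEW_TOKENS}


def vllm_sampling_kwargs() -> dict:
    """Keyword arguments for vllm.SamplingParams(...)."""
    return {**SAMPLING, "max_tokens": MAX_NEW_TOKENS}


def llama_cli_flags() -> list[str]:
    """Equivalent llama-cli flags, which use different option names."""
    return [
        "--temp", str(SAMPLING["temperature"]),
        "--min-p", str(SAMPLING["min_p"]),
        "--repeat-penalty", str(SAMPLING["repetition_penalty"]),
    ]


if __name__ == "__main__":
    print(transformers_generate_kwargs())
    print(vllm_sampling_kwargs())
    print(" ".join(llama_cli_flags()))
```

With these helpers, `model.generate(**inputs, **transformers_generate_kwargs())` and `SamplingParams(**vllm_sampling_kwargs())` would reproduce the calls shown in the quick starts.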
