LFM2.5-ColBERT-350M is a late-interaction retrieval model. It produces one 128-dimensional vector per token and scores query/document matches with MaxSim, which improves retrieval quality and generalization at the cost of a larger index.
Use LFM2.5-ColBERT-350M when you want stronger retrieval or reranking quality and can afford a larger per-token index. Use LFM2.5-Embedding-350M when you need the smallest, fastest dense-vector index.
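
To make the MaxSim scoring mentioned above concrete, here is a minimal pure-Python sketch. It assumes the common late-interaction definition: for each query token, take the highest dot product against any document token, then sum those maxima. The helper names `dot` and `maxsim` and the 2-dimensional toy vectors are illustrative assumptions, not part of the model or of PyLate.

```python
from typing import List

Vector = List[float]


def dot(a: Vector, b: Vector) -> float:
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def maxsim(query_tokens: List[Vector], doc_tokens: List[Vector]) -> float:
    """Sum, over query tokens, of the best dot product against any document token."""
    return sum(max(dot(q, d) for d in doc_tokens) for q in query_tokens)


# Toy 2-dimensional token vectors (illustrative values, not model output).
query = [[1.0, 0.0], [0.0, 1.0]]
doc_a = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
doc_b = [[0.5, 0.5], [0.5, 0.5]]

score_a = maxsim(query, doc_a)
score_b = maxsim(query, doc_b)
print(score_a, score_b)
```

In this toy case `doc_a` scores higher because each query token finds an exact match among its tokens, which is the kind of token-level signal a single pooled vector can blur.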

Specifications

| Property | Value |
| --- | --- |
| Parameters | ~353M |
| Type | Late interaction |
| Document Length | 512 tokens |
| Query Length | 32 tokens |
| Output | 128-dimensional vector per token |
| Similarity | MaxSim |
| Supported Languages | English, Spanish, German, French, Italian, Portuguese, Arabic, Swedish, Norwegian, Japanese, Korean |
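
The specifications above give a rough sense of why a per-token index is larger. The following back-of-the-envelope sketch computes the raw, uncompressed size of one document's embeddings. It assumes 2 bytes per value (float16), which is an assumption rather than a documented storage format, and the helper `raw_embedding_bytes` is hypothetical. A PLAID index compresses embeddings, so actual on-disk usage should be lower than this upper bound.

```python
def raw_embedding_bytes(num_tokens: int, dim: int = 128, bytes_per_value: int = 2) -> int:
    """Uncompressed size of num_tokens vectors of width dim."""
    return num_tokens * dim * bytes_per_value


# A document at the 512-token limit stores 512 vectors of 128 values each.
full_document = raw_embedding_bytes(512)
# For comparison: a single vector of the same width and precision.
one_vector = raw_embedding_bytes(1)

print(full_document, one_vector, full_document // one_vector)
```

Under these assumptions a maximum-length document takes 131,072 bytes (128 KiB) before compression, 512 times the size of a single vector of the same width.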

High-Quality Retrieval

Better matching from token-level interactions.

Reranking

Reorder candidates from a first-stage retriever.

Enterprise RAG

Strong multilingual document matching.
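
The reranking use case can be sketched generically: a first-stage retriever proposes candidate IDs, a stronger scorer rescores each candidate, and the candidates are sorted by the new score. In the sketch below, `rerank`, `score_fn`, and the toy scores are hypothetical stand-ins. In practice the scorer would be MaxSim over this model's query and document embeddings.

```python
from typing import Callable, List, Tuple


def rerank(
    candidate_ids: List[str],
    score_fn: Callable[[str], float],
    top_k: int,
) -> List[Tuple[str, float]]:
    """Rescore first-stage candidates and return the top_k by descending score."""
    scored = [(doc_id, score_fn(doc_id)) for doc_id in candidate_ids]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Candidate order as returned by a hypothetical first-stage retriever.
first_stage = ["d7", "d2", "d9"]
# Placeholder second-stage scores (illustrative values, not model output).
toy_scores = {"d7": 0.4, "d2": 0.9, "d9": 0.1}

reranked = rerank(first_stage, toy_scores.__getitem__, top_k=2)
print(reranked)
```

Because only the shortlisted candidates are rescored, the cost of the stronger scorer grows with the size of the candidate list rather than with the size of the corpus.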

Quick Start

This model uses PyLate for indexing, retrieval, and reranking.
Install:
pip install -U pylate
Index and retrieve documents:
from pylate import indexes, models, retrieve

# Load the model from the Hugging Face Hub.
model = models.ColBERT(
    model_name_or_path="LiquidAI/LFM2.5-ColBERT-350M",
    trust_remote_code=True,
)
# Use the EOS token for padding.
model.tokenizer.pad_token = model.tokenizer.eos_token

# Create a PLAID index on disk; override=True replaces an existing index of the same name.
index = indexes.PLAID(
    index_folder="pylate-index",
    index_name="index",
    override=True,
)

documents_ids = ["1", "2", "3"]
documents = [
    "Paris is the capital of France.",
    "Tokyo is the capital of Japan.",
    "Berlin is the capital of Germany.",
]

# Encode documents (is_query=False) and add them to the index.
document_embeddings = model.encode(
    documents,
    batch_size=32,
    is_query=False,
    show_progress_bar=True,
)
index.add_documents(
    documents_ids=documents_ids,
    documents_embeddings=document_embeddings,
)

# Encode queries (is_query=True).
query_embeddings = model.encode(
    ["Which city is Japan's capital?"],
    batch_size=32,
    is_query=True,
    show_progress_bar=True,
)

# Retrieve up to k documents per query from the index.
retriever = retrieve.ColBERT(index=index)
results = retriever.retrieve(query_embeddings=query_embeddings, k=10)
print(results)
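
As a follow-up to printing the raw results, the sketch below pulls out the best document IDs per query. It assumes `results` is a list with one entry per query, each entry being a list of dictionaries with `"id"` and `"score"` keys. Check that shape against your installed PyLate version. The helper `top_ids` is hypothetical and the sample scores are placeholders, not real model output.

```python
from typing import Dict, List, Union

Hit = Dict[str, Union[str, float]]


def top_ids(results: List[List[Hit]], k: int = 1) -> List[List[str]]:
    """Return the ids of the k highest-scoring hits for each query."""
    ranked_ids = []
    for hits in results:
        # Sort explicitly rather than relying on the retriever's ordering.
        ordered = sorted(hits, key=lambda hit: hit["score"], reverse=True)
        ranked_ids.append([str(hit["id"]) for hit in ordered[:k]])
    return ranked_ids


# Placeholder structure mimicking one query over the three example documents.
sample_results = [
    [
        {"id": "1", "score": 1.5},
        {"id": "2", "score": 3.0},
        {"id": "3", "score": 1.0},
    ]
]

print(top_ids(sample_results, k=2))
```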