350M late-interaction ColBERT model for multilingual retrieval and reranking
LFM2.5-ColBERT-350M is a late-interaction retrieval model. It creates 128-dimensional vectors per token and scores query/document matches with MaxSim, which improves retrieval quality and generalization at the cost of a larger index.
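To make the MaxSim scoring above concrete, here is a minimal pure-Python sketch of the standard ColBERT formulation: for each query token, take the best dot product against any document token, then sum those maxima. It is an illustration only, not PyLate's implementation. The helper names (`dot`, `maxsim_score`) and the toy 2-dimensional vectors are assumptions made for readability; the real model emits 128-dimensional vectors, and the library may normalize or batch them differently.

```python
def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def maxsim_score(query_vectors, document_vectors):
    """Sum, over query tokens, of the best dot product with any document token."""
    return sum(
        max(dot(q, d) for d in document_vectors)
        for q in query_vectors
    )


# Toy 2-dimensional token vectors (hypothetical values, for illustration only).
query = [[1.0, 0.0], [0.0, 1.0]]
doc_a = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]  # has a strong match for each query token
doc_b = [[0.5, 0.5], [-1.0, 0.0]]             # only partial matches

score_a = maxsim_score(query, doc_a)
score_b = maxsim_score(query, doc_b)
print(score_a, score_b)  # doc_a scores higher than doc_b
```

Because every query token is matched against every document token, the index has to keep one vector per document token, which is where the larger index comes from.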
Use LFM2.5-ColBERT-350M when you want stronger retrieval or reranking quality and can afford a larger per-token index. Use LFM2.5-Embedding-350M when you need the smallest, fastest dense-vector index.
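A rough back-of-the-envelope estimate can help with that choice. The sketch below compares raw, uncompressed vector storage for a per-token index against a single-vector index. Only the 128-dimensional per-token size comes from this page; the corpus size, tokens per document, dense embedding dimension, and float16 storage are hypothetical placeholders to replace with your own numbers. Index formats such as PLAID compress token vectors, so real on-disk sizes will be smaller than this raw figure.

```python
def raw_index_bytes(num_documents, vectors_per_document, dimensions, bytes_per_value):
    """Raw storage for the vectors alone, ignoring compression and metadata."""
    return num_documents * vectors_per_document * dimensions * bytes_per_value


# Hypothetical workload, not taken from the model documentation.
NUM_DOCUMENTS = 1_000_000
TOKENS_PER_DOCUMENT = 200   # assumed average document length in tokens
DENSE_DIMENSIONS = 1024     # assumed size of a single dense embedding
BYTES_PER_VALUE = 2         # assumed float16 storage

COLBERT_DIMENSIONS = 128    # per-token vector size stated on this page

late_interaction_bytes = raw_index_bytes(
    NUM_DOCUMENTS, TOKENS_PER_DOCUMENT, COLBERT_DIMENSIONS, BYTES_PER_VALUE
)
dense_bytes = raw_index_bytes(
    NUM_DOCUMENTS, 1, DENSE_DIMENSIONS, BYTES_PER_VALUE
)
ratio = late_interaction_bytes / dense_bytes

print(f"late interaction: {late_interaction_bytes / 1e9:.1f} GB")
print(f"dense:            {dense_bytes / 1e9:.1f} GB")
print(f"ratio:            {ratio:.0f}x")
```

Under these assumed numbers the per-token index is about 25 times larger before compression; the ratio scales directly with average document length.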
This model uses PyLate for indexing, retrieval, and reranking.
Install:
pip install -U pylate
Index and retrieve documents:
    from pylate import indexes, models, retrieve

    # Load the ColBERT model.
    model = models.ColBERT(
        model_name_or_path="LiquidAI/LFM2.5-ColBERT-350M",
        trust_remote_code=True,
    )
    # Use the end-of-sequence token for padding.
    model.tokenizer.pad_token = model.tokenizer.eos_token

    # Create a PLAID index on disk; override=True replaces any existing index.
    index = indexes.PLAID(
        index_folder="pylate-index",
        index_name="index",
        override=True,
    )

    documents_ids = ["1", "2", "3"]
    documents = [
        "Paris is the capital of France.",
        "Tokyo is the capital of Japan.",
        "Berlin is the capital of Germany.",
    ]

    # Encode the documents (is_query=False) and add them to the index.
    document_embeddings = model.encode(
        documents,
        batch_size=32,
        is_query=False,
        show_progress_bar=True,
    )
    index.add_documents(
        documents_ids=documents_ids,
        documents_embeddings=document_embeddings,
    )

    # Encode the query (is_query=True) and retrieve the top-k documents.
    query_embeddings = model.encode(
        ["Which city is Japan's capital?"],
        batch_size=32,
        is_query=True,
        show_progress_bar=True,
    )
    retriever = retrieve.ColBERT(index=index)
    results = retriever.retrieve(queries_embeddings=query_embeddings, k=10)
    print(results)
Rerank candidate documents without building an index:

    from pylate import models, rank

    # Load the ColBERT model.
    model = models.ColBERT(
        model_name_or_path="LiquidAI/LFM2.5-ColBERT-350M",
        trust_remote_code=True,
    )

    # One list of candidate documents (and matching ids) per query.
    queries = ["Which city is Japan's capital?"]
    documents = [[
        "Paris is the capital of France.",
        "Tokyo is the capital of Japan.",
        "Berlin is the capital of Germany.",
    ]]
    document_ids = [["fr", "jp", "de"]]

    # Encode queries and candidates, then rerank the candidates for each query.
    query_embeddings = model.encode(queries, is_query=True)
    document_embeddings = model.encode(documents, is_query=False)
    reranked = rank.rerank(
        documents_ids=document_ids,
        queries_embeddings=query_embeddings,
        documents_embeddings=document_embeddings,
    )
    print(reranked)