LFM2-350M-PII-Extract-JP

LFM2-350M-PII-Extract-JP extracts personally identifiable information (PII) from Japanese text as structured JSON. Output can be used to mask sensitive information on-device for privacy-preserving applications.

Specifications

Property	Value
Parameters	350M
Context Length	32K tokens
Task	PII Detection
Language	Japanese

Privacy Protection: on-device PII masking

Compliance: data protection compliance

Document Redaction: automated redaction

Prompting Recipe

Use temperature=0 (greedy decoding) for best results. This model is intended for single-turn conversations only.

System Prompt Format:

Extract <address>, <company_name>, <email_address>, <human_name>, <phone_number>

To extract only specific entities, list only the categories you need (e.g., Extract <human_name>). List categories in alphabetical order for optimal performance.

Output Format: JSON with one list per category. A category with no matches is returned as an empty list. Entities are output exactly as they appear in the input (including notation variations), so they can be masked by exact string matching.
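The prompt rules above can be sketched as a small helper. This is a minimal illustration, not part of the model's tooling: `SUPPORTED_CATEGORIES` and `build_system_prompt` are hypothetical names, and the category list is assumed to be exactly the five tags shown in the system prompt format.

```python
# Hypothetical helper: the category names are copied from the system prompt
# format above; the function itself is an illustrative assumption.
SUPPORTED_CATEGORIES = (
    "address",
    "company_name",
    "email_address",
    "human_name",
    "phone_number",
)


def build_system_prompt(categories):
    """Return 'Extract <a>, <b>, ...' with categories deduplicated and sorted."""
    requested = set(categories)
    if not requested:
        raise ValueError("At least one category is required")
    unknown = requested - set(SUPPORTED_CATEGORIES)
    if unknown:
        raise ValueError(f"Unsupported categories: {sorted(unknown)}")
    # Alphabetical order, as recommended in the prompting recipe.
    ordered = sorted(requested)
    return "Extract " + ", ".join(f"<{name}>" for name in ordered)


print(build_system_prompt(["phone_number", "human_name"]))
```

Sorting inside the helper means callers can pass categories in any order and still get the alphabetical ordering the recipe recommends.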

Quick Start

Transformers

Install:

pip install "transformers>=5.0.0" torch accelerate

Run:

from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "LiquidAI/LFM2-350M-PII-Extract-JP"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

system_prompt = "Extract <address>, <company_name>, <email_address>, <human_name>, <phone_number>"

user_input = """こんにちは、ラミンさんに B200 GPU を 10000 台 至急請求してください。
連絡先は celegans@liquid.ai (電話番号010-000-0000) で、これは C. elegans
線虫に着想を得たニューラルネットワークアーキテクチャを 今すぐ構築するために不可欠です。"""

messages = [
    {"role": "system", "content": system_prompt},
    {"role": "user", "content": user_input}
]

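inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,  # append the assistant turn marker so the model starts its answer
    tokenize=True,
    return_tensors="pt",
    return_dict=True,
).to(model.device)
# Greedy decoding (do_sample=False) is the equivalent of temperature=0.
outputs = model.generate(**inputs, max_new_tokens=256, do_sample=False)
# Decode only the newly generated tokens, not the echoed prompt.
prompt_length = inputs["input_ids"].shape[-1]
response = tokenizer.decode(outputs[0][prompt_length:], skip_special_tokens=True)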
print(response)
# Output: {"address": [], "company_name": [], "email_address": ["celegans@liquid.ai"],
#          "human_name": ["ラミン"], "phone_number": ["010-000-0000"]}
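Since entities are returned exactly as they appear in the input, the JSON can drive exact-match masking. The following is a minimal sketch under stated assumptions: `mask_pii` and the `[category]` placeholder style are hypothetical choices, the sample text is made up for illustration (reusing the entity values from the example output above), and the model response is assumed to be valid JSON.

```python
import json


def mask_pii(text, model_output, placeholder="[{category}]"):
    """Replace every extracted entity in `text` with a category placeholder."""
    entities = json.loads(model_output)
    # Pair each entity string with its category, skipping empty strings.
    pairs = [
        (value, category)
        for category, values in entities.items()
        for value in values
        if value
    ]
    # Replace longer entities first so a shorter entity never clobbers
    # part of a longer one that contains it.
    pairs.sort(key=lambda pair: len(pair[0]), reverse=True)
    for value, category in pairs:
        text = text.replace(value, placeholder.format(category=category))
    return text


# Made-up sample input; the JSON mirrors the example output shown above.
sample_text = "連絡先は celegans@liquid.ai (電話番号010-000-0000) です。ラミンさんによろしく。"
sample_output = (
    '{"address": [], "company_name": [], "email_address": ["celegans@liquid.ai"], '
    '"human_name": ["ラミン"], "phone_number": ["010-000-0000"]}'
)
masked = mask_pii(sample_text, sample_output)
print(masked)
```

In practice it may be worth wrapping `json.loads` in error handling, since any generated output can occasionally be malformed.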

llama.cpp

Download GGUF:

hf download LiquidAI/LFM2-350M-PII-Extract-JP-GGUF \
  --local-dir ./LFM2-350M-PII-Extract-JP-GGUF

Run:

llama-cli -m ./LFM2-350M-PII-Extract-JP-GGUF/LFM2-350M-PII-Extract-JP-Q4_K_M.gguf \
  -sys "Extract <human_name>, <phone_number>" \
  -p "田中太郎様、電話番号は090-1234-5678です。" \
  --temp 0
