Invoice extractor tool

A Python CLI that monitors a folder for new invoice files and automatically extracts key information from them using two Liquid Foundational Models.

This is a practical example of building agentic workflows that run entirely on your local machine: no API keys, no cloud costs, and no private data shared with third parties.

What's inside?

In this example, you will learn how to:

  • Chain multiple Liquid Foundational Models to build a complete workflow that processes visual data (invoice images) and extracts structured information
  • Set up local AI inference using Ollama to run Liquid models entirely on your machine without requiring cloud services or API keys
  • Build a file monitoring system that automatically processes new files dropped into a directory
  • Extract text from images using the LFM2-VL-3B vision-language model for optical character recognition
  • Transform unstructured text into structured data using the LFM2-1.2B-Extract model for information extraction
  • Create agentic workflows that combine multiple AI models to solve real-world business problems while keeping your data private and secure
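The file-monitoring idea above can be sketched as a simple polling loop. This is a minimal illustration, not the tool's actual implementation; the set of image extensions is an assumption:

```python
import time
from pathlib import Path

IMAGE_EXTENSIONS = {".png", ".jpg", ".jpeg"}  # assumed invoice image types

def find_new_images(directory, seen):
    """Return image files in `directory` that have not been seen yet."""
    new_files = []
    for path in sorted(Path(directory).iterdir()):
        if path.suffix.lower() in IMAGE_EXTENSIONS and path not in seen:
            seen.add(path)
            new_files.append(path)
    return new_files

def watch(directory, handler, poll_seconds=2.0):
    """Poll `directory` forever, calling `handler` on each new image."""
    seen = set()
    while True:
        for path in find_new_images(directory, seen):
            handler(path)
        time.sleep(poll_seconds)
```

A polling loop like this avoids extra dependencies; a production version might use a library such as watchdog to react to filesystem events instead.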

Understanding the architecture

When you drop an invoice photo into a watched directory, the tool runs a chain of two Liquid Foundational Models:

  1. LFM2-VL-3B extracts a raw textual description from an invoice picture.
  2. LFM2-1.2B-Extract transforms the raw textual description into a structured record, which is appended to a CSV file.
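The two-step chain can be sketched against Ollama's REST API (`/api/generate` with a base64 `images` field). The prompts and CSV columns below are illustrative assumptions, not the tool's actual code:

```python
import base64
import csv
import json
import urllib.request
from pathlib import Path

OLLAMA_URL = "http://localhost:11434/api/generate"

def ollama_generate(model, prompt, images=None):
    """Call a local Ollama model and return its response text."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    if images:
        payload["images"] = images  # base64-encoded image bytes
    req = urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["response"]

def describe_invoice(image_path):
    """Step 1: the VLM turns the invoice image into a raw textual description."""
    image_b64 = base64.b64encode(Path(image_path).read_bytes()).decode()
    return ollama_generate(
        "hf.co/LiquidAI/LFM2-VL-3B-GGUF:F16",
        "Describe every piece of text visible on this invoice.",
        images=[image_b64],
    )

def extract_record(raw_text):
    """Step 2: the extractor turns raw text into a structured record (JSON assumed)."""
    response = ollama_generate(
        "hf.co/LiquidAI/LFM2-1.2B-Extract-GGUF:F16",
        f"Extract utility, amount and currency as JSON from:\n{raw_text}",
    )
    return json.loads(response)

def append_record(csv_path, record, fieldnames=("utility", "amount", "currency")):
    """Append one structured record to the CSV, writing a header on first use."""
    is_new = not Path(csv_path).exists()
    with open(csv_path, "a", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=fieldnames)
        if is_new:
            writer.writeheader()
        writer.writerow(record)
```

Chaining the models is then just function composition: `append_record("bills.csv", extract_record(describe_invoice(path)))`.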

Environment setup

You will need

  • Ollama to serve the Language Models locally.
  • uv to manage Python dependencies and run the application efficiently without creating virtual environments manually.

Install Ollama

macOS:

# Download and install from the website
# Visit: https://ollama.ai/download

# Or use Homebrew
brew install ollama

Linux:

curl -fsSL https://ollama.ai/install.sh | sh

Windows: Download the installer from https://ollama.ai/download

Install UV

macOS/Linux:

curl -LsSf https://astral.sh/uv/install.sh | sh

Windows:

powershell -ExecutionPolicy ByPass -c "irm https://astral.sh/uv/install.ps1 | iex"

How to run it?

Let's start by cloning the repository:

git clone https://github.com/Liquid4All/cookbook.git
cd cookbook/examples/invoice-parser

Then, run the application using the invoices that are already in the repository:

uv run python src/invoice_parser/main.py \
--dir invoices/ \
--image-model hf.co/LiquidAI/LFM2-VL-3B-GGUF:F16 \
--extractor-model hf.co/LiquidAI/LFM2-1.2B-Extract-GGUF:F16 \
--process-existing

Feel free to modify the path to the invoices directory and the model IDs to suit your needs.

Example: Using different model sizes

For example, you can use the 1.6B version of the vision-language model and the 350M version of the extractor model as follows:

uv run python src/invoice_parser/main.py \
--dir invoices/ \
--image-model hf.co/LiquidAI/LFM2-VL-1.6B-GGUF:F16 \
--extractor-model hf.co/LiquidAI/LFM2-350M-Extract-GGUF:F16 \
--process-existing

If you have make installed, you can run the application with the following command:

make run

The data extracted from the invoices is saved in the same directory as the invoices, in a file called bills.csv. If you open the file, you will see the following data:

| processed_at | file_path | utility | amount | currency |
| --- | --- | --- | --- | --- |
| 2025-10-31 11:25:47 | invoices/water_australia.png | electricity | 68.46 | AUD |
| 2025-10-31 11:26:00 | invoices/Sample-electric-Bill-2023.jpg | electricity | 28.32 | USD |
| 2025-10-31 11:26:09 | invoices/british_gas.png | electricity | 81.31 | GBP |
| 2025-10-31 11:42:35 | invoices/castlewater1.png | electricity | 150.0 | USD |
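Once processed, bills.csv can be inspected programmatically with the standard library (column names taken from the table above):

```python
import csv

def load_bills(csv_path):
    """Read bills.csv into a list of dicts, converting amounts to floats."""
    with open(csv_path, newline="") as f:
        rows = list(csv.DictReader(f))
    for row in rows:
        row["amount"] = float(row["amount"])
    return rows
```

This makes it easy to compute totals per currency or spot outliers like the fourth row.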

Observations:

  • The first three invoices are properly extracted, with the correct amount and currency.
  • The fourth invoice is not properly extracted: both the amount and the currency are incorrect.

How to improve it?

We have a tool that works well 75% of the time on our sample of invoices, which is

  • good enough for a demo
  • not good enough for a production-ready application

To improve the tool you need to:

  • Collect more invoices
  • Collect model input and outputs at each step of the pipeline, including
    • the VLM model input and output
    • the extractor model input and output
  • Flag and correct (input, output) pairs that are not properly extracted
  • Fine-tune the model(s) on the corrected (input, output) pairs
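Collecting the (input, output) pairs above can be as simple as appending one JSON line per pipeline step. This is a sketch under assumed field names, not the tool's actual logging format:

```python
import json
import time

def log_step(log_path, step, model, model_input, model_output):
    """Append one pipeline step as a JSON line for later review and correction."""
    entry = {
        "timestamp": time.strftime("%Y-%m-%d %H:%M:%S"),
        "step": step,              # e.g. "vlm" or "extractor"
        "model": model,
        "input": model_input,
        "output": model_output,
        "corrected_output": None,  # filled in later by a human reviewer
    }
    with open(log_path, "a") as f:
        f.write(json.dumps(entry) + "\n")

def load_flagged(log_path):
    """Return entries whose output was corrected, ready for fine-tuning."""
    with open(log_path) as f:
        entries = [json.loads(line) for line in f]
    return [e for e in entries if e["corrected_output"] is not None]
```

The corrected entries form exactly the (input, output) pairs you need for fine-tuning.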
tip

Our tool uses two Liquid Foundational Models:

  • LFM2-VL-3B for vision-language understanding
  • LFM2-1.2B-Extract for information extraction

LFM2-1.2B-Extract is a highly specialized model for structured data extraction from text, so the problem is unlikely to be there. LFM2-VL-3B, on the other hand, is a more general-purpose vision-language model that has not necessarily been trained on invoice extraction. This is the first place to look for improvements.

If you are interested in learning more about model customization for vision-language models, we recommend checking out the following example:

🚗
Car Maker Identification

Fine-tune LFM2-VL to identify car makers from images. Learn structured generation with Outlines and parameter-efficient fine-tuning with LoRA.

Need help?

Join the Liquid AI Discord Community and ask.
