# Fal

> Fal is a serverless generative media platform that provides fast inference for image, video, and audio generation models.

<Tip>
  Use Fal for serverless cloud deployments with fast inference, autoscaling, and easy API access.
</Tip>

## Clone the repository

```bash
git clone https://github.com/Liquid4All/lfm-inference
```

## Deployment

```bash
# From the directory where you cloned the repository
cd lfm-inference/fal

# Run a one-off server
fal run deploy-lfm2.py::serve

# Deploy a production server
fal deploy deploy-lfm2.py::serve --app-name lfm2-8b --auth private
```

The first run takes longer because it has to download the Docker image and the model weights.

## Test call

First, create an API key in the [Fal dashboard](https://fal.ai/dashboard/keys).

Then run the following cURL commands, replacing `<org-id>` and `<app-id>` with the values for your deployment:

```bash
export FAL_API_KEY="<your-fal-api-key>"

# List the deployed models
curl "https://fal.run/<org-id>/<app-id>/v1/models" -H "Authorization: Key $FAL_API_KEY"

# Query the deployed LFM model
curl -X POST "https://fal.run/<org-id>/<app-id>/v1/chat/completions" \
  -H "Authorization: Key $FAL_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "LiquidAI/LFM2-8B-A1B",
    "messages": [
      {
        "role": "user",
        "content": "What is the melting temperature of silver?"
      }
    ],
    "max_tokens": 32,
    "temperature": 0
  }'
```

<Note>
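  Fal endpoints expect the `Key` prefix in the `Authorization` header (`Authorization: Key <your-fal-api-key>`), not the `Bearer` prefix that OpenAI-style clients send by default.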
</Note>
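
The same chat request can be sent from Python. The following is a minimal sketch using only the standard library, not an official client. The helper names (`build_chat_request`, `extract_reply`, `chat`) are hypothetical. It also assumes the response follows the OpenAI-style chat completions shape (`choices[0].message.content`), which the `/v1/chat/completions` path suggests but this page does not confirm.

```python
import json
import urllib.request

FAL_BASE_URL = "https://fal.run"


def build_chat_request(org_id, app_id, api_key, prompt,
                       model="LiquidAI/LFM2-8B-A1B",
                       max_tokens=32, temperature=0):
    """Build (but do not send) the POST request for the chat completions route."""
    url = f"{FAL_BASE_URL}/{org_id}/{app_id}/v1/chat/completions"
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
        "temperature": temperature,
    }
    headers = {
        # Fal expects the "Key" prefix here, not "Bearer".
        "Authorization": f"Key {api_key}",
        "Content-Type": "application/json",
    }
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers=headers,
        method="POST",
    )


def extract_reply(response_body):
    """Return the assistant text, assuming an OpenAI-style response body."""
    return response_body["choices"][0]["message"]["content"]


def chat(org_id, app_id, api_key, prompt):
    """Send the request and return the assistant's reply text."""
    request = build_chat_request(org_id, app_id, api_key, prompt)
    with urllib.request.urlopen(request, timeout=60) as response:
        return extract_reply(json.load(response))
```

To try it, call `chat("<org-id>", "<app-id>", os.environ["FAL_API_KEY"], "What is the melting temperature of silver?")` after `import os`, substituting your own identifiers. Building the request separately from sending it lets you inspect the URL and headers without making a network call.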
