# OpenAI-Compatible Client

> Lightweight client for OpenAI-compatible chat completions APIs — ideal for hybrid on-device + cloud routing.

`LeapOpenAIClient` / `leap-openai-client` (introduced in v0.10.0) is a small, dependency-light client for any OpenAI-compatible chat-completions endpoint — OpenAI itself, OpenRouter, vLLM, llama-server, or your own proxy. It ships in the same SDK release as `LeapSDK`, so you can route requests between an on-device LFM and a cloud model from a single app.

## When to use it

* **Hybrid on-device + cloud routing.** Run small, fast models on-device with `LeapSDK` and fall back to a larger cloud model for hard prompts.
* **Standardised cloud API.** Talk to any OpenAI-compatible backend without pulling in a heavier OpenAI SDK.
* **Streaming only.** SSE streaming is the only mode; non-streaming requests aren't exposed. `streamChatCompletion(...)` forces `stream = true` on the outgoing request regardless of the `stream` field on the `ChatCompletionRequest` you pass in.

## Add the dependency

<Tabs>
  <Tab title="iOS / macOS (SPM)">
    Add the `LeapOpenAIClient` product to your target. See the [Quick Start](./quick-start#2-install-the-sdk) for the full SPM setup.

    ```swift theme={"theme":{"light":"github-light","dark":"github-dark"}}
    dependencies: [
        .package(url: "https://github.com/Liquid4All/leap-sdk.git", from: "0.10.7")
    ]

    targets: [
        .target(
            name: "YourApp",
            dependencies: [
                .product(name: "LeapOpenAIClient", package: "leap-sdk"),
            ]
        )
    ]
    ```

    In Swift sources, `import LeapOpenAIClient`. The Darwin (URLSession) Ktor engine is bundled — no extra HTTP setup needed.
  </Tab>

  <Tab title="Android (Gradle)">
    ```kotlin theme={"theme":{"light":"github-light","dark":"github-dark"}}
    dependencies {
        implementation("ai.liquid.leap:leap-sdk:0.10.7")
        implementation("ai.liquid.leap:leap-openai-client:0.10.7")
    }
    ```

    Bundles an OkHttp-engine Ktor client. No extra HTTP setup needed.
  </Tab>

  <Tab title="JVM (Gradle)">
    ```kotlin theme={"theme":{"light":"github-light","dark":"github-dark"}}
    dependencies {
        implementation("ai.liquid.leap:leap-sdk:0.10.7")
        implementation("ai.liquid.leap:leap-openai-client:0.10.7")
    }
    ```

    JVM support landed in v0.10.7; the `jvm` slice was absent from the v0.10.0–v0.10.6 releases. Pure-Maven JVM projects should depend on the `-jvm` artifact directly: `ai.liquid.leap:leap-openai-client-jvm:0.10.7`. Bundles the CIO Ktor engine.
  </Tab>

  <Tab title="Kotlin/Native (Gradle)">
    ```kotlin theme={"theme":{"light":"github-light","dark":"github-dark"}}
    dependencies {
        implementation("ai.liquid.leap:leap-sdk:0.10.7")
        implementation("ai.liquid.leap:leap-openai-client:0.10.7")
    }
    ```

    Native targets are `linuxX64`, `linuxArm64`, and `mingwX64` (Windows native). The same coordinates also cover `wasmJs` (browser, via the Ktor Js engine, added in v0.10.7).
  </Tab>
</Tabs>

## Basic usage

<Tabs>
  <Tab title="Swift (iOS / macOS)">
    <Warning>
      The `leap-sdk-openai-client` Kotlin module does **not** apply the SKIE plugin in v0.10.7 (only `leap-sdk`, `leap-sdk-model-downloader`, and `leap-ui` do). That means `Flow<ChatCompletionEvent>` is **not** bridged to a Swift `AsyncSequence` and the `onEnum(of:)` helper is **not** generated for `ChatCompletionEvent`. Swift consumers on v0.10.7 must collect the Kotlin `Flow` through its native collector and downcast each event with `as?`. For most Swift apps that just need cloud chat completions, an off-the-shelf OpenAI Swift client is more ergonomic — use `LeapOpenAIClient` from Swift only if you need to share Kotlin code with Android.

      **Coming in the next release:** SKIE will be enabled on `leap-sdk-openai-client`, adding the same Swift-friendly surface as `LeapSDK` — `for try await event in client.streamChatCompletion(...)`, `onEnum(of: event)` exhaustive switching, and nested-class Swift names (`ChatCompletionEvent.Delta` instead of the current flattened `ChatCompletionEventDelta`). Swift convenience inits and builders for `OpenAiClientConfig` are also planned. Pin to v0.10.7 if you need the current behavior frozen; otherwise expect the more ergonomic surface to land soon.
    </Warning>

    Manual collection pattern (the `Flow<ChatCompletionEvent>.collect(...)` shape varies by Kotlin/Native version — check the framework header in your Xcode build for the exact label):

    ```swift theme={"theme":{"light":"github-light","dark":"github-dark"}}
    import LeapOpenAIClient

    // The Kotlin top-level `fun OpenAiClient(config: OpenAiClientConfig)` exports as
    // `OpenAiClientKt.OpenAiClient(config:)` (PascalCase preserved from the Kotlin
    // function name). Without SKIE the K/N export also flattens Kotlin's nested
    // class names — `ChatMessage.User` → `ChatMessageUser`,
    // `ChatCompletionEvent.Delta` → `ChatCompletionEventDelta`, etc.
    let client = OpenAiClientKt.OpenAiClient(
        config: OpenAiClientConfig(
            apiKey: "sk-…",
            baseUrl: "https://api.openai.com/v1"
        )
    )

    let request = ChatCompletionRequest(
        model: "gpt-4o-mini",
        messages: [
            ChatMessageSystem(content: "You are a helpful assistant."),
            ChatMessageUser(content: "What is the capital of Japan?")
        ],
        temperature: 0.7
    )

    // Pseudocode — actual collector signature depends on your Kotlin/Native version
    // and framework headers. Without SKIE, there is no `for try await` integration.
    try await client.streamChatCompletion(request: request).collect(
        collector: FlowCollector { event in
            if let delta = event as? ChatCompletionEventDelta {
                print(delta.content, terminator: "")
            } else if let done = event as? ChatCompletionEventDone {
                if let usage = done.usage { print("\nTokens: \(usage.totalTokens)") }
            } else if let err = event as? ChatCompletionEventError {
                print("\nError: \(err.message)")
            }
        }
    )

    client.close()  // closes the underlying URLSession-backed HttpClient
    ```
  </Tab>

  <Tab title="Kotlin (all platforms)">
    ```kotlin theme={"theme":{"light":"github-light","dark":"github-dark"}}
    import ai.liquid.leap.openai.ChatCompletionEvent
    import ai.liquid.leap.openai.ChatCompletionRequest
    import ai.liquid.leap.openai.ChatMessage
    import ai.liquid.leap.openai.OpenAiClient
    import ai.liquid.leap.openai.OpenAiClientConfig

    val client = OpenAiClient(
        config = OpenAiClientConfig(
            apiKey = "sk-…",
            baseUrl = "https://api.openai.com/v1",
        )
    )

    val request = ChatCompletionRequest(
        model = "gpt-4o-mini",
        messages = listOf(
            ChatMessage.System("You are a helpful assistant."),
            ChatMessage.User("What is the capital of Japan?"),
        ),
        temperature = 0.7,
    )

    client.streamChatCompletion(request).collect { event ->
        when (event) {
            is ChatCompletionEvent.Delta -> print(event.content)
            is ChatCompletionEvent.Done  -> event.usage?.let { println("\nTokens: ${it.totalTokens}") }
            is ChatCompletionEvent.Error -> println("\nError: ${event.message}")
        }
    }

    client.close()
    ```
  </Tab>
</Tabs>

## Configuration

`OpenAiClientConfig` is a Kotlin data class bridged identically on every platform.

```kotlin theme={"theme":{"light":"github-light","dark":"github-dark"}}
data class OpenAiClientConfig(
    val apiKey: String,
    val baseUrl: String = "https://api.openai.com/v1",
    val chatCompletionsPath: String = "/chat/completions",
    val extraHeaders: Map<String, String> = emptyMap(),
)
```

| Field                 | Default                     | Notes                                                         |
| --------------------- | --------------------------- | ------------------------------------------------------------- |
| `apiKey`              | — (required)                | Sent as `Authorization: Bearer <apiKey>`.                     |
| `baseUrl`             | `https://api.openai.com/v1` | Override for OpenRouter, a self-hosted backend, etc.          |
| `chatCompletionsPath` | `/chat/completions`         | Appended to `baseUrl`.                                        |
| `extraHeaders`        | `{}`                        | Merged into every request — e.g. OpenRouter's `HTTP-Referer`. |
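
As a rough illustration of how the four fields combine, the sketch below configures the client for a hypothetical internal gateway. The host `gateway.example.com`, the `/v1/chat/completions` path, the `X-Tenant-Id` header, and the `GATEWAY_API_KEY` environment variable are all placeholders rather than anything the SDK defines, and `System.getenv` assumes a JVM target.

```kotlin
import ai.liquid.leap.openai.OpenAiClient
import ai.liquid.leap.openai.OpenAiClientConfig

// All values below are placeholders for a hypothetical gateway.
val config = OpenAiClientConfig(
    // Read the key from the environment instead of hard-coding it (JVM only).
    apiKey = System.getenv("GATEWAY_API_KEY") ?: error("GATEWAY_API_KEY is not set"),
    baseUrl = "https://gateway.example.com/llm",
    // Appended to baseUrl, so requests are expected to go to
    // https://gateway.example.com/llm/v1/chat/completions
    chatCompletionsPath = "/v1/chat/completions",
    // Sent with every request alongside the Authorization header.
    extraHeaders = mapOf("X-Tenant-Id" to "demo"),
)

val client = OpenAiClient(config)
```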

### OpenRouter

<Tabs>
  <Tab title="Swift (iOS / macOS)">
    ```swift theme={"theme":{"light":"github-light","dark":"github-dark"}}
    // The leap-sdk-openai-client module has no SKIE plugin applied, so the
    // top-level Kotlin `fun OpenAiClient(config:)` factory is exported as
    // `OpenAiClientKt.OpenAiClient(config:)`. See the warning under
    // "Basic usage" for the full reasoning.
    let client = OpenAiClientKt.OpenAiClient(
        config: OpenAiClientConfig(
            apiKey: "sk-or-…",
            baseUrl: "https://openrouter.ai/api/v1",
            extraHeaders: [
                "HTTP-Referer": "https://yourapp.example.com",
                "X-Title": "Your App"
            ]
        )
    )
    ```
  </Tab>

  <Tab title="Kotlin (all platforms)">
    ```kotlin theme={"theme":{"light":"github-light","dark":"github-dark"}}
    val client = OpenAiClient(
        OpenAiClientConfig(
            apiKey = "sk-or-…",
            baseUrl = "https://openrouter.ai/api/v1",
            extraHeaders = mapOf(
                "HTTP-Referer" to "https://yourapp.example.com",
                "X-Title" to "Your App",
            ),
        )
    )
    ```
  </Tab>
</Tabs>

### Self-hosted vLLM / llama-server

<Tabs>
  <Tab title="Swift (iOS / macOS)">
    ```swift theme={"theme":{"light":"github-light","dark":"github-dark"}}
    let client = OpenAiClientKt.OpenAiClient(
        config: OpenAiClientConfig(
            apiKey: "anything",  // Required by config but typically unused
            baseUrl: "http://10.0.0.42:8000/v1"
        )
    )
    ```
  </Tab>

  <Tab title="Kotlin (all platforms)">
    ```kotlin theme={"theme":{"light":"github-light","dark":"github-dark"}}
    val client = OpenAiClient(
        OpenAiClientConfig(
            apiKey = "anything",  // Required by config but typically unused
            baseUrl = "http://10.0.0.42:8000/v1",
        )
    )
    ```
  </Tab>
</Tabs>

## Request shape

`ChatCompletionRequest` covers standard OpenAI fields plus a few OpenRouter-specific extensions. OpenRouter-only fields are silently ignored by stock OpenAI-compatible APIs.

```kotlin theme={"theme":{"light":"github-light","dark":"github-dark"}}
data class ChatCompletionRequest(
    val model: String,
    val messages: List<ChatMessage>,
    val temperature: Double? = null,
    val topP: Double? = null,
    val maxCompletionTokens: Int? = null,   // Preferred for newer OpenAI versions
    val maxTokens: Int? = null,             // Legacy alias — some custom backends still require it
    val frequencyPenalty: Double? = null,
    val presencePenalty: Double? = null,
    val stop: List<String>? = null,
    val stream: Boolean = true,
    // OpenRouter extensions:
    val topK: Int? = null,
    val repetitionPenalty: Double? = null,
    val minP: Double? = null,
    val topA: Double? = null,
    val transforms: List<String>? = null,
    val models: List<String>? = null,
    val route: String? = null,
    val provider: ProviderPreferences? = null,
)
```

`ChatMessage` (the OpenAI-client one, distinct from `LeapSDK.ChatMessage`) is a sealed type with three cases — `System`, `User`, `Assistant`.
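
Chat-completions endpoints are stateless, so a multi-turn chat is usually modelled by replaying earlier turns in `messages`. The sketch below assumes `ChatMessage.Assistant` takes its content as a single string, like the `System` and `User` constructors shown in Basic usage; `buildFollowUp` is a hypothetical helper, not part of the SDK.

```kotlin
import ai.liquid.leap.openai.ChatCompletionRequest
import ai.liquid.leap.openai.ChatMessage

// Hypothetical helper: appends the next user turn to the prior history.
fun buildFollowUp(history: List<ChatMessage>, userText: String): ChatCompletionRequest =
    ChatCompletionRequest(
        model = "gpt-4o-mini",
        messages = history + ChatMessage.User(userText),
    )

val history: List<ChatMessage> = listOf(
    ChatMessage.System("You are a helpful assistant."),
    ChatMessage.User("What is the capital of Japan?"),
    // Assumed constructor shape: Assistant(content: String).
    ChatMessage.Assistant("The capital of Japan is Tokyo."),
)

// Four messages: the three prior turns plus the new user question.
val request = buildFollowUp(history, "How many people live there?")
```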

## Response shape

`streamChatCompletion(request)` returns a `Flow<ChatCompletionEvent>` in Kotlin. In v0.10.7 the same `Flow` is exposed to Swift unchanged: SKIE is not yet applied to this module, so the flow is not bridged to a Swift `AsyncSequence` and must be collected through the native `Flow.collect(...)` shape shown in Basic usage. The flow emits three event variants:

| Variant                  | Meaning                                                                    |
| ------------------------ | -------------------------------------------------------------------------- |
| `Delta(content: String)` | Text chunk from the model. May be empty for role-only deltas.              |
| `Done(usage: Usage?)`    | Stream finished. `usage` is non-`null` when the API includes token counts. |
| `Error(message: String)` | HTTP error or stream parsing failure.                                      |

```kotlin theme={"theme":{"light":"github-light","dark":"github-dark"}}
data class Usage(val promptTokens: Int, val completionTokens: Int, val totalTokens: Int)
```
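
When the UI only needs the final text, the three variants can be folded into a single value. The following is a minimal sketch under that assumption: `Reply` and `collectReply` are hypothetical names, not SDK types, and the code relies only on the `content`, `usage`, `totalTokens`, and `message` properties listed above.

```kotlin
import ai.liquid.leap.openai.ChatCompletionEvent
import kotlinx.coroutines.flow.Flow

// Hypothetical result holder, not an SDK type.
data class Reply(val text: String, val totalTokens: Int?, val error: String?)

// Folds the event stream into one value. Text received before an Error
// event is kept, so callers can decide whether to show a partial answer.
suspend fun Flow<ChatCompletionEvent>.collectReply(): Reply {
    val text = StringBuilder()
    var totalTokens: Int? = null
    var error: String? = null
    collect { event ->
        when (event) {
            is ChatCompletionEvent.Delta -> text.append(event.content)
            is ChatCompletionEvent.Done -> totalTokens = event.usage?.totalTokens
            is ChatCompletionEvent.Error -> error = event.message
        }
    }
    return Reply(text.toString(), totalTokens, error)
}
```

A call site would then look like `val reply = client.streamChatCompletion(request).collectReply()`, with `reply.error` checked before `reply.text` is displayed.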

## Hybrid routing example

Route simple prompts to a small on-device LFM; escalate harder prompts to a cloud model.

<Tabs>
  <Tab title="Swift (iOS / macOS)">
    ```swift theme={"theme":{"light":"github-light","dark":"github-dark"}}
    import Combine
    import LeapModelDownloader
    import LeapOpenAIClient

    @MainActor
    final class HybridChatViewModel: ObservableObject {
        private let onDevice: Conversation
        private let cloud: OpenAiClient

        init(onDevice: Conversation, cloud: OpenAiClient) {
            self.onDevice = onDevice
            self.cloud = cloud
        }

        func send(_ text: String, useCloud: Bool) async throws {
            if useCloud {
                // Cloud path: leap-sdk-openai-client has no SKIE, so collect the Kotlin
                // Flow manually and downcast each event with `as?`. Note the flattened
                // Swift type names (`ChatMessageUser`, `ChatCompletionEventDelta`).
                // As in Basic usage, the `collect(collector:)` call is pseudocode: check
                // your framework headers for the exact signature.
                let request = ChatCompletionRequest(
                    model: "gpt-4o-mini",
                    messages: [ChatMessageUser(content: text)]
                )
                try await cloud.streamChatCompletion(request: request).collect(
                    collector: FlowCollector { event in
                        if let delta = event as? ChatCompletionEventDelta {
                            self.appendChunk(delta.content)
                        }
                    }
                )
            } else {
                // On-device path: leap-sdk has SKIE — `for try await` + `onEnum(of:)`
                // work as written.
                let userMessage = ChatMessage(role: .user, textContent: text)
                for try await response in onDevice.generateResponse(message: userMessage) {
                    if case let .chunk(c) = onEnum(of: response) { appendChunk(c.text) }
                }
            }
        }

        private func appendChunk(_ text: String) { /* … */ }

        deinit { cloud.close() }
    }
    ```
  </Tab>

  <Tab title="Kotlin (Android)">
    ```kotlin theme={"theme":{"light":"github-light","dark":"github-dark"}}
    import ai.liquid.leap.Conversation
    import ai.liquid.leap.message.MessageResponse
    import ai.liquid.leap.openai.ChatCompletionEvent
    import ai.liquid.leap.openai.ChatCompletionRequest
    import ai.liquid.leap.openai.ChatMessage as CloudChatMessage
    import ai.liquid.leap.openai.OpenAiClient
    import ai.liquid.leap.message.ChatMessage
    import ai.liquid.leap.message.ChatMessageContent
    import androidx.lifecycle.ViewModel
    import androidx.lifecycle.viewModelScope
    import kotlinx.coroutines.launch

    class HybridChatViewModel(
        private val onDevice: Conversation,
        private val cloud: OpenAiClient,
    ) : ViewModel() {

        fun send(text: String, useCloud: Boolean) = viewModelScope.launch {
            if (useCloud) {
                val request = ChatCompletionRequest(
                    model = "gpt-4o-mini",
                    messages = listOf(CloudChatMessage.User(text)),
                )
                cloud.streamChatCompletion(request).collect { event ->
                    if (event is ChatCompletionEvent.Delta) appendChunk(event.content)
                }
            } else {
                val message = ChatMessage(
                    role = ChatMessage.Role.USER,
                    content = listOf(ChatMessageContent.Text(text)),
                )
                onDevice.generateResponse(message).collect { resp ->
                    if (resp is MessageResponse.Chunk) appendChunk(resp.text)
                }
            }
        }

        private fun appendChunk(text: String) { /* … */ }

        override fun onCleared() {
            super.onCleared()
            cloud.close()
        }
    }
    ```
  </Tab>

  <Tab title="Kotlin (JVM / native)">
    ```kotlin theme={"theme":{"light":"github-light","dark":"github-dark"}}
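    import ai.liquid.leap.Conversation
    import ai.liquid.leap.message.MessageResponse
    import ai.liquid.leap.openai.ChatCompletionEvent
    import ai.liquid.leap.openai.ChatCompletionRequest
    import ai.liquid.leap.openai.ChatMessage as CloudChatMessage
    import ai.liquid.leap.openai.OpenAiClient

    suspend fun hybridSend(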
        onDevice: Conversation,
        cloud: OpenAiClient,
        text: String,
        useCloud: Boolean,
    ) {
        if (useCloud) {
            val request = ChatCompletionRequest(
                model = "gpt-4o-mini",
                messages = listOf(CloudChatMessage.User(text)),
            )
            cloud.streamChatCompletion(request).collect { event ->
                if (event is ChatCompletionEvent.Delta) print(event.content)
            }
        } else {
            onDevice.generateResponse(text).collect { resp ->
                if (resp is MessageResponse.Chunk) print(resp.text)
            }
        }
    }
    ```
  </Tab>
</Tabs>
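
The examples above take `useCloud` as a given. One way to derive it is a cheap local heuristic that runs before any model is called. The sketch below is purely illustrative: `RoutingPolicy`, `shouldUseCloud`, the 400-character threshold, and the keyword list are all hypothetical choices, not SDK features or recommended values.

```kotlin
// Hypothetical routing policy; threshold and keywords are illustrative only.
data class RoutingPolicy(
    val maxOnDeviceChars: Int = 400,
    val cloudKeywords: Set<String> = setOf("analyze", "summarize", "translate"),
)

// Returns true when the prompt should be escalated to the cloud model:
// either it is longer than the on-device budget or it contains a keyword
// that hints at a heavier task.
fun shouldUseCloud(prompt: String, policy: RoutingPolicy = RoutingPolicy()): Boolean {
    if (prompt.length > policy.maxOnDeviceChars) return true
    val words = prompt.lowercase().split(Regex("\\W+"))
    return words.any { it in policy.cloudKeywords }
}

fun main() {
    println(shouldUseCloud("What is the capital of Japan?"))            // false
    println(shouldUseCloud("Summarize this contract in three bullets")) // true
}
```

The result can be passed straight into `send(text, useCloud = shouldUseCloud(text))`. A production router would more likely weigh connectivity, latency budgets, and user privacy settings than keywords alone.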

See [Cloud AI Comparison](./cloud-ai-comparison) for a side-by-side feature breakdown.

## Lifecycle

The platform `OpenAiClient(config:)` factory (Kotlin `fun OpenAiClient(config:)` → Swift `OpenAiClientKt.OpenAiClient(config:)`) creates an `HttpClient` internally and ties it to the returned client — call `close()` when you're done.

<Tabs>
  <Tab title="Swift (iOS / macOS)">
    ```swift theme={"theme":{"light":"github-light","dark":"github-dark"}}
    deinit { client.close() }
    ```

    The lower-level constructor that accepts an externally-managed `HttpClient` is part of the Kotlin/Ktor surface and isn't a useful entry point from Swift — the Ktor engine machinery isn't bridged into the public Swift API. Use `OpenAiClientKt.OpenAiClient(config:)` and let the SDK own the session. If multiple consumers share a client, share the `OpenAiClient` instance and `close()` once at teardown.
  </Tab>

  <Tab title="Kotlin (all platforms)">
    ```kotlin theme={"theme":{"light":"github-light","dark":"github-dark"}}
    override fun onCleared() {
        super.onCleared()
        client.close()
    }
    ```

    If you need to share an `HttpClient` across multiple clients (for example, you already manage one for other Ktor-based code), use the lower-level constructor that takes a pre-built `HttpClient`. You then own its lifetime and shouldn't call `close()` on the `OpenAiClient`:

    ```kotlin theme={"theme":{"light":"github-light","dark":"github-dark"}}
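    import io.ktor.client.HttpClient
    import io.ktor.client.engine.okhttp.OkHttp  // JVM/Android engine; use your target's engine elsewhere

    val shared = HttpClient(OkHttp)  // your own instance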
    val client = OpenAiClient(config = config, httpClient = shared)
    // Don't call client.close() — you own `shared` and decide when it dies
    ```
  </Tab>
</Tabs>
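
For short-lived work such as a one-off script or a background job, the create/use/close sequence can be wrapped in a scoped helper so that `close()` is never skipped. This is a sketch only: `withOpenAiClient` and `printReply` are hypothetical names, and the code uses plain `try`/`finally` because this page does not state whether `OpenAiClient` implements `Closeable`.

```kotlin
import ai.liquid.leap.openai.ChatCompletionEvent
import ai.liquid.leap.openai.ChatCompletionRequest
import ai.liquid.leap.openai.OpenAiClient
import ai.liquid.leap.openai.OpenAiClientConfig

// Hypothetical helper: creates a client, runs `block`, and always closes the client.
suspend fun <T> withOpenAiClient(
    config: OpenAiClientConfig,
    block: suspend (OpenAiClient) -> T,
): T {
    val client = OpenAiClient(config)
    return try {
        block(client)
    } finally {
        client.close()  // runs on success, failure, and coroutine cancellation
    }
}

// Example use: stream one reply to stdout, then release the HTTP resources.
suspend fun printReply(config: OpenAiClientConfig, request: ChatCompletionRequest) =
    withOpenAiClient(config) { client ->
        client.streamChatCompletion(request).collect { event ->
            if (event is ChatCompletionEvent.Delta) print(event.content)
        }
    }
```

Long-lived owners such as a `ViewModel` are still better served by the `onCleared()` pattern above, since creating a client per request discards the underlying connection pool each time.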