

LeapOpenAIClient / leap-openai-client (introduced in v0.10.0) is a small, dependency-light client for any OpenAI-compatible chat-completions endpoint: OpenAI itself, OpenRouter, vLLM, llama-server, or your own proxy. It ships in the same SDK release as LeapSDK, so you can route requests between an on-device LFM and a cloud model from a single app.

When to use it

  • Hybrid on-device + cloud routing. Run small, fast models on-device with LeapSDK and fall back to a larger cloud model for hard prompts.
  • Standardised cloud API. Talk to any OpenAI-compatible backend without pulling in a heavier OpenAI SDK.
  • Streaming first. SSE streaming is the only mode; non-streaming requests aren't exposed (stream = true is the default).
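The hybrid bullet above implies some policy for deciding which model handles a given prompt. A minimal sketch, using a hypothetical length-and-keyword heuristic (the threshold, keywords, and `Route`/`route(for:)` names are illustrative, not part of the SDK):

```swift
// Hypothetical routing policy; nothing in LeapOpenAIClient mandates this shape.
enum Route { case onDevice, cloud }

func route(for prompt: String) -> Route {
    // Illustrative thresholds; tune against your own models and latency budget.
    let hardKeywords = ["analyze", "compare", "summarize"]
    let looksHard = prompt.count > 500
        || hardKeywords.contains { prompt.lowercased().contains($0) }
    return looksHard ? .cloud : .onDevice
}
```

Anything routed to `.cloud` then goes through the client described below; `.onDevice` prompts stay on a LeapSDK `Conversation`.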

Add the dependency

Add the LeapOpenAIClient product to your target. See the Quick Start for the full SPM setup.
dependencies: [
    .package(url: "https://github.com/Liquid4All/leap-sdk.git", from: "0.10.6")
]

targets: [
    .target(
        name: "YourApp",
        dependencies: [
            .product(name: "LeapOpenAIClient", package: "leap-sdk"),
        ]
    )
]
In Swift sources, import LeapOpenAIClient. The Darwin (URLSession) Ktor engine is bundled, so no extra HTTP setup is needed.

Basic usage

import LeapOpenAIClient

let client = OpenAiClient(
    config: OpenAiClientConfig(
        apiKey: "sk-…",
        baseUrl: "https://api.openai.com/v1"
    )
)

let request = ChatCompletionRequest(
    model: "gpt-4o-mini",
    messages: [
        ChatMessage.System(content: "You are a helpful assistant."),
        ChatMessage.User(content: "What is the capital of Japan?")
    ],
    temperature: 0.7
)

for try await event in client.streamChatCompletion(request: request) {
    switch onEnum(of: event) {
    case .delta(let d):
        print(d.content, terminator: "")
    case .done(let d):
        if let usage = d.usage {
            print("\nTokens: \(usage.totalTokens)")
        }
    case .error(let e):
        print("\nError: \(e.message)")
    }
}

client.close()  // closes the underlying URLSession-backed HttpClient
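Because streamChatCompletion returns an AsyncSequence, consumption can be wrapped in a Task and cancelled from the UI. A sketch, assuming the `client` and `request` from the basic example above (the `streamTask` handle is ours, not part of the API):

```swift
// Keep a handle so the stream can be cancelled (e.g. from a "Stop" button).
var streamTask: Task<Void, Error>?

streamTask = Task {
    for try await event in client.streamChatCompletion(request: request) {
        if case .delta(let d) = onEnum(of: event) {
            print(d.content, terminator: "")
        }
    }
}

// Later, from the UI:
streamTask?.cancel()
```

Cancelling the Task stops collection of the bridged flow; the client itself stays usable for subsequent requests until you call close().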

Configuration

OpenAiClientConfig is a Kotlin data class bridged identically on every platform.
data class OpenAiClientConfig(
    val apiKey: String,
    val baseUrl: String = "https://api.openai.com/v1",
    val chatCompletionsPath: String = "/chat/completions",
    val extraHeaders: Map<String, String> = emptyMap(),
)
  • apiKey (required): sent as Authorization: Bearer <apiKey>.
  • baseUrl (default https://api.openai.com/v1): override for OpenRouter, a self-hosted backend, etc.
  • chatCompletionsPath (default /chat/completions): appended to baseUrl.
  • extraHeaders (default empty): merged into every request, e.g. OpenRouter's HTTP-Referer.
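As one illustration of combining chatCompletionsPath and extraHeaders, consider a corporate proxy that exposes completions under a non-standard route (the host, path, and header here are made up for the example):

```swift
// Hypothetical internal gateway; only baseUrl's default path differs from stock OpenAI.
let client = OpenAiClient(
    config: OpenAiClientConfig(
        apiKey: "sk-…",
        baseUrl: "https://llm-gateway.internal.example.com",
        chatCompletionsPath: "/openai/chat/completions",  // appended to baseUrl
        extraHeaders: ["X-Team": "mobile"]                // merged into every request
    )
)
```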

OpenRouter

let client = OpenAiClient(
    config: OpenAiClientConfig(
        apiKey: "sk-or-…",
        baseUrl: "https://openrouter.ai/api/v1",
        extraHeaders: [
            "HTTP-Referer": "https://yourapp.example.com",
            "X-Title": "Your App"
        ]
    )
)

Self-hosted vLLM / llama-server

let client = OpenAiClient(
    config: OpenAiClientConfig(
        apiKey: "anything",  // Required by config but typically unused
        baseUrl: "http://10.0.0.42:8000/v1"
    )
)

Request shape

ChatCompletionRequest covers standard OpenAI fields plus a few OpenRouter-specific extensions. OpenRouter-only fields are silently ignored by stock OpenAI-compatible APIs.
data class ChatCompletionRequest(
    val model: String,
    val messages: List<ChatMessage>,
    val temperature: Double? = null,
    val topP: Double? = null,
    val maxCompletionTokens: Int? = null,   // Preferred for newer OpenAI versions
    val maxTokens: Int? = null,             // Legacy alias; some custom backends still require it
    val frequencyPenalty: Double? = null,
    val presencePenalty: Double? = null,
    val stop: List<String>? = null,
    val stream: Boolean = true,
    // OpenRouter extensions:
    val topK: Int? = null,
    val repetitionPenalty: Double? = null,
    val minP: Double? = null,
    val topA: Double? = null,
    val transforms: List<String>? = null,
    val models: List<String>? = null,
    val route: String? = null,
    val provider: ProviderPreferences? = null,
)
ChatMessage (the OpenAI-client one, distinct from LeapSDK.ChatMessage) is a sealed type with three cases: System, User, Assistant.
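For instance, a multi-turn request exercising all three cases (the model name and message text are illustrative):

```swift
// Prior assistant turns are replayed as ChatMessage.Assistant entries.
let request = ChatCompletionRequest(
    model: "gpt-4o-mini",
    messages: [
        ChatMessage.System(content: "Answer in one short sentence."),
        ChatMessage.User(content: "Name a prime above 10."),
        ChatMessage.Assistant(content: "11."),
        ChatMessage.User(content: "And the next one?")
    ],
    temperature: 0.2
)
```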

Response shape

streamChatCompletion(request) returns an AsyncSequence<ChatCompletionEvent> (Swift) / Flow<ChatCompletionEvent> (Kotlin):
  • Delta(content: String): a text chunk from the model. May be empty for role-only deltas.
  • Done(usage: Usage?): stream finished. usage is non-null when the API includes token counts.
  • Error(message: String): an HTTP error or stream-parsing failure.
data class Usage(val promptTokens: Int, val completionTokens: Int, val totalTokens: Int)
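Putting the three variants together, one way to collect a full completion and its token counts into a single value (the `collect` helper and its error mapping are ours, not part of the SDK):

```swift
import Foundation
import LeapOpenAIClient

// Accumulates streamed deltas into one string and captures usage, if present.
func collect(_ request: ChatCompletionRequest,
             using client: OpenAiClient) async throws -> (text: String, usage: Usage?) {
    var text = ""
    var usage: Usage?
    for try await event in client.streamChatCompletion(request: request) {
        switch onEnum(of: event) {
        case .delta(let d): text += d.content
        case .done(let d):  usage = d.usage
        case .error(let e):
            throw NSError(domain: "OpenAiClient", code: -1,
                          userInfo: [NSLocalizedDescriptionKey: e.message])
        }
    }
    return (text, usage)
}
```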

Hybrid routing example

Route simple prompts to a small on-device LFM; escalate harder prompts to a cloud model.
import LeapSDK
import LeapOpenAIClient

@MainActor
final class HybridChatViewModel: ObservableObject {
    private let onDevice: Conversation
    private let cloud: OpenAiClient

    init(onDevice: Conversation, cloud: OpenAiClient) {
        self.onDevice = onDevice
        self.cloud = cloud
    }

    func send(_ text: String, useCloud: Bool) async throws {
        if useCloud {
            let request = ChatCompletionRequest(
                model: "gpt-4o-mini",
                messages: [ChatMessage.User(content: text)]
            )
            for try await event in cloud.streamChatCompletion(request: request) {
                if case let .delta(d) = onEnum(of: event) { appendChunk(d.content) }
            }
        } else {
            let userMessage = LeapSDK.ChatMessage(role: .user, content: [.text(text)])
            for try await response in onDevice.generateResponse(message: userMessage) {
                if case let .chunk(c) = onEnum(of: response) { appendChunk(c.text) }
            }
        }
    }

    private func appendChunk(_ text: String) { /* … */ }

    deinit { cloud.close() }
}
See Cloud AI Comparison for a side-by-side feature breakdown.

Lifecycle

The platform OpenAiClient(config:) factory creates an HttpClient internally and ties it to the returned client; call close() when you're done.
deinit { client.close() }
The lower-level constructor that accepts an externally managed HttpClient is part of the Kotlin/Ktor surface and isn't a useful entry point from Swift, since the Ktor engine machinery isn't bridged into the public Swift API. Use OpenAiClient(config:) and let the SDK own the session. If multiple consumers need cloud access, share one OpenAiClient instance and call close() once at teardown.
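One way to realise that shared-instance pattern is an app-owned container; a sketch (the `CloudLLM` namespace and `shutdown()` hook are our own conventions, not SDK API):

```swift
import LeapOpenAIClient

// App-owned singleton; every feature reuses the same client and its HttpClient.
enum CloudLLM {
    static let client = OpenAiClient(
        config: OpenAiClientConfig(
            apiKey: "sk-…",
            baseUrl: "https://api.openai.com/v1"
        )
    )

    // Call once at app teardown (e.g. from your app delegate).
    static func shutdown() { client.close() }
}
```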