Quick Start Guide
1. Prerequisites
You should already have:
- An Android project created in Android Studio. An empty project created with the wizard is sufficient. The LEAP Android SDK is Kotlin-first, so we recommend using the SDK only from Kotlin.
- Leap Android SDK needs Kotlin Android plugin v2.2.0 or above and Android Gradle Plugin v8.12.0 or above to build. Declare it in build.gradle.kts as
plugins {
    id("com.android.application") version "8.12.0" apply false
    id("com.android.library") version "8.12.0" apply false
    id("org.jetbrains.kotlin.android") version "2.2.0" apply false
}
- A working Android device that supports arm64-v8a ABI with developer mode enabled. We recommend having 3GB+ of RAM to run the models.
- The minimal SDK requirement is API 31. Declare it in build.gradle.kts as
android {
    defaultConfig {
        minSdk = 31
        targetSdk = 36
    }
}
The SDK may crash when loading model bundles on emulators. A physical Android device is recommended.
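The device requirements above can also be checked at runtime before attempting to load a model. The following is a minimal sketch, not part of the LEAP SDK: `deviceProblems` and `MIN_RAM_BYTES` are hypothetical names, and the 3 GB threshold simply mirrors the recommendation above. On Android the arguments would typically come from `Build.SUPPORTED_ABIS`, `ActivityManager.MemoryInfo.totalMem`, and `Build.VERSION.SDK_INT`; the function takes plain values so it can be exercised without a device.

```kotlin
// Hypothetical helper, not part of the LEAP SDK.
// 3 GB, mirroring the RAM recommendation in the prerequisites.
val MIN_RAM_BYTES: Long = 3L * 1024 * 1024 * 1024

/**
 * Returns a list of human-readable problems; an empty list means the
 * device meets the stated requirements.
 *
 * On Android, the arguments would typically come from:
 *   supportedAbis -> Build.SUPPORTED_ABIS.toList()
 *   totalRamBytes -> ActivityManager.MemoryInfo.totalMem
 *   sdkInt        -> Build.VERSION.SDK_INT
 */
fun deviceProblems(
    supportedAbis: List<String>,
    totalRamBytes: Long,
    sdkInt: Int,
): List<String> {
    val problems = mutableListOf<String>()
    if ("arm64-v8a" !in supportedAbis) {
        problems += "Device does not support the arm64-v8a ABI"
    }
    if (totalRamBytes < MIN_RAM_BYTES) {
        problems += "Less than 3 GB of RAM; models may fail to load"
    }
    if (sdkInt < 31) {
        problems += "API level $sdkInt is below the minimum of 31"
    }
    return problems
}
```

Since the RAM figure is a recommendation rather than a hard limit, and the total memory a device reports is often somewhat below its nominal size, it may be reasonable to treat that entry as a warning and to loosen the threshold.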
2. Import the LeapSDK
Add the following dependency to $PROJECT_ROOT/app/build.gradle.kts:
dependencies {
    ...
    implementation("ai.liquid.leap:leap-sdk:0.7.4")
}
Then perform a project sync in Android Studio to fetch the LeapSDK artifacts.
3. Getting and Loading Models
The SDK supports two methods for loading models.
- GGUF manifests (recommended method for new projects due to superior inference performance and better default generation parameters)
- Executorch bundles (legacy)
The LEAP Edge SDK supports directly downloading LEAP models in GGUF format. Given the model name and quantization method (which you can find in the LEAP Model Library), the SDK will automatically download the necessary GGUF files along with generation parameters for optimal performance.
Loading from remote GGUF manifest (recommended)
The LeapDownloader.loadModel suspend function loads a model and returns a model runner instance for invoking the model. This function takes some time to finish as loading the model is a heavy I/O operation, but it is safe to call on the main thread. The function should be executed in a coroutine scope.
lifecycleScope.launch {
    try {
        val baseDir = File(context.filesDir, "model_files").absolutePath
        val modelDownloader = LeapDownloader(config = LeapDownloaderConfig(saveDir = baseDir))
        val modelRunner = modelDownloader.loadModel(
            modelSlug = "LFM2-1.2B",
            quantizationSlug = "Q5_K_M"
        )
        // Use modelRunner here, or store it in a property for later use.
    } catch (e: LeapModelLoadingException) {
        Log.e(TAG, "Failed to load the model. Error message: ${e.message}")
    }
}
The SDK will automatically download the required GGUF files to the device's cache and load the model with the appropriate generation parameters specified in the manifest.
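Before starting a download, it can help to confirm that the save directory exists and that the device has room for the model files. The following is a minimal sketch under stated assumptions: `prepareModelDir` is a hypothetical helper, not an SDK function, and `requiredBytes` is a value you supply yourself, since the size of a given download is not stated here. It uses only `java.io.File`.

```kotlin
import java.io.File

/**
 * Hypothetical helper, not part of the LEAP SDK.
 * Creates [parent]/[name] if needed and verifies that at least
 * [requiredBytes] of space is usable there. Returns the directory.
 */
fun prepareModelDir(parent: File, name: String, requiredBytes: Long): File {
    val dir = File(parent, name)
    // mkdirs() returns false when the directory already exists,
    // so only treat it as a failure if the directory is still missing.
    if (!dir.isDirectory && !dir.mkdirs()) {
        throw IllegalStateException("Could not create ${dir.absolutePath}")
    }
    val usable = dir.usableSpace
    check(usable >= requiredBytes) {
        "Only $usable bytes free in ${dir.absolutePath}, need $requiredBytes"
    }
    return dir
}
```

In an app, the `absolutePath` of the returned directory could be passed as `saveDir`, with `context.filesDir` as the parent and "model_files" as the name, matching the snippet above.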
Browse the Leap Model Library to find and download a model bundle that matches your needs.
Download and transfer bundle (legacy)
Push the bundle file to the device using adb push. Assuming the downloaded model file is located at ~/Downloads/model.bundle, run the following commands:
adb shell mkdir -p /data/local/tmp/leap/
adb push ~/Downloads/model.bundle /data/local/tmp/leap/model.bundle
Loading from local bundle file (legacy)
The LeapClient.loadModel suspend function loads a model bundle file and returns a model runner instance for invoking the model. This function takes some time to finish as loading the model is a heavy I/O operation, but it is safe to call on the main thread. The function should be executed in a coroutine scope.
lifecycleScope.launch {
    try {
        modelRunner = LeapClient.loadModel("/data/local/tmp/leap/model.bundle")
    } catch (e: LeapModelLoadingException) {
        Log.e(TAG, "Failed to load the model. Error message: ${e.message}")
    }
}
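When loading fails, a common first question is whether the bundle actually reached the device. The following is a minimal sketch of a pre-check that could run before the load call; `bundleProblem` is a hypothetical helper, not part of the SDK, and it only inspects the file system with `java.io.File`. It does not validate the bundle's contents.

```kotlin
import java.io.File

/**
 * Hypothetical helper, not part of the LEAP SDK.
 * Returns null if [path] points to a readable, non-empty file,
 * otherwise a short description of the problem.
 */
fun bundleProblem(path: String): String? {
    val file = File(path)
    return when {
        !file.exists() -> "No file at $path (was adb push run?)"
        !file.isFile -> "$path is not a regular file"
        !file.canRead() -> "$path exists but is not readable by this process"
        file.length() == 0L -> "$path is empty (the transfer may have failed)"
        else -> null
    }
}
```

Logging the returned message alongside the `LeapModelLoadingException` message may make it easier to tell a missing file from a genuine loading error.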
4. Generate content with the model
To generate content, first create a conversation object from the model runner:
val conversation = modelRunner.createConversation()
Given the user's input text, call the Conversation.generateResponse function to start generation. It returns a Kotlin asynchronous flow of MessageResponse, which can be processed with Kotlin flow operators:
val input = "Here is a user message!"
val generationJob = lifecycleScope.launch {
    conversation.generateResponse(input)
        .onEach {
            when (it) {
                is MessageResponse.Chunk -> {
                    Log.d(TAG, "text chunk: ${it.text}")
                }
                is MessageResponse.ReasoningChunk -> {
                    Log.d(TAG, "reasoning chunk: ${it.text}")
                }
                else -> {
                    // ignore other response
                }
            }
        }
        .onCompletion {
            Log.d(TAG, "Generation done!")
        }
        .catch { exception ->
            Log.e(TAG, "Error in generation: $exception")
        }
        .collect()
}
In this code snippet:
- The onEach callback will be called when the model generates a chunk of content.
- The onCompletion callback will be called when the generation is done. At this point, conversation.history will have the latest message generated by the model.
- The catch callback will be called if an exception is thrown from the generation.
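The snippet above only logs each chunk. To show the reply in a UI, the chunks usually need to be joined as they arrive. The following is a minimal sketch of that accumulation. `DemoResponse` and `ResponseAccumulator` are hypothetical stand-ins so the example runs without the SDK; in an app, the same `when` branches would match on `MessageResponse.Chunk` and `MessageResponse.ReasoningChunk`.

```kotlin
// Stand-in types for illustration only; in an app, use the SDK's
// MessageResponse.Chunk and MessageResponse.ReasoningChunk instead.
sealed interface DemoResponse {
    data class Chunk(val text: String) : DemoResponse
    data class ReasoningChunk(val text: String) : DemoResponse
    object Other : DemoResponse
}

/** Collects answer text and reasoning text into separate buffers. */
class ResponseAccumulator {
    private val answer = StringBuilder()
    private val reasoning = StringBuilder()

    // Intended to be called from the onEach callback for every emitted response.
    fun accept(response: DemoResponse) {
        when (response) {
            is DemoResponse.Chunk -> answer.append(response.text)
            is DemoResponse.ReasoningChunk -> reasoning.append(response.text)
            DemoResponse.Other -> Unit // ignore other responses
        }
    }

    val answerText: String get() = answer.toString()
    val reasoningText: String get() = reasoning.toString()
}
```

Keeping reasoning text in its own buffer lets the UI decide whether to show it, hide it, or render it differently from the answer.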
To interrupt the generation, cancel the job returned by the coroutine scope's launch method:
generationJob.cancel()
5. Examples
See LeapSDK-Examples for complete example apps using LeapSDK.