Android Developers Blog: Experimental hybrid inference and new Gemini models for Android



If you are an Android developer looking to add innovative AI features to your app, we recently launched two powerful updates: hybrid inference, a new Firebase AI Logic API that leverages both on-device and cloud inference, and support for new Gemini models, including the latest Nano Banana models for image generation.

Let’s jump in!

Experiment with hybrid inference

The new Firebase API for hybrid inference uses a simple rule-based routing approach as an initial solution, letting you use both on-device and cloud inference through a unified API. We plan to add more sophisticated routing capabilities in the future.

It allows your app to dynamically switch between Gemini Nano running locally on the device and cloud-hosted Gemini models. The on-device execution uses ML Kit’s Prompt API. The cloud inference supports all the Gemini models from Firebase AI Logic in both Vertex AI and the Developer API.

To use it, add the firebase-ai-ondevice dependencies to your app along with Firebase AI Logic:

dependencies {
 [...] 
 implementation("com.google.firebase:firebase-ai:17.11.0")
 implementation("com.google.firebase:firebase-ai-ondevice:16.0.0-beta01")
}

During initialization, you create a GenerativeModel instance and configure it with specific inference modes, such as PREFER_ON_DEVICE (falls back to cloud if Gemini Nano is not available on the device) or PREFER_IN_CLOUD (falls back to on-device inference if offline):

val model = Firebase.ai(backend = GenerativeBackend.googleAI())
    .generativeModel(
        modelName = "gemini-3.1-flash-lite",
        onDeviceConfig = OnDeviceConfig(
           mode = InferenceMode.PREFER_ON_DEVICE
        )
    )

val response = model.generateContent(prompt)
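
For the reverse priority, the same configuration accepts the other mode. A minimal sketch mirroring the snippet above, assuming the same model name:

```kotlin
// Prefer the cloud-hosted model, falling back to on-device
// Gemini Nano when the device is offline.
val cloudFirstModel = Firebase.ai(backend = GenerativeBackend.googleAI())
    .generativeModel(
        modelName = "gemini-3.1-flash-lite",
        onDeviceConfig = OnDeviceConfig(
            mode = InferenceMode.PREFER_IN_CLOUD
        )
    )
```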

The Firebase API for hybrid inference for Android is still experimental, and we encourage you to try it in your app, especially if you are already using Firebase AI Logic. Currently, on-device models are specialized for single-turn text generation based on text or single Bitmap image inputs. Review the limitations for more details.
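As an illustration of the single-turn, single-image constraint, a multimodal prompt could be sketched with the Firebase AI Logic content builder like this (the `bitmap` variable is a placeholder for an image your app already holds):

```kotlin
// Single-turn, multimodal prompt: one Bitmap plus a text instruction.
// `bitmap` is a placeholder for an android.graphics.Bitmap from your app.
val response = model.generateContent(
    content {
        image(bitmap)
        text("Describe what is in this photo in one sentence.")
    }
)
Log.d("HybridInference", response.text ?: "No text returned")
```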

We just published a new sample in the AI Sample Catalog that demonstrates how the Firebase API for hybrid inference can be used to generate a review based on a few selected topics and then translate it into various languages. Check out the code to see it in action!

The new hybrid inference sample in action

Try our new models

As part of the new Gemini models, we’ve released two models that are particularly helpful to Android developers and easy to integrate into your application via the Firebase AI Logic SDK.

Nano Banana

Last year we released Nano Banana, a state-of-the-art image generation model. And a few weeks ago, we released a couple of new Nano Banana models.

Nano Banana Pro (Gemini 3 Pro Image) is designed for professional asset production and can render high-fidelity text, even in a specific font or simulating different types of handwriting.

Nano Banana 2 (Gemini 3.1 Flash Image) is the high-efficiency counterpart to Nano Banana Pro. It’s optimized for speed and high-volume use cases. It can be used for a broad range of use cases (infographics, virtual stickers, contextual illustrations, etc.).

The new Nano Banana models leverage real-world knowledge and deep reasoning capabilities to generate precise and detailed images.

We updated our Magic Selfie sample (use image generation to change the background of your selfie!) to use Nano Banana 2. The background segmentation is now handled directly by the image generation model, which simplifies the implementation and lets Nano Banana 2’s improved image generation capabilities shine. See it in action here.

The updated Magic Selfie sample uses Nano Banana 2 to update a selfie background

You can use it via Firebase AI Logic SDK. Read more about it in the Android documentation.
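As a rough sketch of what image generation through Firebase AI Logic can look like: the call below follows the SDK's image-output pattern (requesting image response modalities and reading back an `ImagePart`), but the model name string is an assumption based on the naming above; check the Android documentation for the exact identifier.

```kotlin
// Sketch: generate an image with Nano Banana 2 via Firebase AI Logic.
// The model name is an assumption; consult the documentation for the
// exact identifier before using it.
val imageModel = Firebase.ai(backend = GenerativeBackend.googleAI())
    .generativeModel(
        modelName = "gemini-3.1-flash-image",
        generationConfig = generationConfig {
            responseModalities = listOf(ResponseModality.TEXT, ResponseModality.IMAGE)
        }
    )

val result = imageModel.generateContent("A sticker of a friendly robot waving")

// Pull the first generated image (an android.graphics.Bitmap) from the response.
val generatedBitmap = result.candidates.first().content.parts
    .filterIsInstance<ImagePart>()
    .firstOrNull()
    ?.image
```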

Gemini 3.1 Flash-Lite

We also released Gemini 3.1 Flash-Lite, a new version of the Gemini Flash-Lite family. These models have been particularly favored by Android developers for their good quality/latency ratio and low inference cost, powering various use cases such as in-app message translation or generating a recipe from a picture of a dish.

Gemini 3.1 Flash-Lite, currently in preview, will enable more advanced use cases with latency comparable to Gemini 2.5 Flash-Lite. To learn more about this model, review the Firebase documentation.
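To illustrate the in-app translation use case mentioned above, a minimal sketch with Firebase AI Logic might look like this (the prompt wording and function name are illustrative):

```kotlin
// Sketch: translate an in-app chat message with Gemini 3.1 Flash-Lite.
val translator = Firebase.ai(backend = GenerativeBackend.googleAI())
    .generativeModel(modelName = "gemini-3.1-flash-lite")

// Illustrative helper: returns the translated message, or null on failure.
suspend fun translateMessage(message: String, targetLanguage: String): String? {
    val response = translator.generateContent(
        "Translate the following message into $targetLanguage, " +
        "returning only the translation:\n$message"
    )
    return response.text
}
```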

Conclusion

It’s a great time to explore the new Hybrid sample in our catalog to see these capabilities in action and understand the benefits of routing between on-device and cloud inference. We also encourage you to check out our documentation to test the new Gemini models.
