Top 3 Updates for Building with AI on Android at Google I/O ’24

Posted by Terence Zhang – Developer Relations Engineer

At Google I/O, we unveiled a vision of Android reimagined with AI at its core. As Android developers, you’re at the forefront of this exciting shift. By embracing generative AI (Gen AI), you’ll craft a new breed of Android apps that offer your users unparalleled experiences and delightful features.

Gemini models are powering new generative AI apps both in the cloud and directly on-device. You can now build with Gen AI using our most capable models in the cloud with the Google AI client SDK or Vertex AI for Firebase in your Android apps. For on-device use cases, Gemini Nano is our recommended model. We have also integrated Gen AI into developer tools – Gemini in Android Studio supercharges your developer productivity.

Let’s walk through the major announcements for AI on Android from this year’s I/O sessions in more detail!

#1: Build AI apps leveraging cloud-based Gemini models

To kickstart your Gen AI journey, design the prompts for your use case with Google AI Studio. Once you are satisfied with your prompts, integrate the Gemini API directly into your app to access Google’s latest models such as Gemini 1.5 Pro and 1.5 Flash, both with one-million-token context windows (with two million available via waitlist for Gemini 1.5 Pro).

If you want to learn more about and experiment with the Gemini API, the Google AI SDK for Android is a great starting point. For integrating Gemini into your production app, consider using Vertex AI for Firebase (currently in Preview, with a full release planned for Fall 2024). This platform offers a streamlined way to build and deploy generative AI features.

We are also launching the first Gemini API Developer competition (terms and conditions apply). Now is the best time to build an app integrating the Gemini API and win incredible prizes! A custom DeLorean, anyone?

#2: Use Gemini Nano for on-device Gen AI

While cloud-based models are highly capable, on-device inference works offline, delivers low-latency responses, and ensures that data doesn’t leave the device.

At I/O, we announced that Gemini Nano will be getting multimodal capabilities, enabling devices to understand context beyond text – like sights, sounds, and spoken language. This will help power experiences like TalkBack, helping people who are blind or have low vision interact with their devices via touch and spoken feedback. Gemini Nano with Multimodality will be available later this year, starting with Google Pixel devices.

We also shared more about AICore, a system service managing on-device foundation models, enabling Gemini Nano to run on-device inference. AICore provides developers with a streamlined API for running Gen AI workloads with almost no impact on the binary size while centralizing runtime, delivery, and critical safety components for Gemini Nano. This frees developers from having to maintain their own models, and allows many applications to share access to Gemini Nano on the same device.

Gemini Nano is already transforming key Google apps, including Messages and Recorder, to enable Smart Compose and recording summarization capabilities, respectively. Outside of Google apps, we’re actively collaborating with developers who have compelling on-device Gen AI use cases and have signed up for our Early Access Program (EAP), including Patreon, Grammarly, and Adobe.

Moving image of Gemini Nano operating in Adobe

Adobe is one of these trailblazers. It is exploring Gemini Nano to enable on-device processing for part of its AI assistant in Acrobat, providing one-click summaries and allowing users to converse with documents. By strategically combining on-device and cloud-based Gen AI models, Adobe optimizes for performance, cost, and accessibility. Simpler tasks like summarization and suggesting initial questions are handled on-device, enabling offline access and cost savings. More complex tasks such as answering user queries are processed in the cloud, ensuring an efficient and seamless user experience.
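To make the hybrid pattern concrete, here is a minimal Kotlin sketch of how an app might route requests between an on-device model and a cloud model. It is an illustration only, not Adobe’s implementation and not an AICore or Gemini SDK API: the names Task, TextModel, and HybridRouter are hypothetical, and the two backends are stand-ins that simply label their input.

```kotlin
// Hypothetical sketch: none of these names come from Adobe, AICore, or the Gemini SDKs.

/** Kinds of request the assistant can receive (illustrative set). */
enum class Task { SUMMARIZE, SUGGEST_QUESTIONS, ANSWER_QUERY }

/** Minimal abstraction over a text-generation backend. */
fun interface TextModel {
    fun generate(prompt: String): String
}

/** Sends the simpler tasks to the on-device model and everything else to the cloud. */
class HybridRouter(
    private val onDevice: TextModel,
    private val cloud: TextModel,
    // Assumption: which tasks count as "simple" is a product decision, so it is configurable.
    private val onDeviceTasks: Set<Task> = setOf(Task.SUMMARIZE, Task.SUGGEST_QUESTIONS),
) {
    fun isOnDevice(task: Task): Boolean = task in onDeviceTasks

    fun run(task: Task, prompt: String): String =
        if (isOnDevice(task)) onDevice.generate(prompt) else cloud.generate(prompt)
}

fun main() {
    // Stand-in backends that only label their input; a real app would wrap actual model calls.
    val router = HybridRouter(
        onDevice = TextModel { prompt -> "[on-device] $prompt" },
        cloud = TextModel { prompt -> "[cloud] $prompt" },
    )
    println(router.run(Task.SUMMARIZE, "Summarize this document"))
    println(router.run(Task.ANSWER_QUERY, "What does section 2 conclude?"))
}
```

Keeping both backends behind one small interface means the routing rule can change (for example, moving a task to the cloud) without touching the code that calls the router.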

This is just the beginning – later this year, we’ll be investing heavily in enabling even more developers, and we aim to launch with them.

To learn more about building with Gen AI, check out the I/O talks Android on-device GenAI under the hood and Add Generative AI to your Android app with the Gemini API, along with our new documentation.

#3: Use Gemini in Android Studio to help you be more productive

Besides powering features directly in your app, we’ve also integrated Gemini into developer tools. Gemini in Android Studio is your Android coding companion, bringing the power of Gemini to your developer workflow. Thanks to your feedback since its preview as Studio Bot at last year’s Google I/O, we’ve evolved our models, expanded to over 200 countries and territories, and now include this experience in stable builds of Android Studio.

At Google I/O, we previewed a number of features available to try in the Android Studio Koala preview release, like natural-language code suggestions and AI-assisted analysis for App Quality Insights. We also shared an early preview of multimodal input using Gemini 1.5 Pro, allowing you to upload images as part of your AI queries, so that Gemini can help you build fully functional Compose UIs from a wireframe sketch.

You can read more about the updates here, and make sure to check out What’s new in Android development tools.