Google has unveiled Gemma 4, a new open AI model designed to support local, on-device intelligence and agentic workflows across the Android development ecosystem, marking a shift toward embedding advanced AI capabilities directly into developer tools and mobile hardware.
The model is being introduced across two primary layers: local AI-assisted coding within Android Studio and on-device inference through Android’s machine learning stack. The rollout also includes access via an AICore Developer Preview, allowing developers to test and integrate the model ahead of broader deployment tied to future Android devices.
Gemma 4 is positioned as the foundation for the next iteration of Gemini Nano, extending the range of AI workloads that can run directly on smartphones. According to Google, the model is optimized for performance and efficiency, delivering up to four times faster processing while reducing battery consumption by as much as 60% compared with previous versions.
Within Android Studio, Gemma 4 enables local AI code assistance without requiring external APIs or cloud connectivity. The model supports agentic workflows, allowing developers to execute multi-step tasks such as generating application features, refactoring codebases, and resolving build errors through iterative commands. Because inference runs locally, source code remains on the developer’s machine, addressing data privacy and security requirements in enterprise environments.
On the device side, Gemma 4 integrates with the ML Kit GenAI Prompt API, enabling developers to build applications that leverage on-device reasoning, multimodal processing, and contextual understanding. The model supports over 140 languages and can process text, images, and audio inputs, expanding potential use cases across consumer and enterprise applications.
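Google has not yet published a stable surface for the Prompt API under this preview, but a multimodal on-device call could look roughly like the Kotlin sketch below. The entry point and types used here (OnDevicePrompt, Request, the response object) are illustrative assumptions, not the documented ML Kit API.

```kotlin
// Hypothetical sketch only: OnDevicePrompt, Request, and the response
// type are illustrative assumptions, not the documented ML Kit surface.
import android.graphics.Bitmap
import android.util.Log
import androidx.lifecycle.ViewModel
import androidx.lifecycle.viewModelScope
import kotlinx.coroutines.launch

class ReceiptViewModel : ViewModel() {

    fun summarizeReceipt(photo: Bitmap) {
        viewModelScope.launch {
            // Assumed entry point for a client backed by the on-device
            // Gemma 4 model via AICore.
            val client = OnDevicePrompt.getClient()

            // Multimodal request: one image plus a text instruction.
            val response = client.generate(
                OnDevicePrompt.Request(
                    image = photo,
                    prompt = "Extract the merchant name and total amount."
                )
            )

            // Inference runs locally, so the photo and the result never
            // leave the device.
            Log.d("GenAI", response.text)
        }
    }
}
```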
Google is offering multiple model configurations to balance performance and efficiency. The E4B variant is designed for more complex reasoning tasks, while the E2B model prioritizes lower latency and faster execution. Both are available through the Developer Preview for testing on AICore-enabled devices, with broader support expected as new hardware launches later this year.
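How an application would select between the two configurations has not been detailed. One plausible pattern, sketched below with assumed names (ModelVariant, TaskKind), is to route latency-sensitive features to E2B and heavier reasoning to E4B.

```kotlin
// Illustrative only: ModelVariant and TaskKind are assumed names,
// not part of any announced API.
enum class ModelVariant { E2B, E4B }

enum class TaskKind { AUTOCOMPLETE, SMART_REPLY, SUMMARIZATION, AGENT_PLAN }

fun variantFor(task: TaskKind): ModelVariant = when (task) {
    // Low-latency, interactive features favor the smaller E2B model.
    TaskKind.AUTOCOMPLETE, TaskKind.SMART_REPLY -> ModelVariant.E2B
    // Multi-step reasoning favors the larger E4B model.
    TaskKind.SUMMARIZATION, TaskKind.AGENT_PLAN -> ModelVariant.E4B
}
```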
The introduction of Gemma 4 also reflects a broader push toward “local-first” AI architectures, where models operate directly on user devices rather than relying on cloud-based inference. This approach is intended to reduce latency, lower operational costs, and provide developers with greater control over data handling.
As part of the rollout timeline, Google plans to expand tooling and capabilities during the preview phase, including support for structured outputs, tool-calling, and enhanced prompt management. The company also indicated that future benchmarking tools will incorporate Gemma 4 to help developers evaluate performance trade-offs across different model configurations.
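The preview's tool-calling interface has likewise not been published. The sketch below shows the general shape such an interface often takes, with every type in it (ToolSpec, ToolCall, PromptSession) an assumption rather than an announced API: the app declares tools, and the model either answers in text or requests a tool invocation that the app executes and feeds back.

```kotlin
// Illustrative shape of a tool-calling round trip; all names here
// (ToolSpec, ToolCall, PromptSession) are assumptions, not a published API.
import org.json.JSONObject

// Declares a tool the model may invoke, described by name and JSON schema.
data class ToolSpec(val name: String, val description: String, val parametersSchema: JSONObject)

// A structured request from the model to run a tool with arguments.
data class ToolCall(val name: String, val arguments: JSONObject)

interface PromptSession {
    // Returns either final text or a ToolCall the app must execute.
    suspend fun send(prompt: String, tools: List<ToolSpec>): Result
    suspend fun sendToolResult(call: ToolCall, result: JSONObject): Result

    sealed interface Result {
        data class Text(val value: String) : Result
        data class CallTool(val call: ToolCall) : Result
    }
}

// App-defined tool dispatch; here a stub that echoes the arguments.
fun executeLocally(call: ToolCall): JSONObject =
    JSONObject().put("echo", call.arguments.toString())

// Typical loop: forward tool calls to app code until the model answers in text.
suspend fun runAgentTurn(session: PromptSession, prompt: String, tools: List<ToolSpec>): String {
    var result = session.send(prompt, tools)
    while (result is PromptSession.Result.CallTool) {
        val call = result.call
        result = session.sendToolResult(call, executeLocally(call))
    }
    return (result as PromptSession.Result.Text).value
}
```

One appeal of this shape is that the tool-execution loop stays in app code, letting developers gate exactly what an on-device agent is allowed to do.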


