Introduction
At Google’s annual developer conference, Google I/O 2025, major announcements highlighted the convergence of artificial intelligence and the web platform.
For web developers, the focus has expanded beyond cloud-hosted model endpoints. The industry is seeing a shift toward on-device AI execution, allowing developers to run lightweight LLMs directly inside the client browser.
This article reviews the key web-focused AI announcements from Google I/O 2025 and explains how they will influence frontend application architecture.
1. On-Device AI in Chrome: Standardizing Built-in AI APIs
The most talked-about web update at Google I/O 2025 was the release of Chrome’s native Built-in AI API, which runs Google’s lightweight Gemini Nano model directly in the browser.
Traditionally, adding AI features (such as summarization or translations) to a web app required hosting expensive models server-side or routing client calls through secure proxy servers to protect API keys.
Chrome’s Built-in AI changes this by running inference on the user’s own hardware, so there is no per-request charge and no API key to protect.
Using the Prompt API
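async function askLocalAI(promptText) {
  // Feature-detect the Prompt API. Recent Chrome builds expose it as the
  // LanguageModel global; earlier previews used ai.assistant / ai.languageModel.
  if (!("LanguageModel" in self)) {
    console.error("On-device AI is not supported in this browser environment.");
    return;
  }
  // availability() resolves to "unavailable", "downloadable", "downloading", or "available"
  const availability = await LanguageModel.availability();
  if (availability === "unavailable") {
    console.error("The on-device model cannot run on this device.");
    return;
  }
  // Spin up a new AI session (the first call may trigger a model download)
  const session = await LanguageModel.create();
  try {
    // Prompt the local model and retrieve the response
    return await session.prompt(promptText);
  } finally {
    // Clean up session resources, even if the prompt throws
    session.destroy();
  }
}
// Usage example
askLocalAI("Identify bugs in the following snippet and summarize corrections...")
  .then(console.log);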
Core Benefits
- Low Latency and Offline Support: With no network round-trip, responses resolve quickly, and once the model has been downloaded, applications can keep working offline or in airplane mode.
- Privacy Controls: Prompts are processed on the user’s machine rather than sent to a server, which makes it easier to meet strict privacy requirements.
- No API Overhead: Requests handled on-device incur no hosting costs, per-token pricing, or rate limits.
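Because the on-device model will not be available on every device, a common pattern is to prefer it when it is ready and otherwise fall back to a cloud endpoint. The following is a minimal sketch of that idea, not an official recipe. It assumes the `LanguageModel` global exposed by recent Chrome builds of the Prompt API; `runPrompt`, `localModel`, and `cloudFallback` are hypothetical names introduced here for illustration.

```javascript
// Sketch: prefer the on-device model, otherwise call a caller-supplied fallback.
// `localModel` defaults to the LanguageModel global (assumed; undefined where unsupported).
// `cloudFallback` is a hypothetical async function that sends the prompt to your own backend.
async function runPrompt(promptText, { localModel = globalThis.LanguageModel, cloudFallback }) {
  // Treat a missing API the same as an unavailable model
  const availability = localModel ? await localModel.availability() : "unavailable";

  if (availability === "available") {
    const session = await localModel.create();
    try {
      return { source: "on-device", text: await session.prompt(promptText) };
    } finally {
      // Release session resources whether or not the prompt succeeded
      session.destroy();
    }
  }

  // Model missing, still downloading, or unsupported: use the cloud path instead
  return { source: "cloud", text: await cloudFallback(promptText) };
}
```

In a real application, `cloudFallback` would typically call a server-side route you control so that API keys stay off the client. Returning the `source` alongside the text lets the UI indicate whether a response was produced locally.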
2. Gemini API Upgrades: Structured JSON Outputs
Google’s cloud-hosted Gemini API was also upgraded. The latest Gemini 1.5 Pro and Flash models now support structured JSON outputs out of the box, configurable through the API or the Google AI Studio console.
Leveraging the JavaScript SDK
The official JS/TS SDK (@google/genai) lets you request JSON output that matches a defined schema.
import { GoogleGenAI, Type } from '@google/genai';
// Read the API key from the environment; keep it server-side, never in client bundles
const ai = new GoogleGenAI({ apiKey: process.env.GEMINI_API_KEY });
// Configure Gemini to output data matching a defined JSON schema
const response = await ai.models.generateContent({
  model: 'gemini-1.5-flash',
  contents: 'List three clothing recommendations based on a rainy day forecast.',
  config: {
    responseMimeType: 'application/json',
    responseSchema: {
      type: Type.OBJECT,
      properties: {
        outfits: {
          type: Type.ARRAY,
          items: { type: Type.STRING }
        },
        reasoning: { type: Type.STRING }
      },
      required: ['outfits', 'reasoning']
    }
  }
});
// response.text holds the JSON string produced by the model
const recommendations = JSON.parse(response.text);
Enforcing a structured output schema makes the AI output much easier to integrate with your application code, because the response parses into a predictable shape.
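Even with a schema configured, it can be prudent to validate the parsed response before handing it to the rest of the application. The sketch below checks the `{ outfits, reasoning }` shape from the example above; `parseOutfitResponse` is a hypothetical helper written for this article, not part of any SDK.

```javascript
// Sketch: parse and validate the JSON text returned by the model.
// Hypothetical helper; throws a descriptive Error if the shape is not as expected.
function parseOutfitResponse(jsonText) {
  let data;
  try {
    data = JSON.parse(jsonText);
  } catch (err) {
    throw new Error(`Model response is not valid JSON: ${err.message}`);
  }

  if (data === null || typeof data !== "object" || Array.isArray(data)) {
    throw new Error("Expected a JSON object at the top level.");
  }

  const { outfits, reasoning } = data;
  if (!Array.isArray(outfits) || !outfits.every((item) => typeof item === "string")) {
    throw new Error('Expected "outfits" to be an array of strings.');
  }
  if (typeof reasoning !== "string") {
    throw new Error('Expected "reasoning" to be a string.');
  }

  // Return only the validated fields so unexpected extras do not leak through
  return { outfits, reasoning };
}
```

Failing loudly at this boundary keeps a malformed response from surfacing later as a harder-to-trace rendering bug.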
3. WebAssembly (Wasm) and WebGPU Accelerations
Google also announced updates to WebAssembly (Wasm) runtimes to support on-device AI workloads.
Chrome’s WebGPU support has also been updated. Wasm code itself runs on the CPU, but paired with WebGPU, in-browser ML libraries (such as TensorFlow.js, which offers both Wasm and WebGPU backends) can move heavy computation onto the host system’s GPU. This speeds up real-time in-browser media processing, model execution, and physics simulations.
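Since GPU and threading support varies by device, applications usually probe for capabilities before choosing where to run a model. The following is a rough sketch of such a probe, using the standard `navigator.gpu.requestAdapter()` and `crossOriginIsolated` checks. The function name and the returned labels are hypothetical and chosen only for illustration.

```javascript
// Sketch: pick the most capable compute path the current environment offers.
// `env` defaults to the global object; passing it in makes the check easy to test.
async function pickComputeBackend(env = globalThis) {
  // WebGPU: the API may exist while no suitable adapter is available
  const gpu = env.navigator?.gpu;
  if (gpu) {
    const adapter = await gpu.requestAdapter();
    if (adapter) return "webgpu";
  }

  // Multi-threaded Wasm needs SharedArrayBuffer, which requires cross-origin isolation
  if (env.crossOriginIsolated && typeof env.SharedArrayBuffer === "function") {
    return "wasm-threads";
  }

  // Single-threaded Wasm as the broad baseline
  if (typeof env.WebAssembly === "object") return "wasm";

  // Last resort: plain JavaScript on the CPU
  return "cpu";
}
```

The returned label could then drive whichever backend-selection mechanism your ML library provides.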
Conclusion
Google I/O 2025 demonstrated that web browsers are shifting from simple display engines to native AI execution platforms.
Chrome’s Built-in AI APIs will shape how developers approach user privacy and application cost structures. Start experimenting with these native APIs to prepare for the next generation of web apps.
