News

Google Launches Gemma 4 — A Frontier Open AI Model That Runs On-Device Under Apache 2.0

Google has released Gemma 4, a family of open models in 4 sizes (E2B–31B) with support for 140+ languages and full multimodal capabilities across text, image, and audio. The 31B model ranks #3 globally on Arena AI, the license has shifted to full Apache 2.0, and AICore Developer Preview now brings on-device AI to Android.

4 Apr 202611 minGoogle Blog

GoogleGemmaOpen SourceAIOn-Device AIAndroid

Introduction — When Frontier AI Becomes Free and Runs on Your Phone

Imagine a top-3 AI model in the world that doesn’t need to send data to the cloud, doesn’t require monthly API fees, and doesn’t create constant anxiety around data privacy — and it runs directly on the phone in your hand.

That’s not a vision for five years from now. It’s what Google Gemma 4 is making possible in April 2026.

On April 2, 2026, Google announced Gemma 4 — its most powerful open AI model family yet, spanning 4 sizes from 2B to 31B parameters. It supports full multimodal capabilities across text, image, video, and audio, and now comes under the Apache 2.0 license, giving businesses full freedom for commercial use.

Most importantly, the smaller models are designed to run directly on Android devices through the newly available AICore Developer Preview.

For Thai enterprises, startups, and developers looking to build AI into products, this is a game changer worth paying attention to.

What Is Gemma 4? — Four Sizes for Every Use Case

Gemma 4 is built on the same core technology as Gemini 3 (Google’s flagship model), but optimized to be small enough to run on a wide range of hardware — from smartphones to servers.

The 4 Core Model Sizes

Model	Total Parameters	Active Parameters	Context Window	Type
E2B	5.1B	2.3B	128K	Dense
E4B	8B	4.5B	128K	Dense
26B	26B	4B	256K	MoE (Mixture of Experts)
31B	31B	31B	256K	Dense

One of the most interesting details is the “E” (Effective) label — it refers to the number of parameters actually activated during inference. Google designed these models with more total parameters than are used at runtime, helping improve quality without consuming too much RAM or battery.

The 26B model uses a Mixture of Experts (MoE) architecture, with 26B total parameters but only 4B activated per request — delivering performance close to the 31B model while using far fewer resources.

Why Gemma 4 Matters — The Numbers Speak for Themselves

Key Benchmark Results

The 31B Dense model in Gemma 4 ranks #3 in the world on the Arena AI text leaderboard, with an Elo score of around 1452. The 26B MoE model ranks #6 with a score of 1441, despite using only 4B active parameters.

Notable benchmark results:

Benchmark	31B	26B (MoE)	E4B	E2B
MMLU Pro (general knowledge)	85.2%	82.6%	69.4%	60.0%
AIME 2026 (math)	89.2%	88.3%	42.5%	37.5%
LiveCodeBench v6 (coding)	80.0%	77.1%	52.0%	44.0%
MMMU Pro (multimodal)	76.9%	73.8%	52.6%	44.2%

The 26B model can outperform models more than 20 times larger. This is what Google describes as breakthrough “intelligence-per-parameter.”

Full Multimodal Capability in a Single Model Family

Gemma 4 is not just a text model — it can understand multiple modalities together.

What Each Model Size Can Do

Capability	E2B	E4B	26B	31B
Text	Yes	Yes	Yes	Yes
Image (OCR, charts, photos)	Yes	Yes	Yes	Yes
Video	Yes	Yes	Yes	Yes
Audio (speech, transcription)	Yes	Yes	No	No

An important detail: the smaller models (E2B, E4B) support audio, while the larger ones do not. That’s because Google specifically designed the smaller models for on-device use cases where they can process microphone input directly.

What You Can Do Right Away

OCR — Extract text from images, receipts, and documents
Chart Understanding — Analyze graphs, charts, and dashboard visuals
GUI Detection — Recognize elements on an app screen
Audio Transcription — Convert speech to text
Video Understanding — Analyze video content
Bounding Box Detection — Identify object locations in images

And with E2B or E4B running on-device, all of this can happen without sending data to the cloud.

Support for 140+ Languages — Including Thai

Gemma 4 was trained from the ground up to support more than 140 languages, including Thai. This is not just “basic support” — it’s native multilingual training, which helps the model better understand context and meaning across languages.

For Thai businesses building AI products for Thai users, that matters a lot. You don’t need to fine-tune from scratch or force prompts into English first.

Apache 2.0 — A Major Licensing Shift

One of the most important — and potentially overlooked — changes is that Gemma 4 has moved from the custom Gemma license to full Apache 2.0.

Why This Matters

Issue	Previous Gemma License	New Apache 2.0
Commercial use	Allowed (with terms)	Allowed freely
Modification	Allowed	Allowed
Redistribution	Restricted	Fully open
Use in paid products	Must review terms	Ready to use
Compatibility with other OSS	Potential issues	Highly compatible

Apache 2.0 is one of the licenses that enterprises trust most. In many companies, legal teams can approve Apache 2.0 software immediately without a lengthy review process.

For startups and software vendors embedding AI into products, this is effectively Google saying: go ahead and ship it.

Agentic AI — Building AI Agents That Think and Act

Gemma 4 isn’t designed only for answering questions. It’s built to support AI agents that can:

Reason — Use chain-of-thought style reasoning through thinking tokens
Call tools — Support built-in function calling for external APIs
Make decisions — Evaluate tool outputs and decide what to do next
Handle workflows — Go beyond one-shot Q&A and execute multi-step tasks

Example Use Cases

Customer Service Agent — Answer customer questions, check backend systems, and respond automatically, all on-device without sending customer data externally
Field Service Assistant — A technician takes a photo of a machine, and the model analyzes it to recommend repair steps, even without internet access
Document Processing — Read invoices, receipts, and business documents and extract key information, all processed on-device to protect privacy

“Once AI agents can run on-device, use cases previously blocked by latency, privacy, and connectivity concerns suddenly become practical.”

AICore Developer Preview — Available on Android Today

Google has introduced AICore Developer Preview, allowing developers to start testing Gemma 4 directly on supported Android devices.

What You Get with AICore Developer Preview

Direct access to E2B and E4B models on supported devices
ML Kit GenAI Prompt API for easier AI feature development through standard APIs
Support for hardware accelerators from Google, MediaTek, and Qualcomm
A path toward Gemini Nano 4, which will ship on flagship Android devices later this year

Big Performance Gains

Compared with previous generations:

Up to 4x faster inference
Up to 60% lower battery usage
E2B is up to 3x faster than E4B for use cases that need quick responses

Gemma 4 and Gemini Nano 4

A key relationship to understand: Gemma 4 is the foundation of Gemini Nano 4. Code written today using Gemma 4 through ML Kit will run directly on Gemini Nano 4 on new flagship devices launching later this year.

This gives developers a chance to start building on-device AI features now, without waiting for the next hardware cycle.

A Strong Community — 400 Million Downloads

Since the first Gemma release, developers worldwide have downloaded Gemma models more than 400 million times and created over 100,000 customized variants — what Google calls the “Gemmaverse.”

Why that matters:

A large ecosystem means it’s easier to find help and solutions
Many variants means there are already fine-tuned models for specialized domains
Day-0 support from major inference engines including transformers, llama.cpp, MLX, ONNX, and more

What This Means for Thai Developers and Businesses

Gemma 4 is more than just another AI product announcement — it changes the economics and practical reality of AI in several important ways.

1. AI Costs Drop to Near Zero for Inference

When models run on-device, there are no API fees, no cloud compute charges, and no per-token pricing. For startups worried about the unit economics of AI features, this is a real solution.

2. Data Privacy Is No Longer a Barrier

For industries handling sensitive data — finance, healthcare, legal — on-device AI means the data never leaves the device. That aligns naturally with PDPA requirements and data residency concerns.

3. Offline AI Is Now Real

Construction sites, factories, and remote locations with unreliable internet can still use AI effectively, because everything runs locally on the device.

4. Thai Language Support Is Ready from Day One

There’s no need to fine-tune Thai language capability from scratch. Gemma 4 supports 140+ languages natively, reducing both time and cost for teams building AI products for the Thai market.

5. Apache 2.0 Means You Can Ship Faster

Most corporate legal teams in Thailand are already familiar with Apache 2.0. There’s no need for extended review of unusual licensing terms or hidden restrictions.

How Thai Organizations Can Get Started

Short Term (Start Immediately)

Test Gemma 4 E2B/E4B through AICore Developer Preview on Android devices
Identify use cases that need privacy, low latency, or offline capability
Assess your data — even the best model still needs high-quality input data

Mid Term (Q2–Q3 2026)

Fine-tune for specific domains using your organization’s own data
Build prototypes of on-device AI agents for the selected use cases
Prepare for Gemini Nano 4 — code written for Gemma 4 will transfer directly

Long Term (H2 2026+)

Deploy on-device AI as a core feature inside the organization’s mobile app
Create AI-first experiences — once AI lives on every device, user experience changes fundamentally

Strategic View — The Bigger Picture

Gemma 4 is part of a much larger shift in the AI industry: moving intelligence from the cloud to the edge and the device itself.

Apple Intelligence is bringing more AI on-device to iPhone
Qualcomm is pushing on-device AI through Snapdragon
Google is responding with Gemma 4 + AICore + Gemini Nano 4

What’s happening is clear: AI is becoming a foundational layer of mobile computing, just like GPS, cameras, and internet connectivity once did. Every app will have AI inside it, and increasingly, that AI will run directly on-device rather than depending on the cloud.

For Thai organizations, the question is no longer “Should we use on-device AI?” It’s now “When do we start preparing for it?”

Every month you wait is a month your competitors can spend building AI features you still don’t have.

Comparing Gemma 4 with Other Open Models

Model	Size	Multimodal	Audio	On-Device	License	LMArena Score
Gemma 4 31B	31B	Yes	No	Server	Apache 2.0	~1452
Gemma 4 E2B	2.3B active	Yes	Yes	Yes	Apache 2.0	—
Llama 3.3 70B	70B	Text only	No	No	Llama License	~1440+
Qwen 2.5 32B	32B	Yes	No	No	Apache 2.0	~1430+
Phi-4 14B	14B	Yes	No	Yes (partial)	MIT	~1380+

Gemma 4’s advantage is clear: multimodal + audio + on-device + Apache 2.0 in a single compact model family. No major competitor currently offers that full combination in smaller models.

Conclusion — What to Remember

Google Gemma 4 marks a major turning point in the open AI ecosystem:

4 sizes (E2B, E4B, 26B MoE, 31B Dense) spanning mobile to server use cases
31B ranks #3 globally on the Arena AI text leaderboard
Full multimodal support — text, image, video, and audio (in smaller models)
140+ languages, including Thai
Apache 2.0 — ready for commercial use with no special restrictions
AICore Developer Preview — available now on Android
400 million downloads and 100,000+ variants in the Gemmaverse

When frontier-grade AI is free, runs on phones, and comes without licensing friction, the excuses for not adopting AI are disappearing fast.

Ready to Build On-Device AI for Your Business?

The Enersys team has experience helping Thai organizations build AI solutions — from selecting the right model and designing the architecture to deploying production-ready systems.

Whether you’re exploring on-device AI, AI agents for customer service, or privacy-preserving document processing, we can help.

Get a free consultation with the Enersys team

References

ลิงก์ที่เกี่ยวข้อง

Genesis AI Platform

ลองใช้ AI Platform สำหรับองค์กร

AI Readiness Assessment

องค์กรคุณพร้อมสำหรับ AI แค่ไหน?

อ่านบทความเพิ่มเติม

ติดตามข่าวสาร AI และ Tech

Back to Insights

AIS x Thai SME Council Launches ProStart — Digital + AI Bundle with 200% Tax Deduction Thai SMEs Shouldn’t Miss

AIS has partnered with the Thai SME Council to launch ProStart, an AI + Digital package for SMEs with a 200% tax deduction worth up to ฿300,000 — a golden opportunity available only until the end of 2027.

Anthropic Hits $30B ARR — Enterprise AI Has Officially Crossed the Tipping Point

Anthropic tripled from $9B to $30B run rate in just four months, with 1,000+ enterprise customers each spending over $1M a year. This is no longer hype — it is proof that Enterprise AI has real ROI at scale, and Thai businesses need to decide when (not if) to move.

Asia’s AI Law Wave in 2026 — Vietnam, South Korea, China, and ASEAN Move Full Speed Ahead

Vietnam became the first ASEAN country to enforce an AI law in March 2026, South Korea followed with its AI Basic Act in January 2026, China is rolling out more than 30 AI and data standards, while Thailand is gathering public feedback on AI guidelines.

"Empowering Innovation,
Transforming Futures."

ติดต่อเราเพื่อทำให้โปรเจกต์ของคุณเป็นจริง