Google's Gemma 4 Runs On-Device on iPhone 17 Pro at 40 Tokens Per Second — With Image Understanding

A viral demo shows Google's Gemma 4 E2B model running locally on an iPhone 17 Pro with MLX optimization, generating roughly 40 tokens per second and interpreting image input, with no cloud connection required.
