A Breakthrough in Secure AI: Proving Gemma 3

September 28, 2025

Every new generation of AI models pushes the frontier forward. Google’s Gemma 3 is no exception: its leaner architecture brings efficiency and accuracy at scale, and its smaller form factor means it can even run directly on end devices—putting unprecedented power in the hands of users. 

At Lagrange, we see this shift as both an opportunity and a responsibility. As AI becomes more portable and pervasive, we must be able to prove that its outputs are correct. That’s why we’ve extended DeepProve’s zero-knowledge proofs to Gemma 3, safeguarding the frontier of AI by ensuring that even the most advanced models can be held to cryptographic standards of trust, security, and accountability.

We announced this milestone today at Google Cloud's Verifying Intelligence Event in Singapore (hosted by House of ZK and Boundless for Token2049), underscoring the importance of pairing frontier AI with verifiable cryptography. DeepProve proving Gemma 3 foreshadows what the future of AI will look like: powerful systems paired with proofs of correct execution.

Why Proving Gemma 3 Matters

Gemma 3 represents a clear departure from older GPT-style models. It relies on a smaller number of parameters used more effectively, thanks to improved attention design, efficient normalization, novel activations, and interleaved masked attentions. These upgrades make the model faster and more accurate — but also harder to prove.

If verifiable AI is going to keep pace with frontier innovation, it has to adapt as quickly as the models themselves. Proving Gemma 3 demonstrates exactly that: DeepProve is not tied to yesterday’s architectures. It can evolve with the frontier.

How We Did It: Bringing Proofs to Gemma 3

Group Query Attention (GQA)

Gemma 3 replaces traditional Multi-Head Attention with Grouped Query Attention — reducing computation by sharing Key and Value tensors across multiple Query heads. For proofs, this broke the neat symmetry of GPT-2-style attention.

  • We rebuilt the proving logic to treat GQA as a padded version of MHA, reusing the same proof structure.
  • Using a multilinear extension (MLE) trick, we showed that claims on padded vectors can reduce back to claims on the original heads.
  • The result: efficient proofs for GQA without paying a penalty for unused “padding heads.”
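The padding idea can be sketched in a few lines. This is a hypothetical illustration (function names, head counts, and shapes are ours, not DeepProve's circuit): each shared K/V head is repeated across its group of query heads, recovering the one-K/V-per-Q layout that a standard MHA proof expects.

```python
import numpy as np

def gqa_as_padded_mha(q, k, v):
    """q: (n_q_heads, seq, d); k, v: (n_kv_heads, seq, d) with n_kv_heads | n_q_heads."""
    n_q_heads, seq, d = q.shape
    group = n_q_heads // k.shape[0]
    # Repeat each shared K/V head across its query group, so the tensor
    # shapes match ordinary multi-head attention.
    k_full = np.repeat(k, group, axis=0)
    v_full = np.repeat(v, group, axis=0)
    scores = q @ k_full.transpose(0, 2, 1) / np.sqrt(d)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v_full

rng = np.random.default_rng(0)
q = rng.normal(size=(8, 4, 16))   # 8 query heads
k = rng.normal(size=(2, 4, 16))   # 2 shared K/V heads
v = rng.normal(size=(2, 4, 16))
out = gqa_as_padded_mha(q, k, v)
print(out.shape)  # (8, 4, 16)
```

The point of the MLE trick in the post is that this repetition need not be paid for in the proof: a claim about the repeated tensor reduces to a claim about the two original heads.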

Local + Global Attention

Unlike GPT-2’s always-global attention, Gemma 3 alternates between local (sliding-window) and global layers. This demanded a new proof layer:

  • We extracted attention masking into a standalone provable component.
  • This allowed us to prove alternating local/global layers without accuracy loss.
  • Now, DeepProve can handle flexible attention masks — a critical step as models move toward more specialized architectures.
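The two mask shapes being alternated can be written down as a standalone component, as the bullets describe. A minimal sketch (window size and boolean-mask layout are illustrative assumptions, not DeepProve's representation):

```python
import numpy as np

def causal_global_mask(seq_len):
    # Global: token i may attend to every token j <= i.
    return np.tril(np.ones((seq_len, seq_len), dtype=bool))

def causal_local_mask(seq_len, window):
    # Local (sliding window): token i may attend only to the last
    # `window` tokens, i.e. j in [i - window + 1, i].
    i = np.arange(seq_len)[:, None]
    j = np.arange(seq_len)[None, :]
    return (j <= i) & (j > i - window)

g = causal_global_mask(6)
l = causal_local_mask(6, window=3)
# Every position the local mask allows is also allowed globally.
assert np.all(l <= g)
```

Factoring the mask out this way means the same attention proof can be instantiated with either mask, layer by layer, which is what alternating local/global layers requires.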

Rotary Positional Encoding (RoPE)

RoPE replaces heavy positional matrices with lightweight rotational encodings, improving how tokens relate to each other across longer contexts.

  • We implemented RoPE proofs via Hadamard products and additive commitments.
  • Unlike GPT-2-style positional proofs, which scaled quadratically, RoPE proofs scale efficiently with sequence length.
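RoPE's "Hadamard form" can be seen directly: the rotation of each (even, odd) coordinate pair unfolds into two elementwise products and an addition, which is exactly the shape a Hadamard-based proof handles. A hypothetical sketch, assuming the common 10000^(-2i/d) frequency convention:

```python
import numpy as np

def rope(x, base=10000.0):
    """Apply rotary positional encoding. x: (seq_len, d) with d even."""
    seq_len, d = x.shape
    pos = np.arange(seq_len)[:, None]            # (seq, 1)
    freqs = base ** (-np.arange(0, d, 2) / d)    # (d/2,)
    angles = pos * freqs                         # (seq, d/2)
    cos = np.repeat(np.cos(angles), 2, axis=1)   # (seq, d)
    sin = np.repeat(np.sin(angles), 2, axis=1)
    # rotate_half maps each pair (x0, x1) -> (-x1, x0).
    rot = np.empty_like(x)
    rot[:, 0::2] = -x[:, 1::2]
    rot[:, 1::2] = x[:, 0::2]
    # Two Hadamard products and one addition per token.
    return x * cos + rot * sin

x = np.ones((4, 8))
y = rope(x)
# Position 0 has angle 0, so it is left unchanged.
assert np.allclose(y[0], x[0])
```

Because each token is handled by a fixed number of elementwise operations, the cost grows linearly with sequence length, matching the scaling claim above.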

RMSNorm Everywhere

Gemma 3 relies heavily on Root Mean Square Normalization — six layers per attention block. This density could have bottlenecked proving.

  • We built optimized provable RMSNorm logic that keeps proving and verification time fast.
  • We also mapped older LayerNorm proofs into RMSNorm equivalents, enabling faster proving for older models and ensuring consistency across them.
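The LayerNorm-to-RMSNorm mapping in the second bullet rests on a simple identity: LayerNorm is RMSNorm applied to the mean-centered input (affine scale and bias omitted here for clarity). A minimal sketch demonstrating the equivalence; the epsilon placement is an assumption:

```python
import numpy as np

def rms_norm(x, eps=1e-6):
    # Divide by the root mean square of the last axis.
    return x / np.sqrt(np.mean(x * x, axis=-1, keepdims=True) + eps)

def layer_norm(x, eps=1e-6):
    # Center, then divide by the standard deviation.
    mu = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mu) / np.sqrt(var + eps)

x = np.random.default_rng(1).normal(size=(3, 16))
centered = x - x.mean(axis=-1, keepdims=True)
# LayerNorm(x) == RMSNorm(x - mean(x)): the RMS of a centered vector
# is its standard deviation.
assert np.allclose(layer_norm(x), rms_norm(centered))
```

This is why one optimized provable RMSNorm can serve both normalizations: the LayerNorm case only adds a centering step in front.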

GeGLU Activation

Gemma 3 swaps in GELU-gated linear units (GeGLU) in its feed-forward blocks for better accuracy.

  • We implemented GeGLU proofs using existing matmul + Hadamard primitives.
  • This extended DeepProve’s activation toolkit with minimal implementation overhead.
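The decomposition into existing primitives is visible in the formula itself: GeGLU(x) = GELU(x·W) ⊙ (x·V), i.e. two matrix multiplications and one Hadamard product. A hypothetical sketch, using the common tanh approximation of GELU (weight names W and V are illustrative):

```python
import numpy as np

def gelu(x):
    # tanh approximation of GELU.
    return 0.5 * x * (1.0 + np.tanh(np.sqrt(2.0 / np.pi) * (x + 0.044715 * x**3)))

def geglu(x, w, v):
    # matmul, matmul, then a Hadamard (elementwise) product.
    return gelu(x @ w) * (x @ v)

rng = np.random.default_rng(2)
x = rng.normal(size=(4, 8))
w = rng.normal(size=(8, 16))
v = rng.normal(size=(8, 16))
out = geglu(x, w, v)
print(out.shape)  # (4, 16)
```

Since matmul and Hadamard proofs already existed in the toolkit, supporting GeGLU amounts to composing them with a provable GELU, which is the "minimal implementation overhead" the bullet refers to.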

Why This Changes the Landscape

Proving Gemma 3 shows that verifiable AI is not a “retrofit” for yesterday’s models. It’s a living, modular infrastructure that can evolve with the frontier of AI research.

  • For developers: You don’t need to choose between cutting-edge architectures and verifiability. DeepProve can support both.
  • For enterprises: Compliance and trust no longer lag innovation. Proofs keep pace with model upgrades.
  • For society: As models get smarter, we get safer. Proofs guarantee outputs are correct, reproducible, and privacy-preserving.

Looking Ahead

Gemma 3 was designed to be efficient, accurate, and forward-looking. By proving it, we’ve shown that DeepProve is just as forward-looking, able to adapt quickly, layer by layer, to whatever the frontier brings next.

The next era of AI won’t be judged only by parameters or latency. It will be judged by provability—our ability to show, in real time and at scale, that powerful systems behave as claimed. With DeepProve on Gemma 3, that future is no longer hypothetical—it’s real.