Chapter 18: Making Proofs Zero-Knowledge

A conventional proof convinces by exposing its internals. The verifier sees intermediate values, checks each step, and traces the chain of reasoning from hypothesis to conclusion. Conviction comes from transparency: every piece of the argument is laid bare for inspection.

Zero-knowledge proofs must convince without this transparency. The verifier still receives messages, checks relationships, and follows a protocol. But the values she sees are randomized so that they carry no information about the witness. She inspects a full transcript and becomes convinced the statement is true, yet the transcript could have come from any valid witness, or from no witness at all (a simulator). The challenge is preserving the structure that makes verification work while destroying the information that would make the witness recoverable.

This requires care.

Most proof systems were not designed with privacy in mind. The interactive proofs of the 1980s and 1990s were built to make verification cheap: a weak verifier checking claims from a powerful prover. The sum-check protocol, GKR, and the algebraic machinery underlying modern SNARKs all emerged from complexity theory, where the goal was efficient verification, not confidential computation. Privacy became necessary only later, as these tools migrated from theory to practice and applications like blockchain transactions, private credentials, and anonymous voting demanded that proofs reveal nothing beyond validity. The result is a retrofit problem: taking elegant machinery built for transparency and making it opaque.

We saw zero-knowledge informally in Σ-protocols (Chapter 16), then formally in Chapter 17. We have also seen it applied in specific systems: the random scalars in Groth16 (Chapter 12), the blinding polynomials in PLONK (Chapter 13). Strip these additions and the proof systems still work, they are still sound and succinct, but they leak witness information. This chapter develops the general theory behind those additions. How do we take a working proof system and add the layer that makes it reveal nothing?

The chapter develops two general techniques, then shows how specific proof systems apply them.

Commit-and-prove works for any protocol: hide every witness-dependent value behind a commitment, then prove the required relations via Σ-protocols. This is general but expensive, with cost proportional to the number of multiplications.

Masking polynomials applies specifically to protocols where the prover sends polynomials (notably sum-check): add random noise that preserves validity while hiding the witness. This is efficient but requires algebraic structure.

Neither technique is used in isolation by production systems. Groth16 and PLONK each implement their own variants, tailored to their algebraic structure. After developing the general theory, we examine how these systems achieve zero-knowledge in practice.

The Leakage Problem

Let's be concrete about what leaks. Consider the sum-check protocol proving:

    H = Σ_{x ∈ {0,1}^n} g(x)

When g encodes private witness values, the verifier should not learn g beyond what is necessary for verification. In a proper ZK protocol, the verifier would only learn g(r) at a single random point r at the end (via a commitment opening), not the polynomial itself. But sum-check requires the prover to send intermediate polynomials.

In round i, the prover sends a univariate polynomial g_i representing the partial sum with variable X_i free:

    g_i(X_i) = Σ_{x_{i+1}, …, x_n ∈ {0,1}} g(r_1, …, r_{i−1}, X_i, x_{i+1}, …, x_n)

This polynomial depends on g. Its coefficients encode information about the witness.

To see this concretely, suppose g is a two-variable multilinear extension of secret witness values w_1, w_2, w_3 placed on the hypercube:

    g(x_1, x_2) = w_2·(1 − x_1)(1 − x_2) + w_3·x_1(1 − x_2) + w_1·x_1·x_2

so that g(0,0) = w_2, g(1,0) = w_3, g(1,1) = w_1, and g(0,1) = 0. The verifier does not know this polynomial; they only know they are verifying the sum H = w_1 + w_2 + w_3. The first round polynomial is:

    g_1(X) = g(X, 0) + g(X, 1) = w_2 + (w_1 + w_3 − w_2)·X

The prover sends this polynomial to the verifier. The constant term is exactly w_2. The coefficient of X is w_1 + w_3 − w_2, and since w_2 is already exposed, this reveals w_1 + w_3. The verifier learns linear combinations of the secrets directly from the protocol message.

Consider what these witness values could represent. Suppose you are proving eligibility for a loan without revealing your finances. Your witness might encode: w_1 = your salary, w_2 = your social security number, w_3 = your total debt. The computation verifies that your debt-to-income ratio meets some threshold. From that single round polynomial, the verifier learns your SSN directly (the constant term) and a linear combination of your salary and debt. They did not need to learn any of this to verify your eligibility. The protocol leaked it anyway.
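The leak is easy to reproduce in a few lines. The sketch below (with hypothetical toy values, and a two-variable g chosen so that the round-one constant term is a bare witness value) interpolates the first round polynomial and checks what its coefficients expose; all names are illustrative:

```python
# Toy witness values (hypothetical): salary, SSN, debt.
w1, w2, w3 = 85_000, 123_456_789, 40_000

# g is the multilinear extension with g(0,0)=w2, g(1,0)=w3, g(1,1)=w1, g(0,1)=0.
def g(x1, x2):
    return (w2 * (1 - x1) * (1 - x2)
            + w3 * x1 * (1 - x2)
            + w1 * x1 * x2)

# Round 1 of sum-check: g1(X) = g(X, 0) + g(X, 1), a degree-1 polynomial.
# Recover its coefficients from two evaluations.
g1_at_0 = g(0, 0) + g(0, 1)
g1_at_1 = g(1, 0) + g(1, 1)
const_term = g1_at_0
x_coeff = g1_at_1 - g1_at_0

print(const_term == w2)         # True: the SSN leaks directly
print(x_coeff == w1 + w3 - w2)  # True: a linear combination of the secrets leaks
```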

This isn't zero-knowledge. We need to hide these coefficients while still allowing verification.

Technique 1: Commit-and-Prove

The commit-and-prove approach is conceptually simple: never send a value in the clear. Always send a commitment, then prove the committed values satisfy the required relations.

The Paradigm

For any public-coin protocol that sends witness-dependent values (public-coin means the verifier's messages are random and visible to both parties, which is the case for sum-check, GKR, and all Fiat-Shamir-compiled protocols):

  1. Replace values with commitments. Instead of sending a value v, send Com(v; r) = g^v·h^r (a Pedersen commitment with random blinding r).

  2. Prove relations in zero-knowledge. For each algebraic relation the original protocol checks (e.g., "this value equals that value," "this is the product of those two"), run a Σ-protocol on the committed values.

The verifier never sees actual values. They see commitments (opaque group elements that reveal nothing about the committed data). The Σ-protocols convince them the data satisfies the required structure.

Pedersen's Homomorphism as Leverage

Recall from Chapter 6 that Pedersen commitments (Com(v; r) = g^v·h^r) are perfectly hiding (the commitment reveals nothing about v, even to an unbounded adversary) and additively homomorphic (Com(a; r_a)·Com(b; r_b) = Com(a + b; r_a + r_b)). This means the verifier can check additive relations on committed values for free, without any interaction or Σ-protocol: given C_a, C_b, C_c, verify a + b = c by checking C_a·C_b = C_c.
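A quick numerical check of the homomorphism, using exponentiation modulo a Mersenne prime as a stand-in group (all parameters here are illustrative toys; real systems use elliptic-curve groups and properly generated, independent generators):

```python
import secrets

P = 2**127 - 1  # a Mersenne prime (toy modulus; not a production group)
G, H = 3, 7     # stand-in generators (toy choice)

def commit(v, r):
    # Pedersen commitment Com(v; r) = g^v * h^r
    return (pow(G, v, P) * pow(H, r, P)) % P

a, ra = 42, secrets.randbelow(P - 1)
b, rb = 58, secrets.randbelow(P - 1)

# Additive homomorphism: Com(a; ra) * Com(b; rb) == Com(a+b; ra+rb)
lhs = (commit(a, ra) * commit(b, rb)) % P
rhs = commit(a + b, ra + rb)
print(lhs == rhs)  # True
```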

Multiplication is more expensive. Checking c = a·b on committed values requires a Σ-protocol that proves the multiplicative relation without revealing the values. This takes three group elements and three field elements per multiplication gate.
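The product Σ-protocol is too long to sketch here, but the simpler knowledge-of-opening protocol for a Pedersen commitment shows the commit-challenge-response shape that all of these proofs share. A sketch with toy parameters (the modulus and generators are illustrative):

```python
import secrets

P = 2**127 - 1  # toy Mersenne-prime modulus
Q = P - 1       # exponents live modulo the group order
G, H = 3, 7     # stand-in generators (toy choice)

def commit(v, r):
    return (pow(G, v, P) * pow(H, r, P)) % P

# Prover knows an opening (v, r) of the public commitment C.
v, r = 1234, secrets.randbelow(Q)
C = commit(v, r)

# Commit phase: random masks.
a, b = secrets.randbelow(Q), secrets.randbelow(Q)
t = commit(a, b)

# Challenge phase (public coin).
e = secrets.randbelow(Q)

# Response phase.
z1 = (a + e * v) % Q
z2 = (b + e * r) % Q

# Verifier's check: g^z1 * h^z2 == t * C^e
lhs = commit(z1, z2)
rhs = (t * pow(C, e, P)) % P
print(lhs == rhs)  # True
```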

Applying to Circuits

Since arithmetic circuits consist entirely of addition and multiplication gates, the cost of commit-and-prove is determined by the multiplication count M: the prover commits to every wire value, addition gates are verified for free via homomorphism, and each multiplication gate requires one Σ-protocol (~3 group elements). The proof contains O(M) group elements (one Σ-protocol transcript per multiplication gate), and verification requires O(M) group exponentiations, each costing O(λ) field multiplications for security parameter λ.

This is not succinct. A circuit with a million multiplications produces a proof with millions of group elements. But it achieves perfect zero-knowledge: the simulator can produce indistinguishable transcripts by simulating each Σ-protocol independently.

Recovering Succinctness: Proof on a Proof

Commit-and-prove costs one Σ-protocol per multiplication gate, so it is impractical for large circuits. But it does not need to be applied to the original circuit. The idea is to split the work into two layers:

  1. First layer (not zero-knowledge). Run an efficient interactive proof, such as GKR (Chapter 7), on the original circuit C. GKR is sound and succinct: the verifier's work is polylogarithmic in |C|. The protocol produces a transcript consisting of prover messages, verifier challenges, and a final evaluation claim. This transcript is not zero-knowledge; the prover's messages leak witness information.

  2. Second layer (zero-knowledge). The GKR verifier is itself a computation: given the transcript, check consistency and output accept or reject. Express this verification as a small circuit C_V, of size polylogarithmic in |C|. Now apply commit-and-prove to C_V: the prover commits to all transcript values (which include the witness-derived quantities), then proves via Σ-protocols that these commitments would make C_V accept.

This is the "proof on a proof": the first layer proves correctness (via GKR), the second layer proves that the first layer's transcript is valid without revealing it (via commit-and-prove on the small verifier circuit). The cost of the second layer depends on C_V's multiplication count, which is polylogarithmic in |C|, not on |C| itself.

The key detail is the structure of C_V. Recall from Chapter 7 that GKR verification consists mostly of sum-check consistency checks (pure additions, which Pedersen homomorphism handles for free). The only multiplications arise at layer boundaries, where the verifier checks an equation involving the product of two sub-circuit evaluations: one multiplication per layer. A circuit of depth d thus requires only O(d) Σ-protocols in the second layer, not the O(M) that direct commit-and-prove on the original circuit would require.

The verifier sees the public inputs and outputs (part of the statement), Pedersen commitments to all transcript values, and Σ-protocol proofs that the committed values satisfy GKR verification. The witness is still encoded in the transcript coefficients (the chain is witness → transcript → commitments), but the commitments are perfectly hiding. The Σ-protocols prove only structural facts about these coefficients (that they satisfy certain arithmetic relations), never semantic facts (what they represent in the original computation). Every valid witness producing the same output yields commitments with the same distribution, so the verifier cannot distinguish which was used.

Technique 2: Masking Polynomials

Commit-and-prove hides values behind commitments and proves relations one at a time. This is general but expensive: the cost scales with the number of multiplications. For polynomial-based protocols like sum-check, a lighter approach exists: instead of hiding each value individually, randomize the polynomial itself so that the values the verifier sees carry no information about the witness.

The Core Idea

Whenever a protocol requires the prover to send a polynomial g derived from the witness (as sum-check does with its round polynomials), the prover can mask it. Instead of sending g directly, the prover sends:

    g + ρ·q

where q is a random polynomial (committed in advance) and ρ is a random scalar from the verifier.

Since q is random and ρ is chosen after the commitment, ρ·q acts like a one-time pad: the verifier sees g + ρ·q but cannot extract g without knowing q.

The natural concern is soundness. The original protocol verified Σ_{x∈{0,1}^n} g(x) = H; now the verifier sees g + ρ·q instead. The masked sum is:

    Σ_{x∈{0,1}^n} (g + ρ·q)(x) = H + ρ·Q

where Q = Σ_{x∈{0,1}^n} q(x) is sent alongside the commitment to q. The verifier checks the masked sum against H + ρ·Q. For a false claim H′ ≠ H to pass, the prover would need H′ + ρ·Q = H + ρ·Q, which implies H′ = H. Masking is a soundness-preserving transformation: it changes the representation but not the truth value.

Constructing the Masking Polynomial

The masking polynomial q must have the same degree structure as g (otherwise g + ρ·q would fail the verifier's degree checks), a known aggregate sum Q (so the verifier can adjust the check), and genuinely random coefficients (so the masking actually hides g).

Protocol flow:

  1. Before the main protocol, the prover commits to a random masking polynomial q and sends its sum Q = Σ_{x∈{0,1}^n} q(x).
  2. The verifier sends a random ρ.
  3. The prover runs sum-check on g + ρ·q, sending masked round polynomials.
  4. The verifier checks that round polynomials sum correctly to H + ρ·Q (the adjusted claim).

The verifier sees g + ρ·q and knows Q with Σ_x q(x) = Q, but only has a commitment to q, not q itself. For any polynomial g* the verifier might guess for g, there exists a q* consistent with the observed g + ρ·q. This is the polynomial one-time pad: the random masking makes all witness polynomials equally plausible. In the multivariate case, the prover commits to a masking polynomial q with the same structure as g, and each sum-check round polynomial derived from g + ρ·q is masked.
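The adjusted-claim bookkeeping can be checked numerically. A minimal univariate sketch over a toy prime field, with polynomials as coefficient lists (the field size and helper names are illustrative):

```python
import secrets

P = 2**61 - 1  # toy prime field

def eval_poly(coeffs, x):
    acc = 0
    for c in reversed(coeffs):
        acc = (acc * x + c) % P
    return acc

def hypercube_sum(coeffs):
    # Sum over x in {0, 1} for this one-variable toy.
    return (eval_poly(coeffs, 0) + eval_poly(coeffs, 1)) % P

g = [5, 7, 11]                                # stand-in witness polynomial
q = [secrets.randbelow(P) for _ in range(3)]  # masking polynomial, same degree
H = hypercube_sum(g)                          # claimed sum
Q = hypercube_sum(q)                          # sent alongside the commitment to q

rho = secrets.randbelow(P)                    # verifier's challenge, after q is fixed
masked = [(gc + rho * qc) % P for gc, qc in zip(g, q)]

# The masked polynomial sums to the adjusted claim H + rho*Q.
print(hypercube_sum(masked) == (H + rho * Q) % P)  # True
```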

Masking the Final Evaluation

The round polynomials are now hidden, but there is a remaining leak. At the end of sum-check, the prover must open g at the random evaluation point r = (r_1, …, r_n) (typically via a polynomial commitment). This final evaluation g(r) is a deterministic function of the witness and reveals information about it.

To handle this, the prover adds random terms that vanish on the Boolean hypercube but randomize evaluations outside it. Instead of committing to g directly, the prover commits to a randomized version:

    ĝ(x) = g(x) + Σ_{i=1}^{n} c_i·x_i(x_i − 1)

where c_1, …, c_n are random field elements chosen by the prover and never revealed. Since x_i(x_i − 1) = 0 when x_i ∈ {0, 1}, we have ĝ = g on the Boolean hypercube: correctness is preserved, and the sum is unchanged. But at a random point r (where the verifier queries after sum-check), the evaluation becomes ĝ(r) = g(r) + Σ_i c_i·r_i(r_i − 1). The verifier sees ĝ(r) via the commitment opening but does not know the c_i, so it cannot recover g(r).

Worked example. Let g(x) = w_1 + w_2·x (a single-variable polynomial encoding witness data). Randomize with a random scalar c:

    ĝ(x) = g(x) + c·x(x − 1)

On the hypercube: ĝ(0) = g(0) and ĝ(1) = g(1). At a random point r: g(r) = w_1 + w_2·r (would leak), but ĝ(r) = g(r) + c·r(r − 1) (masked). Different c values produce different evaluations at r, hiding g(r).
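The worked example can be verified directly over a toy prime field (the specific witness values and field are illustrative):

```python
import secrets

P = 2**61 - 1             # toy prime field
w1, w2 = 1234, 5678       # witness coefficients of g(x) = w1 + w2*x
c = secrets.randbelow(P)  # prover's secret masking scalar

def g(x):
    return (w1 + w2 * x) % P

def g_hat(x):
    # g(x) + c*x*(x-1): agrees with g on {0, 1}, randomized elsewhere.
    return (g(x) + c * x * (x - 1)) % P

# On the hypercube the mask vanishes...
print(g_hat(0) == g(0), g_hat(1) == g(1))  # True True

# ...but at a random query point the evaluation is shifted by c*r*(r-1).
r = secrets.randbelow(P)
print(g_hat(r) == (g(r) + c * r * (r - 1)) % P)  # True
```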

For an n-variable polynomial, the n random scalars c_1, …, c_n provide enough entropy: the verifier learns one evaluation ĝ(r), which for uniform c_i is uniformly distributed over the field regardless of g (as long as some r_i lies outside {0, 1}). A simulator who does not know g can produce identically distributed evaluations by choosing random c_i.

The Simulator

Chapter 17 defined zero-knowledge via simulation: a proof system is ZK if an efficient simulator, given only the public statement, can produce transcripts indistinguishable from real executions. Chapter 17 also observed that vanilla sum-check fails this test: the round polynomials are deterministic functions of the witness, and no simulator can produce them without it. Masking repairs this. To close the loop, we construct the simulator explicitly and verify indistinguishability.

Real protocol:

  1. Prover commits to a random masking polynomial q
  2. Verifier sends a random ρ
  3. Parties execute sum-check on g + ρ·q
  4. Prover opens g + ρ·q and q at the random point r

Simulator (no access to the witness, and therefore no access to g):

  1. Commit to a random polynomial q
  2. Choose a random polynomial s of the same degree as g
  3. Execute sum-check on s + ρ·q
  4. Open s + ρ·q and q

The simulator replaces g with a random polynomial s. The question is whether the verifier can tell the difference.

The verifier's view in both cases consists of: a commitment to q, sum-check round messages derived from g + ρ·q or s + ρ·q, the scalar ρ, and the final evaluations. The commitment to q is a random group element in both cases (Pedersen hiding). The round messages are derived from g + ρ·q (real) or s + ρ·q (simulated). Since q is uniformly random and independent of g, and ρ is chosen after q is committed, the polynomial ρ·q is uniformly distributed among polynomials of its degree (whenever ρ ≠ 0). Adding a uniform polynomial to any fixed g produces a uniform result, just as adding a uniform field element to any fixed value produces a uniform field element. The distribution of g + ρ·q depends only on the randomness in q and ρ, not on g. The same holds for s + ρ·q. The two distributions are identical.
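The one-time-pad step of this argument can be checked exhaustively in a tiny field: for any fixed coefficient and any nonzero challenge, masking is a bijection, so a uniform mask yields a uniform masked value. A sketch in F_13 (the field size is an arbitrary toy choice):

```python
# For every fixed g-coefficient g_c and every nonzero rho, the map
# q_c -> g_c + rho*q_c is a bijection on F_13, so the masked coefficient
# is uniformly distributed whenever q_c is.
P = 13
ok = all(
    {(g_c + rho * q_c) % P for q_c in range(P)} == set(range(P))
    for g_c in range(P)
    for rho in range(1, P)
)
print(ok)  # True
```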

For the final evaluation: the verifier learns (g + ρ·q)(r) and q(r) (real) or (s + ρ·q)(r) and q(r) (simulated). Since q(r) is uniformly random, (g + ρ·q)(r) is uniform over the field. The masking of the final evaluation (via the c_i·x_i(x_i − 1) terms from the previous section) ensures that the opened value of g itself is also uniformly distributed from the verifier's perspective. Both views are identically distributed, completing the simulation argument.

Comparing the General Techniques

Aspect          Commit-and-Prove                         Masking Polynomials
Generality      Works for any public-coin protocol       Specialized for polynomial protocols
Overhead        O(M) Σ-protocols for M multiplications   O(1) additional commitments
Succinctness    Requires "proof on a proof"              Naturally preserves succinctness
Post-quantum    No (relies on discrete log)              Yes (with hash-based PCS)
Complexity      Conceptually straightforward             Requires algebraic design

The cost difference reflects a difference in abstraction level. Commit-and-prove works on scalars: each field element gets its own commitment, and relations are proved one at a time. Masking polynomials works on functions: a single random polynomial masks all coefficients at once. Hiding n scalars requires n commitments; hiding an n-coefficient polynomial requires one random polynomial. The jump from scalar to function is what makes masking efficient for polynomial-based protocols.

Most production systems use masking for the main protocol body and commit-and-prove for auxiliary statements (range proofs, committed value equality, etc.).

A third approach, developed in the HyperNova paper (Kothapalli, Setty, Tzialla, 2023), sidesteps both techniques entirely. Instead of masking round polynomials or wrapping each check in a Σ-protocol, the prover replaces every sum-check message with a Pedersen commitment and then proves, via Nova folding, that the committed values satisfy the verifier's checks. The folding step acts as an algebraic one-time pad: the real witness is combined with a random satisfying instance, producing a folded witness that is uniformly distributed regardless of the original. The cost is roughly 3 KB of additional proof data and negligible prover overhead. This technique, called BlindFold, is what made production zkVMs (notably Jolt) genuinely zero-knowledge. Chapter 22 develops it in full after introducing the folding machinery it depends on.

Zero-Knowledge in Practice

The general techniques above provide the conceptual foundation, but production systems do not apply them directly. Groth16 and PLONK each exploit their own algebraic structure to achieve zero-knowledge more efficiently. The underlying principle is the same (randomize what the verifier sees while preserving what the verifier checks), but the mechanisms are system-specific.

Groth16

Recall from Chapter 12 that the prover constructs QAP polynomials encoding the witness, evaluates them at the secret point τ from the structured reference string, and packages the results as three group elements A, B, C. The polynomials never leave the prover; the verifier sees only these group elements. Without additional randomization, however, the proof elements are deterministic functions of the witness: same witness, same proof. An observer comparing two proofs could detect whether they use the same witness.

Groth16 addresses this the same way Pedersen commitments hide a message: by adding randomness in the exponent. The prover samples fresh scalars r and s and incorporates them into the proof elements, making (A, B, C) uniformly distributed while preserving the pairing verification equation.

The Blinding Mechanism

Concretely, the prover samples random r, s ← F and incorporates them:

    A = α + Σ_i a_i·u_i(τ) + r·δ
    B = β + Σ_i a_i·v_i(τ) + s·δ

The r·δ and s·δ terms add randomness. But where do they go? They'd break the verification equation unless compensated. The construction of C absorbs them (the first sum runs over witness indices only):

    C = (Σ_i a_i·(β·u_i(τ) + α·v_i(τ) + w_i(τ)) + h(τ)·t(τ)) / δ + s·A + r·B − r·s·δ

The terms s·A, r·B, and −r·s·δ in C exactly cancel the cross-terms that appear when expanding e(A, B).

Why This Works

The verification equation checks:

    e(A, B) = e(α, β) · e(L_pub, γ) · e(C, δ)

where L_pub aggregates the public-input terms.

Expanding e(A, B) with the blinding in place:

    A·B = (α + Σ_i a_i·u_i(τ) + r·δ) · (β + Σ_i a_i·v_i(τ) + s·δ)

The exponent expands to include cross-terms: s·δ·α, s·δ·Σ_i a_i·u_i(τ), r·δ·β, r·δ·Σ_i a_i·v_i(τ), and r·s·δ².

The construction of C is designed so that, when paired with δ, it produces exactly these cross-terms (plus the core QAP check). Everything cancels except the QAP identity (Σ_i a_i·u_i(τ))·(Σ_i a_i·v_i(τ)) = Σ_i a_i·w_i(τ) + h(τ)·t(τ).
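The cancellation is ordinary field arithmetic in the exponent, so it can be spot-checked with random stand-in values. Below, A0 and B0 play the unblinded exponents of A and B; no pairings are involved, only the exponent-level identity (all values illustrative):

```python
import secrets

P = 2**61 - 1              # toy prime standing in for the scalar field
A0 = secrets.randbelow(P)  # unblinded exponent of A
B0 = secrets.randbelow(P)  # unblinded exponent of B
delta = secrets.randbelow(P)
r, s = secrets.randbelow(P), secrets.randbelow(P)

A = (A0 + r * delta) % P
B = (B0 + s * delta) % P
# The compensating terms Groth16 folds into C:
comp = (s * A + r * B - r * s * delta) % P

# The cross-terms introduced by blinding are exactly absorbed:
lhs = (A * B) % P
rhs = (A0 * B0 + delta * comp) % P
print(lhs == rhs)  # True
```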

Different (r, s) produce different valid proofs for the same witness, making proofs for distinct witnesses indistinguishable from proofs for the same witness with different randomness. Note that the blinding depends on δ: the prover computes r·δ as r times the proving key's encoding of δ, without ever knowing δ as a field element. The setup secret δ is required by the mechanism.

PLONK

PLONK's approach is closer to masking polynomials than to Groth16's element-level randomization. Recall from Chapter 13 that PLONK encodes constraints as polynomial identities that must hold on a multiplicative subgroup H. The prover commits to witness polynomials whose values on H encode the wire assignments. After Fiat-Shamir, the verifier queries these polynomials at a random point ζ outside H to check the constraints.

The separation between the constraint domain (H) and the query point (ζ) is what PLONK exploits for zero-knowledge. Unlike Groth16, which randomizes proof elements, PLONK randomizes the polynomials themselves before committing: add a random multiple of the vanishing polynomial Z_H(X) = X^|H| − 1, which is zero exactly on H. The committed polynomial agrees with the original where constraints are checked but is randomized where the verifier queries.

The Vanishing Polynomial Trick

Concretely, to blind a witness polynomial w(X), add a random degree-1 polynomial times Z_H(X):

    ŵ(X) = w(X) + (b_1·X + b_2)·Z_H(X)

where b_1, b_2 are random field elements.

On the constraint-check domain:

    ŵ(x) = w(x) + (b_1·x + b_2)·0 = w(x)  for all x ∈ H

Why a polynomial, not just a scalar? The verifier queries at a random point ζ, receiving ŵ(ζ) = w(ζ) + (b_1·ζ + b_2)·Z_H(ζ). A single scalar b would add the fixed value b·Z_H(ζ), which might not provide enough entropy depending on what else the verifier learns. Using b_1·X + b_2 ensures sufficient randomness for the simulation argument.
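The trick is easy to verify in a toy field. Below, H is the order-4 subgroup of F_17 generated by ω = 4, so Z_H(x) = x⁴ − 1; the witness polynomial and all parameters are illustrative:

```python
import secrets

P = 17
OMEGA = 4
H = [pow(OMEGA, i, P) for i in range(4)]  # [1, 4, 16, 13]

def Z_H(x):
    # Vanishing polynomial of H: zero exactly on the subgroup.
    return (pow(x, 4, P) - 1) % P

b1, b2 = secrets.randbelow(P), secrets.randbelow(P)

def w(x):
    # Stand-in witness polynomial.
    return (3 + 5 * x + 2 * x * x) % P

def w_blinded(x):
    return (w(x) + (b1 * x + b2) * Z_H(x)) % P

# Agrees with w everywhere constraints are checked...
print(all(w_blinded(x) == w(x) for x in H))  # True

# ...but at a query point outside H the evaluation is shifted.
zeta = 2  # 2 is not in H
print(w_blinded(zeta) == (w(zeta) + (b1 * zeta + b2) * Z_H(zeta)) % P)  # True
```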

Blinding the Accumulator

PLONK's permutation argument uses an accumulator polynomial z(X) that tracks whether wire values are correctly copied. This polynomial also reveals structure.

The accumulator is checked at two points: ζ and ω·ζ (the "shifted" evaluation). To mask both, use three fresh random scalars:

    ẑ(X) = z(X) + (b_3·X² + b_4·X + b_5)·Z_H(X)

The boundary condition z(1) = 1 and the recursive multiplicative relation are preserved on H. Outside H, both ẑ(ζ) and ẑ(ω·ζ) are randomized.

The same blinding applies to every polynomial PLONK commits to, including the quotient polynomial (which is split into pieces for degree reasons, each blinded independently).

The Unifying Principle

Despite different mechanisms, Groth16 and PLONK follow the same pattern: find the null space of the verification procedure (transformations that don't affect the outcome) and inject randomness there. In Groth16, the null space is the set of (r, s) shifts that cancel in the pairing. In PLONK, it is the space of multiples of Z_H, which vanish on the constraint domain. This connects directly to the simulation paradigm from Chapter 17: the simulator can produce valid-looking transcripts because it can choose randomness within this null space.

Key Takeaways

  1. Every ZK technique finds the null space of the verification procedure and injects randomness there. Transformations that don't affect the verification outcome are the prover's freedom. Soundness constrains what the prover can do; zero-knowledge exploits what the prover is free to randomize.

  2. Commit-and-prove is general but expensive. It works for any public-coin protocol by hiding values behind Pedersen commitments and proving relations via Σ-protocols. Cost scales with multiplication count (one Σ-protocol per multiplication), but the "proof on a proof" trick recovers succinctness by applying commit-and-prove to the verifier circuit instead of the original computation.

  3. Masking polynomials are efficient but specialized. For polynomial-sending protocols like sum-check, adding ρ·q (a committed random polynomial q scaled by a verifier challenge ρ) acts as a one-time pad. Succinctness is preserved naturally. Final evaluations require separate treatment via terms like c_i·x_i(x_i − 1) that vanish on the Boolean hypercube.

  4. Production systems tailor the null space to their algebraic structure.

    • Groth16: fresh scalars r, s shift the proof elements; C absorbs the cross-terms so the pairing equation is unchanged.
    • PLONK: random multiples of Z_H(X) vanish on the constraint domain but randomize evaluations at the query point ζ.
  5. Production systems blend approaches. Masking handles the core polynomial protocol. Commit-and-prove handles auxiliary statements (range proofs, committed value equality). BlindFold (Chapter 22) offers a third path via folding.