ZK Light Clients and Non-Native Field Costs in Cross-Chain Verification

The expensive part is not the Merkle proof

The hard part of cross-chain state verification is not proving that a message is included under a Merkle root. The hard part is proving that the root itself was accepted by the source chain’s consensus. If the destination chain is EVM and the source chain is Tendermint, Solana, NEAR, or another system with Ed25519 signatures, different hash functions, different state trees, and different finality semantics, the destination contract has only a few options. It can reimplement the source-chain light client directly onchain, trust an external validator committee, or verify a zero-knowledge proof that compresses the source-chain light-client transition.

ZK light clients exist for that third path. They move signature verification, validator-set updates, header verification, and application-root checks into an offchain proving program. The destination chain verifies one succinct proof and stores the resulting trusted root. For cross-chain swaps, this is the difference between “a relayer said this message exists” and “a proof shows that the source chain accepted the state containing this message.”

But ZK does not make the cost disappear. It changes where the cost is paid. The real bottleneck is often non-native field arithmetic. Ed25519, BLS12-381, BN254, Goldilocks, BabyBear, Keccak, Poseidon, Tendermint validator hashes, and Solana’s account model do not live in one friendly algebraic universe. When a proving system’s base field does not match the source chain’s cryptography, every foreign-field operation becomes limbs, range checks, carries, modular reduction, and many more constraints than the original operation suggests.

For an AllSwap-style routing layer, this matters directly. Proof strength affects quote quality, refund attribution, finality risk, destination execution timing, and whether a route can be used for large swaps without hiding trust assumptions behind a fast user interface.

System model: the destination chain accepts a compressed consensus fact

The model here is cross-heterogeneous-chain state verification. It is not a custodial bridge, a price oracle, or a pure multisig relay. There are four state layers:

- Source chain `S`: produces headers, validator signatures, state roots, and finality evidence.
- Prover `P`: tracks source-chain headers, executes the light-client verification program, and generates a ZK proof.
- Destination verifier `V`: verifies the proof and updates a trusted source-chain state on the destination chain.
- Application contract `A`: uses the trusted state to verify message membership, asset release, or refund conditions.

The destination chain should not need to understand every source-chain rule. It should accept a narrowly defined statement:

`VerifyZK(proof, publicInputs) = true`

The public inputs should bind at least `oldRoot`, `newRoot`, `sourceHeight`, `validatorSetHash`, `appStateRoot`, and the relevant `messageCommitment`. If the proof is valid, the destination chain learns that the new source-chain state follows from a previously trusted state under the source chain’s consensus rules, and that the application message is committed under that state.
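One way to picture this binding is a single commitment over all public inputs, so that a proof checked against one tuple cannot be reinterpreted against another. The sketch below is illustrative only: the `PublicInputs` class, the `commit_public_inputs` helper, the domain tag, the use of SHA-256, and the byte widths are assumptions, not the encoding of any particular light client.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class PublicInputs:
    # Field names mirror the list above; byte widths are assumptions.
    old_root: bytes
    new_root: bytes
    source_height: int
    validator_set_hash: bytes
    app_state_root: bytes
    message_commitment: bytes


def commit_public_inputs(pi: PublicInputs, domain: bytes = b"zk-light-client/v1") -> bytes:
    """Hash every public input into one 32-byte commitment.

    Each field is length-prefixed so that two different tuples cannot
    serialize to the same byte string. The domain tag is hypothetical.
    """
    h = hashlib.sha256()
    fields = [
        domain,
        pi.old_root,
        pi.new_root,
        pi.source_height.to_bytes(8, "big"),
        pi.validator_set_hash,
        pi.app_state_root,
        pi.message_commitment,
    ]
    for item in fields:
        h.update(len(item).to_bytes(4, "big"))
        h.update(item)
    return h.digest()
```

Changing any single field, or the domain tag, yields a different commitment, which is the property the verifier contract relies on when it recomputes the commitment from its own stored state.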

This still inherits the source chain’s assumptions. If the source chain finalizes a bad state because its validator threshold is compromised, the ZK proof will faithfully prove that bad consensus fact. If the prover network stops producing proofs, the destination chain does not progress. If the public inputs do not bind the chain ID, height, root, message path, and recipient correctly, the proof may be mathematically valid while useless or dangerous for the application.

ZK light clients reduce onchain verification cost. They do not remove finality, liveness, data availability, or circuit-correctness problems.

Why non-native fields inflate constraints

Most proof systems operate over a fixed finite field `F_p`. If the circuit must verify arithmetic over another field `F_q`, with `p != q`, the foreign element must be represented as several limbs inside `F_p`. A 255-bit Ed25519 base-field element, for example, may be split into 51-bit or 64-bit chunks. The circuit then proves that addition, multiplication, carry propagation, and reduction modulo `q` were all done correctly.

A single multiplication `z = x * y mod q` can be written as:

`x = sum(x_i * 2^{w*i})`

`y = sum(y_i * 2^{w*i})`

`x * y = z + k * q`

The limbs `x_i`, `y_i`, `z_i`, and `k_i` all need range constraints, where `z` and the quotient `k` are decomposed into limbs the same way as `x` and `y`. The product terms require carry constraints. A naive foreign-field multiplication over `m` limbs has `O(m^2)` partial products before reduction. Signature verification is far worse than one multiplication. Ed25519 verification involves scalar multiplication, point addition, hash challenges, modular arithmetic, and equality checks. Every one of those operations becomes a constraint system inside the proof.
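The relation `x * y = z + k * q` can be sketched with plain Python integers standing in for circuit variables. This is a minimal illustration, assuming 64-bit limbs and the Ed25519 base-field modulus; the helper names are hypothetical. A real circuit would additionally need to bound each column so that it cannot wrap around the native field `F_p`, which Python's unbounded integers hide.

```python
Q = 2**255 - 19   # Ed25519 base-field modulus
W = 64            # limb width in bits (illustrative choice)
M = 4             # four 64-bit limbs cover a 255-bit value


def to_limbs(value: int, count: int = M) -> list[int]:
    """Split a non-negative integer into `count` little-endian W-bit limbs."""
    if value < 0 or value >> (W * count):
        raise ValueError("value does not fit in the limb layout")
    mask = (1 << W) - 1
    return [(value >> (W * i)) & mask for i in range(count)]


def from_limbs(limbs: list[int]) -> int:
    return sum(limb << (W * i) for i, limb in enumerate(limbs))


def mul_witness(x: int, y: int) -> tuple[int, int]:
    """Prover side: return (z, k) with x * y = z + k * Q and 0 <= z < Q."""
    k, z = divmod(x * y, Q)
    return z, k


def columns(a: list[int], b: list[int]) -> list[int]:
    """Schoolbook partial products grouped by limb position (m^2 terms)."""
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out


def check_mul(x_l: list[int], y_l: list[int], z_l: list[int], k_l: list[int]) -> bool:
    """Checks the circuit would enforce: ranges, columns, carries, z < Q."""
    for limbs in (x_l, y_l, z_l, k_l):
        if len(limbs) != M or any(not 0 <= limb < (1 << W) for limb in limbs):
            return False  # range constraint violated
    lhs = columns(x_l, y_l)
    rhs = columns(k_l, to_limbs(Q))
    for i, zi in enumerate(z_l):
        rhs[i] += zi
    carry = 0
    for left, right in zip(lhs, rhs):
        total = left - right + carry
        if total % (1 << W):
            return False  # column does not cancel modulo 2^W
        carry = total >> W
    return carry == 0 and from_limbs(z_l) < Q
```

Even this one multiplication touches 16 partial products for `x * y`, another 16 for `k * q`, a range check per limb, and a carry per column, which is why a full signature verification expands so quickly.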

This is why discussions about bringing IBC to Ethereum using ZK-SNARKs focus heavily on Ed25519 verification cost. Directly verifying many Tendermint signatures inside the EVM is not a practical path. The zkBridge paper takes the opposite approach: execute the expensive consensus checks offchain and verify a succinct proof onchain. Electron Labs, Succinct’s SP1 Tendermint example, and ZK-IBC work follow the same broad architecture: make the light-client update a provable program rather than a Solidity implementation of foreign cryptography.

The engineering cost is not determined only by the number of signatures. It is roughly shaped by `signature count * elliptic-curve operations per signature * limb constraints per operation`, plus hash constraints, validator-weight accounting, membership proofs, and public-input binding. A validator set with 150 validators does not only create a 150-signature problem. It also creates a selection, bitmap, weight-sum, validator-set-hash, and update-transition problem.
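A back-of-the-envelope version of that product shows how limb width feeds into the total. The function names are hypothetical, and `field_muls_per_signature` is a placeholder the reader must supply from a real implementation; the value 1000 in the usage note is not a measurement.

```python
import math


def limbs_needed(field_bits: int, limb_width: int) -> int:
    """Number of limbs required to hold a field element."""
    return math.ceil(field_bits / limb_width)


def partial_products(limbs: int) -> int:
    """Schoolbook multiplication over m limbs needs m * m partial products."""
    return limbs * limbs


def rough_partial_products(
    signature_count: int,
    field_muls_per_signature: int,  # placeholder, depends on the circuit
    limb_width: int,
    field_bits: int = 255,
) -> int:
    """Signature count x multiplications per signature x limb products."""
    m = limbs_needed(field_bits, limb_width)
    return signature_count * field_muls_per_signature * partial_products(m)
```

With 150 signatures and the placeholder of 1000 multiplications each, 64-bit limbs give `rough_partial_products(150, 1000, 64) == 2_400_000`, while 51-bit limbs give `3_750_000`. Wider limbs reduce the number of partial products but usually need larger range checks and leave less overflow headroom in the native field, so the count alone does not settle the choice.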

Small design choices matter. A poor limb width, missing lookup tables for range checks, unfriendly hash functions, or excessive recursion can move proof generation from operationally acceptable to unusable.

Tendermint and IBC are the canonical stress test

Tendermint light-client verification is not “see a header and trust it.” It checks whether a new header has enough validator voting power, whether validator-set changes are valid, and whether the update is inside the trusting period. The IBC ICS-07 specification models the Tendermint client as client state plus consensus state. The CometBFT light-client verification specification emphasizes that a light client cannot trust full nodes. It must verify the commit and the validator-set relationship itself.
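The voting-power part of that check can be sketched as follows. This is a simplified illustration: the `Validator` type and function names are hypothetical, and the bitmap is assumed to mark validators whose signatures have already been verified elsewhere. The threshold follows the Tendermint rule that signed power must strictly exceed two thirds of the total.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Validator:
    address: bytes
    voting_power: int


def signed_power(validators: Sequence[Validator], signed: Sequence[bool]) -> int:
    """Sum the voting power of validators flagged in the bitmap."""
    if len(validators) != len(signed):
        raise ValueError("bitmap length must match validator set size")
    return sum(v.voting_power for v, ok in zip(validators, signed) if ok)


def has_quorum(validators: Sequence[Validator], signed: Sequence[bool]) -> bool:
    """True when signed power is strictly greater than 2/3 of total power.

    Integer cross-multiplication avoids rounding errors from division.
    """
    total = sum(v.voting_power for v in validators)
    return 3 * signed_power(validators, signed) > 2 * total
```

Inside a circuit, the same logic needs the bitmap, each weight, and the running sum to be constrained, including overflow bounds on the accumulated power.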

This is already nontrivial in a native environment. It is much harder on EVM.

The first mismatch is the signature layer. Cosmos-style systems commonly involve Ed25519 or secp256k1 signatures, while Ethereum execution historically did not provide an Ed25519 precompile. Implementing Ed25519 in Solidity or raw EVM bytecode costs too much gas to be practical for large validator sets.

The second mismatch is validator-set dynamics. A light client must prove that `validatorSet_t` can update to `validatorSet_{t+1}` under the source chain’s rules. A static threshold multisig is not enough. The proof must bind validator addresses, voting power, signatures, header commitments, and the new validator-set hash to the same transition.

The third mismatch is application-state coupling. A bridge or swap application does not only need a valid consensus header. It needs a message under the application root. The proof path has two layers: first, prove that the root was accepted by consensus; second, prove that the message exists under that root. Many bridge failures are not Merkle-proof failures. They happen because the system accepts the wrong root or trusts the wrong updater.
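The two layers can be sketched with a generic binary Merkle tree. This is not the IAVL or ICS-23 proof format used by Cosmos chains; the tree shape, the SHA-256 hashing, the leaf and node prefixes, and all function names are assumptions chosen to keep the example short. Duplicating the last node on odd levels is a simplification that production trees avoid.

```python
import hashlib
from typing import List, Set, Tuple

Proof = List[Tuple[bytes, bool]]  # (sibling hash, sibling is on the left)


def leaf_hash(message: bytes) -> bytes:
    return hashlib.sha256(b"\x00" + message).digest()


def node_hash(left: bytes, right: bytes) -> bytes:
    return hashlib.sha256(b"\x01" + left + right).digest()


def _pad(level: List[bytes]) -> List[bytes]:
    return level + [level[-1]] if len(level) % 2 else level


def _parents(level: List[bytes]) -> List[bytes]:
    return [node_hash(level[i], level[i + 1]) for i in range(0, len(level), 2)]


def merkle_root(leaves: List[bytes]) -> bytes:
    if not leaves:
        raise ValueError("at least one leaf is required")
    level = [leaf_hash(x) for x in leaves]
    while len(level) > 1:
        level = _parents(_pad(level))
    return level[0]


def merkle_proof(leaves: List[bytes], index: int) -> Proof:
    level = [leaf_hash(x) for x in leaves]
    proof: Proof = []
    while len(level) > 1:
        level = _pad(level)
        sibling = index ^ 1
        proof.append((level[sibling], sibling < index))
        level = _parents(level)
        index //= 2
    return proof


def verify_membership(root: bytes, message: bytes, proof: Proof) -> bool:
    acc = leaf_hash(message)
    for sibling, sibling_is_left in proof:
        acc = node_hash(sibling, acc) if sibling_is_left else node_hash(acc, sibling)
    return acc == root


def release_allowed(trusted_roots: Set[bytes], root: bytes, message: bytes, proof: Proof) -> bool:
    # Layer 1: the root must already be accepted via consensus verification.
    # Layer 2: the message must be a member under that root.
    return root in trusted_roots and verify_membership(root, message, proof)
```

A membership proof that verifies against an untrusted root fails the first layer, which is the case the paragraph above singles out.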

This makes Tendermint and IBC a good stress test for ZK light clients. The circuit must not merely prove that a commit is valid in isolation. It must prove that the commit advances from the destination chain’s already trusted state.

A ZK light-client state machine

A practical ZK light-client flow can be described as:

1. `sync_header(h)` fetches the source header, commit, validator set, next validator set, and application root at height `h`.
2. `check_trusting_period()` proves the old trusted state can still be updated safely.
3. `verify_commit()` verifies signatures and checks that signed voting power passes the threshold.
4. `verify_validator_update()` proves that the next validator-set hash matches the header commitment.
5. `verify_message_membership()` proves that the cross-chain message exists under the application root.
6. `prove_transition()` outputs the proof and public inputs.
7. `update_client()` verifies the proof on the destination chain and stores the new trusted height and root.

The destination contract can keep a small state:

`trustedHeight`

`trustedHeaderHash`

`trustedValidatorSetHash`

`trustedAppRoot`

`consumedMessageRoot`

This is much lighter than a full onchain light client, but the complexity has not vanished. It has moved into the proving program and its circuit. The program must implement signatures, hashes, weight accumulation, validator-set updates, and membership checks correctly. Any under-constrained variable can turn an invalid state transition into a valid proof.

The destination verifier also needs rollback protection. If a new proof does not advance from the currently trusted state, the contract should reject it. If the system supports multiple forks, the application must define which fork can authorize asset release. For cross-chain swaps, the simplest safe rule is usually monotonic client updates plus one-time message consumption: update the trusted root, verify membership, mark the message consumed, then execute release or refund logic.
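The monotonic-update and one-time-consumption rule can be modeled in a few lines. This is a sketch of the control flow only: `DestinationVerifier`, `ClientState`, and the injected `verify_zk` callback are hypothetical, the proof check itself is stubbed out, and consumed messages are kept in a plain set rather than a committed root.

```python
from dataclasses import dataclass, field
from typing import Callable, Set, Tuple

PublicInputs = Tuple[bytes, bytes, int]  # (old_root, new_root, new_height)


@dataclass
class ClientState:
    trusted_height: int
    trusted_app_root: bytes
    consumed_messages: Set[bytes] = field(default_factory=set)


class DestinationVerifier:
    def __init__(self, state: ClientState, verify_zk: Callable[[bytes, PublicInputs], bool]):
        self.state = state
        self.verify_zk = verify_zk  # stand-in for the real proof verifier

    def update_client(self, proof: bytes, old_root: bytes, new_root: bytes, new_height: int) -> None:
        # The proof must extend the currently trusted state.
        if old_root != self.state.trusted_app_root:
            raise ValueError("proof does not start from the trusted root")
        # Rollback protection: heights only move forward.
        if new_height <= self.state.trusted_height:
            raise ValueError("height must strictly increase")
        if not self.verify_zk(proof, (old_root, new_root, new_height)):
            raise ValueError("proof rejected")
        self.state.trusted_height = new_height
        self.state.trusted_app_root = new_root

    def consume_message(self, message_id: bytes, membership_ok: bool) -> None:
        # membership_ok is assumed to come from a check against trusted_app_root.
        if not membership_ok:
            raise ValueError("message is not under the trusted root")
        if message_id in self.state.consumed_messages:
            raise ValueError("message already consumed")
        self.state.consumed_messages.add(message_id)
```

Release or refund logic would run only after `consume_message` returns, so a replayed message fails before any asset moves.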

Small-field STARKs, Plonky3, and recursive compression

Small-field proof systems such as Plonky3 over BabyBear or Goldilocks-style fields are attractive because they can be fast for large traces and recursion-friendly workloads. They can make prover memory access and polynomial operations more efficient. They do not automatically make Ed25519, BLS12-381, Keccak, or Tendermint validator-set logic native.

There are three broad implementation strategies.

The first is direct circuit implementation. The circuit implements the source chain’s signatures, hashes, and header rules directly. This is the most general approach, but it has the largest constraint count and often the highest proving latency.

The second is recursion or aggregation. Several signatures, headers, and message proofs are first proven or aggregated offchain, then compressed into one final proof. zkBridge’s design explores proof-system composition and distributed proving to make this practical. The trade-off is operational complexity. When a proof fails, the team must identify whether the fault is in the source witness, the first-layer circuit, the recursive verifier, the aggregation layer, or the final verifier contract.

The third is relying on target-chain precompiles where available. Ethereum’s Pectra upgrade includes EIP-2537, adding BLS12-381 curve-operation precompiles. That is important for BLS signatures and some proof-verification paths. It does not directly solve Ed25519 verification, Tendermint validator-set updates, Solana PoH commitments, or non-EVM state trees inside a ZK circuit. A precompile can make a specific target-chain operation cheaper; it does not make heterogeneous chains algebraically identical.

Recursive compression is also not free. If one proof system is good for Ed25519 verification and another is good for cheap onchain verification, the second layer must prove the first verifier ran correctly. That creates a new circuit and new audit surface. Engineers are trading among prover latency, destination-chain gas, and auditability.

Failure modes: a valid proof can still be unsafe

The first failure mode is incorrect public-input binding. The circuit may verify a valid header, but the verifier contract may fail to bind `sourceChainId`, `height`, `appRoot`, `messagePath`, `recipient`, and `asset` into the same domain. The proof is valid, but the application meaning is wrong. In a swap route, that can become an incorrect release or refund.

The second failure mode is stale proof acceptance. Tendermint light clients and IBC clients rely on trusting periods. If a relayer or prover submits a proof after the safe update window, the proof may still verify an old transition, but the client security assumption no longer holds. A ZK implementation must expose the same timing rules onchain.
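A minimal sketch of that timing rule, assuming Unix timestamps in seconds and hypothetical function names. Real clients also account for clock drift and unbonding periods, which are left out here.

```python
def within_trusting_period(trusted_header_time: int, now: int, trusting_period: int) -> bool:
    """True while the trusted header is still young enough to update from.

    All arguments are seconds. A trusted header dated in the future is
    rejected here; a real client would allow a bounded clock drift instead.
    """
    if trusting_period <= 0:
        raise ValueError("trusting period must be positive")
    return trusted_header_time <= now < trusted_header_time + trusting_period


def accept_update(proof_ok: bool, trusted_header_time: int, now: int, trusting_period: int) -> bool:
    """A valid proof is necessary but not sufficient once the window has closed."""
    return proof_ok and within_trusting_period(trusted_header_time, now, trusting_period)
```

For example, with a 14-day window (`1_209_600` seconds), an update submitted at exactly `trusted_header_time + 1_209_600` is rejected even if the proof itself verifies.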

The third failure mode is prover centralization. ZK light clients reduce verification cost, but proof generation may concentrate among a few high-performance provers. If all cross-chain updates rely on one prover queue, liveness becomes “is that prover online?” This may not break safety, but it breaks arrival time, refund timing, and route-level service guarantees.

The fourth failure mode is under-constrained circuits. Signature parsing, range checks, carry constraints, hash domain separation, validator-weight overflow, and message-path encoding are all possible weak points. The chain verifier contract can be small and correct while the proving program is wrong.

The fifth failure mode is finality misinterpretation. Solana, Tendermint, Ethereum, L2s, and appchains do not share one finality model. “Wait N blocks” is not a rigorous abstraction. A ZK proof only proves a statement under a specific consensus rule. Product routing must still decide which finality window is acceptable.

The sixth failure mode is proof-parameter drift. FRI queries, recursion depth, SNARK curve choices, trusted setup assumptions, verifier versions, and upgrade authority all affect the security boundary. A product label that says “ZK verified” is not enough unless the system also tracks which verifier and which proving path were used.

Why AllSwap should care

To a user, a cross-chain swap is successful when the destination asset arrives. To a router, the real question is: what is the evidence for that arrival? If the path uses a committee, the route includes committee trust. If it uses a native light client, the route includes onchain maintenance and gas cost. If it uses a ZK light client, the route includes proof latency, prover availability, circuit risk, source-chain finality, and verifier-contract gas. If it uses an optimistic proof, the route includes challenge windows and failure recovery.

AllSwap does not need every path to use a ZK light client. A practical router should treat proof type as one input to route scoring:

- Committee proof: low latency, but heavier external trust assumptions.
- Native light client: minimal trust, higher onchain verification and maintenance cost.
- ZK light client: cheaper onchain verification, but prover and circuit risk.
- Optimistic proof: low immediate cost, but challenge-window and rollback complexity.

A useful route score is not only price. It can be modeled as:

`routeScore = priceScore - latencyPenalty - trustPenalty - refundPenalty`

The `trustPenalty` should come from observable variables: verifier contract version, upgradeability, recent proof latency, prover queue length, source-chain finality window, historical failure rate, and whether refunds have onchain evidence. Large swaps and fragile refund routes should prefer stronger proof paths. Small low-risk routes may reasonably choose faster or cheaper proof assumptions.
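The formula can be sketched as a small scoring function. Everything here is illustrative: the `Route` fields, the idea of scaling the trust penalty by a notional factor, and the numbers in the usage note are assumptions, not AllSwap's actual model.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Route:
    name: str
    price_score: float
    latency_penalty: float
    trust_risk: float       # per-unit trust risk, derived from observable variables
    refund_penalty: float


def route_score(route: Route, notional_factor: float = 1.0) -> float:
    """routeScore = priceScore - latencyPenalty - trustPenalty - refundPenalty.

    The trust penalty is scaled by swap size so that larger swaps weigh
    trust assumptions more heavily (a modeling assumption).
    """
    trust_penalty = route.trust_risk * notional_factor
    return route.price_score - route.latency_penalty - trust_penalty - route.refund_penalty


def best_route(routes: Sequence[Route], notional_factor: float = 1.0) -> Route:
    return max(routes, key=lambda r: route_score(r, notional_factor))
```

With a hypothetical committee route (`latency_penalty=1`, `trust_risk=2.0`) and a ZK route (`latency_penalty=5`, `trust_risk=0.5`) at equal price and refund penalty, the committee route wins at `notional_factor=1` and the ZK route wins at `notional_factor=10`, matching the intuition that large swaps should prefer stronger proof paths.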

This matters for entry points such as [AllSwap fees](/fees), [/swap/usdt-erc20](/swap/usdt-erc20), and [/assets/usdc](/assets/usdc). The user does not need to understand non-native field arithmetic. The product should still understand whether a route is committee-based, native-light-client-based, ZK-light-client-based, or optimistic, because that affects the cost of failure.

Refund attribution is especially important. If destination execution fails, the system must identify whether the failure happened at source-chain lock or burn, proof generation, proof verification, message membership, or token execution. ZK light clients can make part of that chain of evidence explicit. If the source message exists under a proven root, the refund logic can focus on destination execution failure. If no proof is generated, the user is waiting on proof liveness, not on an unknowable asset state.

The monitoring layer should track at least `latestProvedHeight`, `sourceFinalizedHeight`, `proofLag`, `proofFailureRate`, `verifierVersion`, and `pendingRefundNotional`. These values do not all need to be displayed to ordinary users. They should still influence routing risk. A route that is several hours behind the source chain should not keep receiving orders as if the proof channel were healthy.
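A sketch of how `proofLag` might gate order intake. The metric names follow the list above; the thresholds and the `accepts_new_orders` helper are hypothetical, and lag is measured in source-chain blocks rather than wall-clock time.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProofChannelMetrics:
    latest_proved_height: int
    source_finalized_height: int
    proof_failure_rate: float  # fraction of recent proof attempts that failed


def proof_lag(metrics: ProofChannelMetrics) -> int:
    """Finalized source blocks that do not yet have a proof on the destination."""
    return max(0, metrics.source_finalized_height - metrics.latest_proved_height)


def accepts_new_orders(
    metrics: ProofChannelMetrics,
    max_lag_blocks: int,
    max_failure_rate: float,
) -> bool:
    """Stop routing to a channel whose proofs are too far behind or too flaky."""
    return (
        proof_lag(metrics) <= max_lag_blocks
        and metrics.proof_failure_rate <= max_failure_rate
    )
```

A router could call `accepts_new_orders` before quoting, so that a lagging channel is excluded or penalized instead of silently producing delayed settlements.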

Proof caching needs the same discipline. A cached proof should be scoped by source height, application root, message commitment, verifier version, and destination domain. Treating an old proof as a reusable passport for a new message is a protocol bug, not an optimization. For routing, cache freshness is part of liquidity quality because stale proof channels turn fast quotes into delayed settlement.
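One way to enforce that scoping is to make every one of those fields part of the cache key. The `ProofCacheKey` and `ProofCache` names and the field types are assumptions for illustration.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class ProofCacheKey:
    source_height: int
    app_root: bytes
    message_commitment: bytes
    verifier_version: str
    destination_domain: str


class ProofCache:
    """In-memory cache where a proof is only reusable for the exact same scope."""

    def __init__(self) -> None:
        self._entries: Dict[ProofCacheKey, bytes] = {}

    def put(self, key: ProofCacheKey, proof: bytes) -> None:
        self._entries[key] = proof

    def get(self, key: ProofCacheKey) -> Optional[bytes]:
        # A lookup with any differing field misses, including a new message
        # under the same root or the same message on another destination.
        return self._entries.get(key)
```

Because the key is a frozen dataclass, a new message commitment or a bumped verifier version produces a different key and therefore a cache miss rather than a stale hit.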

Open problems

Non-native arithmetic optimization has not converged. Limb width, lookup design, range-check strategy, carry constraints, recursive compression, and hash choice differ across proof systems. It is hard to compare costs across implementations unless teams expose proof-generation latency, constraint counts, and verifier cost under the same workload.

Heterogeneous state-tree abstraction remains unsolved. Tendermint AppHash, Ethereum MPT and future Verkle layouts, Solana accounts, and Move object models are not one proof interface. A useful ZK light client must prove both consensus acceptance and application-state membership.

Prover-market liveness is still immature. If proof generation needs GPU clusters or specialized infrastructure, who guarantees timely proof production? Who pays for delayed refunds? Who audits witness generation? Those are operational questions, not only cryptographic ones.

Upgrade handling is hard. If the source chain changes signature formats, hash functions, validator-set encoding, or application-root structure, the proving circuit must upgrade. If the destination verifier upgrades too slowly, the cross-chain client may freeze at an old height.

User-facing proof semantics are underdeveloped. Users should not need to read a circuit. They do need to know whether a route is committee-verified, native-light-client-verified, ZK-light-client-verified, or optimistic, and what refund evidence exists if the route fails.

References

[1] zkBridge: Trustless Cross-chain Bridges Made Practical, Xie et al., ACM CCS 2022, https://arxiv.org/abs/2210.00264

[2] IBC ICS-07 Tendermint Client Specification, Cosmos IBC, https://github.com/cosmos/ibc/blob/main/spec/client/ics-007-tendermint-client/README.md

[3] CometBFT Light Client Verification Specification, CometBFT, https://github.com/cometbft/cometbft-rs/blob/main/docs/spec/lightclient/verification/verification.md

[4] A Tendermint Light Client, Braithwaite et al., 2020, https://arxiv.org/abs/2010.07031

[5] Bringing IBC to Ethereum using ZK-SNARKs, Electron Labs / ethresear.ch, 2022, https://ethresear.ch/t/bringing-ibc-to-ethereum-using-zk-snarks/13634

[6] Electron Labs NEAR Prover Contracts, Electron Labs, https://github.com/Electron-Labs/near-prover-contracts

[7] SP1 Tendermint Example, Succinct Labs, https://github.com/succinctlabs/sp1-tendermint-example

[8] Prague-Electra Pectra Upgrade, ethereum.org, https://ethereum.org/roadmap/pectra/

[9] EIP-2537: Precompile for BLS12-381 curve operations, Ethereum Improvement Proposals, https://eips.ethereum.org/EIPS/eip-2537

[10] Polygon Plonky3, Polygon Labs, https://polygon.technology/blog/polygon-plonky3-the-next-generation-of-zk-proving-systems-is-production-ready

FAQ

Can ZK light clients fully replace bridge validator committees?

They can reduce committee trust by proving source-chain consensus transitions, but they still depend on source-chain finality, prover liveness, circuit correctness, and destination verifier security.

Why does non-native field arithmetic make ZK light clients expensive?

When the proof-system field differs from the source-chain cryptography, the circuit must emulate foreign-field operations with limbs, range checks, carries, and modular reduction. Ed25519 verification becomes many constraints.

Does EIP-2537 solve the main ZK light-client bottleneck?

EIP-2537 helps with BLS12-381 operations on Ethereum, but it does not directly remove Ed25519, Tendermint validator updates, Solana commitments, or non-EVM state-tree costs inside ZK circuits.

Why should AllSwap route scoring care about proof type?

Proof type affects trust assumptions, latency, refund evidence, and failure modes. Large or high-risk swaps should prefer stronger proof paths, while small routes may trade proof strength for speed and cost.

What are common ZK light-client failure modes?

Common failures include wrong public-input binding, stale proofs, prover centralization, under-constrained circuits, finality misinterpretation, and proof-parameter drift.

Winter.Lau, Senior software engineer

Started exploring crypto in 2017. Deep work on order matching, cryptocurrency, and cross-chain systems.