Verified Is Not Safe: The Proof Boundary of Formal Verification in DeFi

The pillar article in this cluster argued that formal verification is returning to DeFi because audit logic cannot defend system-boundary invariants on its own. That argument is the precondition for this one — but not its substance. This article is about what a formal proof in DeFi actually means, where its scope ends, and why the most expensive verification mistakes in production are not proof failures but specification and environment-model failures dressed up as proofs.

The risk now is not that protocol teams under-invest in formal verification. It is that they over-interpret it. The word "verified" carries more weight in marketing decks than it does in proofs. Formal verification is powerful precisely because it is narrow. Removing the narrowness — by pretending the proof boundary extends to wherever the marketing language extends — is how serious teams ship false confidence into production.

Core claim

A formal proof in DeFi is a four-place relation: implementation, specification, environment model, verifier semantics.
Formal verification establishes specific, valuable properties — local invariants, accounting invariants, transition invariants, upgrade equivalence, and selected compositional and temporal properties.
It does not establish real-world safety. Specification quality, environment modeling, semantic completeness, off-chain dependencies, governance reliability, economic conditions, and liveness all sit outside the typical proof boundary.
"Verified" without an inspectable, version-tracked specification is marketing, not assurance.
The right use of formal verification is honest scoping: defend invariants it can hold, defend the rest with the tools that can.

Verification Is Necessary, Conditional, and Not the Same as Safety

Formal verification is necessary for any protocol whose safety depends on invariants that local audit logic cannot defend at scale. It is also conditional — every proof is conditional on the specification being correct, the environment model being accurate, and the verifier's semantics being faithful to the language it analyses. And it is not the same as safety: a protocol can be verified against a sound specification and still be unsafe because the specification did not cover the path the exploiter eventually used, the environment model omitted the dependency that broke, or the off-chain layer was compromised entirely. Holding these three statements in view at the same time is the only way to read a "verified" claim accurately. The rest of this article unpacks what each of them implies for protocol design.

What "Verified" Actually Means

A formal proof in DeFi is a four-place relation. It connects four things, and each of them has to be right for the proof to mean what the protocol team thinks it means.

The first is the implementation — the contract code that actually gets deployed. The second is the specification — the property the team is claiming will hold. The third is the environment model — what the world outside the proof is assumed to do, including how tokens behave, what oracles return, what governance can execute, what callers are permitted to do. The fourth is the verifier semantics — how the verification tool maps source code to the mathematical objects it reasons about, including which features of the EVM it models faithfully and which it approximates or omits.

If the implementation conforms to the specification, in the environment the model describes, under the semantics the verifier uses, the proof is sound. If any one of those four is wrong, weak, or incomplete, the proof is still a proof of something — but not necessarily of what the team thinks it is.

This is the operational reframe of "verified" that the rest of this article rests on.

The proof boundary of a verified DeFi claim. A diagram showing the four-place relation at the centre — implementation, specification, environment model, verifier semantics — surrounded by the inner ring of what a formal proof can establish (local functional properties, contract state invariants, transition invariants, upgrade equivalence, and frontier work on temporal and compositional properties). The dashed circle marks the proof boundary itself; outside it sit the surfaces that formal verification does not, on its own, address — specification quality, environment model accuracy, semantic completeness, off-chain assurance, governance reliability, economic safety, and liveness under stress. Safety is larger than the proof.

The Four Ways a Verification Claim Breaks

A verification claim can fail in exactly four places, and each failure is operationally distinct. Knowing which place a given incident lives in is the first step in reading a verified-protocol claim honestly.

Implementation mismatch. The deployed bytecode is not what was verified. Compiler version drift, build-pipeline divergence, last-minute commits that bypassed the verification step, or a deployment script that flips a flag the prover assumed off. Rare on disciplined teams. Not impossible.

Weak or incomplete specification. The property the verifier checked does not capture the property the team actually needs to defend. The proof discharges; the protocol still ships an exploit-shaped gap because the spec was looking somewhere else. This is the most common failure mode in production today.

Wrong environment model. The proof relies on assumptions about tokens, oracles, callers, callbacks, governance permissions, or external contracts that do not hold in deployment. The verifier was answering a different question than the deployment was asking.

Semantic or modeling gap in the verifier. The tool's representation of the EVM, of Solidity's edge cases, or of the contract's control flow approximates or omits behavior the implementation actually exhibits. The proof is sound in the verifier's model; the model is incomplete in a way the exploit later uses.

Three of these are not verifier bugs. They are bugs in how the team set up the verification — in the spec, in the environment, in the choice of tool for the property. The most expensive verification failures in DeFi are not proof failures. They are specification failures and environment-model failures dressed up as proofs.

What Formal Verification Establishes Well

The interior of the proof boundary is where formal verification is at its strongest, and where it does work nothing else does as cleanly. It is worth being precise about what that interior contains today, because the interior is also where the field's commercial value sits.

Local functional properties — established in production. Preconditions, postconditions, access control, parameter constraints. "Only the owner can update X." "No external caller can drain Y without first satisfying Z." "This function reverts on inputs outside the declared range." These properties live in a single function or a small cluster of functions. The verifier checks them quickly, the specification language for them is mature, and this is the bread and butter of formal methods on Solidity today.

Contract state invariants — established in production. The solvency-style properties any serious DeFi audit cares about: the sum of recorded balances equals the tracked supply, the reserve never falls below the guaranteed minimum, the accounting modules conserve what they are supposed to conserve. State invariants are where formal verification produces commercial value on production protocols. Most of the mature work on Aave-style lending markets, much of the Compound formal verification program, and the bulk of the deployed Certora specifications in DeFi target this layer.

Transition invariants — established for well-scoped modules. Properties about the validity of state movement across function calls. "Liquidation cannot worsen protocol solvency under modeled conditions." "A successful swap preserves the pool's stated price function." Harder than static invariants because they require reasoning about reachable transitions, but tractable inside the established interior when the module is structured for it.

Upgrade equivalence — emerging in mature protocol workflows. The team writes a specification of which behaviors the new version is allowed to change and which it must preserve; the verifier checks that the upgrade satisfies the equivalence. Aave's move toward equivalence checking on upgrades is the most visible production example, and it directly attacks the upgrade-boundary failures the pillar article catalogued. Used in a small number of mature protocols today, visibly converging into industry practice.

Temporal and multi-transaction properties — frontier, selectively usable. The sequenced behaviors that contemporary exploits actually inhabit. Recent research is moving the line, and the direction of travel is clear, but the tooling is younger and the specification cost is higher. Usable for selected properties on selected modules today; not yet a turnkey workflow for an average DeFi team.

Compositional and interface invariants — frontier. Safety at the boundary between two protocols rather than inside one. Composable-DeFi verification research is producing real machinery here, but most of it remains research-stage. Worth tracking; not yet production default.

The pattern across all of these is the same. Formal verification is most useful where the property is precisely stateable, the environment is reasonably constrained, and the implementation is structured enough to verify. As any of those weaken, the verification weakens with them.

What Formal Verification Does Not Prove

The exterior of the proof boundary is larger than the interior. Most of the surface area of DeFi safety lives outside what formal verification, on its own, can establish. The honest map has several distinct regions.

Specification quality is the proof

The first and most important limit is that the proof is only as good as the specification. This is not a tooling problem. It is a craft problem, and it gets worse as the verified surface grows.

Recent audit-of-audits research has documented a category of failure that is uncomfortable for the field. Public audit reports describe a property in natural language; the formal specification in the verification tool describes a different property; and the gap between the two is not always benign. Sometimes the gap is a paraphrase the auditor introduced. Sometimes it is load-bearing — an assumption the prover used to discharge the proof that never made it into the human-readable claim. When the assumption is wrong, the proof remains valid; the claim the protocol made is overstated relative to what was actually verified.

The implication is direct. A "verified" badge without an inspectable, machine-readable, version-tracked specification is not assurance. It is marketing. If the spec is not in the repo, the verification is not in the system. It is in the press release. The protocols that take formal verification seriously increasingly publish their CVL or equivalent specification files alongside the contracts. Those who treat verification as branding do not.

Environment models are integration assumptions

The second limit is that the environment model is where the protocol's integration assumptions live, and integration assumptions are unverified until they are explicit. A protocol can ship a valid proof and still ship false confidence if the model boundary is wrong.

The Compound cETH price-feed incident covered in the pillar article is the canonical example. The oracle implementation was reviewed against a specification that assumed cETH implemented an underlying() method. Nothing in any audit of the oracle alone disproved that assumption — there was no claim to disprove, because the assumption was implicit. The cETH contract had no obligation to comply with an interface it never published. When the upgrade shipped, the integration failed at exactly the seam the environment model had treated as background.

Formal verification can address this — but only if the integration assumption is moved from background to specification. The team has to write down what the external token, oracle, bridge, or callee is required to do, and then either verify the assumption against the dependency or check it at runtime. Many production verification programs now do this for known critical dependencies. None do it for all dependencies, because the modeling cost is real and the set of dependencies is open.

Semantic approximations exclude what the verifier does not model

The third limit is the verifier's own model of the language. Some verification tools model features of Solidity — delegatecall, low-level memory and storage operations, gas semantics, certain arithmetic edge cases — approximately, conservatively, or not at all. Documented unsoundness in specific configurations of widely used tools is not a secret; it is part of the published record. There is also a class of "strategic" properties — roughly, properties involving adversarial optimization across a complex state space — that current tools struggle to express, let alone discharge.

Language design also affects verification economics. Solidity was not designed with formal verification as a first-class concern; Move was. That is a real distinction in semantic tractability, not an argument for migration: most production value sits on Solidity, and the tooling continues to extend the verifiable surface there. But the cost of verifying a given property on Solidity is partly a function of language semantics, and that cost is part of why the proof boundary in EVM-land sits where it does.

Off-chain surfaces are outside the proof boundary by definition

The fourth limit is the one the pillar article handled in the case of the KyberSwap frontend exploit. A formal proof of every contract in a protocol does not say anything about the project's Cloudflare configuration, the operator workstation that signs the deployment, the RPC layer, the wallet UI that constructs and presents the approval, or the signers who approve the multisig transaction. These are not edge cases. They are the part of the trusted computing base that DeFi has, by convenience, mostly ignored.

Formal verification can extend partway into this region — there is research on verifying signing pipelines, certain off-chain components, and parts of bridge designs — but the practical reality on production protocols is that off-chain dependencies remain a different assurance problem with different tools. Treating "verified contracts" as equivalent to "safe protocol" elides this distinction in exactly the place where users experience the system.

Governance and operations are sociotechnical, not deductive

The fifth limit is that a verified protocol today is not a verified protocol tomorrow if the protocol is governed and changes. The pillar article covered this from the upgrade-boundary side. The limits view adds the operational dimension. A governed protocol has executors, multisig delays, emergency controls, parameter rotation, and a queue of proposals that interact with each other. Formal verification can verify that a specific proposal preserves a specific equivalence, that a parameter range respects a specific invariant, that an upgrade path produces a stated delta. It cannot verify that the human and operational layer surrounding the proposal stream will deliver those changes correctly under load.

This is not a flaw of formal methods. It is a category boundary. Governance is a delivery process. Delivery processes are validated by audits, runbooks, simulations, drills, and post-incident reviews, in addition to whatever invariants the underlying code preserves. The protocols that take this seriously fund continuous verification on the code side and operational discipline on the delivery side, and treat them as separate functions.

Economic conditions are not deductive at all

The sixth limit is the cleanest. Code correctness is a deductive property. Market behaviour is not. The interaction between a correctly implemented liquidation engine and a market under sudden volatility is partly a function of the code and partly a function of liquidity, depth, MEV competition, and bot behaviour. Formal verification can address the code half. It cannot, on its own, answer "will users be able to exit under stress?" or "is this strategy unprofitable for an adversary under realistic conditions?"

Compound's own documentation has, for years, drawn this distinction explicitly: formal verification on the contract side, simulation-based economic security on the market side. Research has formalized parts of the economic side — Clockwork Finance modelling extractable value in composed DeFi systems, lending-pool analyses using reachability and statistical model checking to search for parameter configurations that minimize unrecoverable loans — but the dominant tools for economic security are stochastic and adversarial-simulation-based, not deductive. Treating a code proof as an economic safety proof confuses two different categories of assurance.

Safety and liveness are not the same property

The seventh limit is the one most often dropped from the conversation. Most of the formal verification work in production DeFi targets safety properties — properties of the form "nothing bad happens." The complementary class is liveness — properties of the form "something good eventually happens." Liveness is what users actually care about in adversarial conditions: can I exit my position before insolvency? Will my liquidation actually execute? Can the system make progress when keepers are under attack? These are harder to verify, often dependent on external actors and timing assumptions, and frequently absent from production verification programs.

A protocol can be verified safe and still fail users on liveness. The two properties are independent. A complete assurance posture has to address both, and the second one is mostly handled by mechanisms — keepers, circuit breakers, emergency exits, bond posting, escape hatches — rather than by deductive proof.

The Specification Problem at Scale

The pattern beneath every category of limit is the same. The interior of the proof boundary is large and growing. The exterior is also large, and most of it is reachable only through specification work that no tool can do automatically.

The specification is the contract between the protocol team and the future. It says: this is what we promise will remain true, here are the assumptions under which it holds, here is the model in which we check it. The proof is the receipt. The receipt is valuable. The contract is more valuable, because the contract is what gets read when the next incident happens and someone needs to know whether the failure was inside or outside what was verified.

The protocols that have learned this make their specifications first-class artifacts — versioned, reviewed, published alongside the code, updated when the protocol changes. The ones that have not learned it treat the badge as the artifact and the specification as a back-office detail. The market sees the badge and rarely sees the object the badge was supposed to summarize. The first category has a chance of using formal verification well at scale. The second category is exactly where the next overclaim lives.

How AI is starting to compress the labor cost of producing those specifications — and which failure modes AI-assisted verification introduces in the gap between assisted formalization and replaced expertise — is the subject of the third piece in this series.

How to Read a "Verified" Claim

The discipline that follows from this map is short, and worth stating directly for the protocol teams and capital allocators who have to act on verification claims rather than admire them.

Publish the specification, not just the claim. The artifact that travels with the protocol is the property file the prover discharged, version-tracked alongside the code, readable by reviewers who were not part of the original engagement. The badge is downstream of the spec; if the spec is not visible, the badge is not either.

State the environment assumptions explicitly. Which tokens are assumed to behave like ERC-20? Which oracles are assumed to deliver freshness within a stated window? Which governance actions are in scope? Which callers are trusted? An assurance claim that does not surface its environment model is a claim whose model can quietly change without anyone noticing — usually around the next integration the protocol takes on.

Separate code proofs from economic claims. Code can be proven; markets cannot, in the same sense. Formal verification is the right answer to "does this implementation conform to the property I wrote?" Simulation-based stress testing, adversarial bot programs, parameter searches, and statistical model checking are the right answer to "is this protocol economically robust under stress?" A verified protocol is not an economically safe one, and any assurance posture that conflates the two is overclaiming.

Treat off-chain dependencies, governance reliability, and liveness mechanisms as their own assurance layers. Each has its own tooling, its own incident profile, and its own failure mode. Bundling them under "we got audited and verified" is what produces the gap between assurance claim and user experience that the industry has paid for many times.

Do not let the badge outrun the model. If the verified surface shrinks because the protocol expanded faster than the specification did, the badge should shrink with it. The protocols whose verification programs work treat the spec as a living object that engineering, security, and governance functions all have a stake in keeping current. The ones that do not are running the same risk as a code audit treated as a one-off ceremony — except now with a more confident label attached.

Verified Is Not Safe

The honest summary of the proof boundary is short. A verified property in DeFi means: under the assumptions encoded in the environment model, the implementation satisfies the stated specification, in the semantic interpretation the verifier uses. That is a real and useful statement. It is also a statement with sharp edges.

The safety of a DeFi protocol depends on more than the implementation. It depends on the quality of the specification, the accuracy of the environment model, the completeness of the verifier's semantics, the discipline of the off-chain infrastructure, the operational reliability of governance, the economic conditions of the markets the protocol runs in, and the liveness mechanisms that let users actually transact under stress. Formal verification addresses one column of that picture. Treating it as the whole picture is how serious teams ship overconfidence into production.

The right way to use formal verification is to know what it covers, write the specification well enough that the coverage is inspectable, and defend the uncovered surface with the tools that fit each region. Simulation-based stress testing and adversarial bot programs for the economic side. Runtime guards and circuit-breaker mechanisms for liveness under stress. Equivalence checking on upgrades for the governance and migration side. Operator discipline, signing-pipeline hardening, and key-management hygiene for the off-chain layer. Continuous monitoring and incident playbooks for everything else.

Each tool addresses one region of the boundary. Formal verification is the one that handles invariants. The rest still has to be done. Protocols that understand this funnel each problem to the tool that can address it. Protocols that do not end up with a verified core surrounded by an unverified system, and discover at the worst possible moment that the system, not the core, is what users actually depend on.

Key takeaway

A formal proof is a four-place relation: implementation, specification, environment model, verifier semantics. All four have to hold for the proof to mean what the team thinks it means.
Formal verification establishes well: local properties, accounting invariants, transition invariants, upgrade equivalence (emerging), and selected compositional and temporal properties (frontier).
It does not establish: specification quality, environment-model accuracy, semantic completeness, off-chain assurance, governance reliability, economic safety, or liveness under stress. All of these live outside the proof boundary.
"Verified" without an inspectable specification is marketing. The artifact that travels with a verified protocol is the spec, not the badge.
The right use of formal verification is honest scoping: defend the invariants it can hold, defend the rest with the tools that fit each region of the boundary.