Decoding Bridge Exploits and Hardening with AnySwap

Cross-chain bridges promise a simple thing that turns out to be fiendishly complex: move value between blockchains that don’t trust each other. Implementation details decide whether that promise holds or unravels. Over the past few years, we’ve watched bridges lose hundreds of millions for reasons that seem shockingly mundane in hindsight, from botched signer rotations to stale light client proofs. The good news is that the patterns repeat. If you understand how bridges fail, you can design controls that blunt entire classes of attacks.

This piece unpacks how bridge exploits typically happen, what “trust” really means in a bridge, and how to harden systems in practice. I use AnySwap as a through-line because it illustrates the evolution from simple custodial models to more robust, module-based security. AnySwap, which later aligned under the Multichain brand before operational issues shuttered its service, pioneered approaches that still inform how teams think about cross-chain liquidity and message passing today. The lessons stand even if the specific product has changed or paused: watch the trust assumptions, apply layered verification, and treat operational discipline as part of the security model, not an afterthought.

How bridges actually work

A bridge coordinates three activities: locking or burning on the source chain, producing evidence that the event happened, and minting or releasing on the destination chain. The details of evidence and verification define the trust model.

I start by grouping bridges into three broad families, knowing that real systems often blend them:

    External validation. A set of off-chain watchers or signers attest to source-chain events. Threshold signatures or MPC schemes authorize minting or release on the destination chain. Custodial token bridges and liquidity routers typically live here.
    On-chain light clients. The destination chain runs a verifier that checks Merkle proofs against a header chain or consensus proofs. This reduces off-chain trust but adds complexity and cost.
    Liquidity networks. Rather than minting wrapped assets, routers on each chain maintain inventories. Users swap out of one inventory into another, with back-end rebalancing. Security hinges on operator honesty and caps on inventory rather than wrapped token correctness.

AnySwap sat at the intersection: a cross-chain router with MPC signers, liquidity pools, and a path for wrapped assets. That blend opened more degrees of freedom for design and monitoring, and it also added more surfaces to secure.

The common failure patterns

Bridge hacks cluster around a few themes. If you’ve triaged incidents, these will feel familiar.

Signer or key compromise. If a few keys can collectively approve transfers and an attacker steals or otherwise gains control of enough of them, the attacker can mint or release assets. The nuance lives in how keys are stored, rotated, and recovered, and whether approvals require chained confirmations across time or devices. MPC helps, but the ceremony around it matters more than the math.

Logic flaws in verification paths. Smart contracts that verify proofs, map token addresses, or orchestrate mints often hold single points of failure. A classic: a function that should have been restricted to a trusted module is left callable by anyone, or a replay guard is missing, letting an old message mint new tokens.

Insecure relayers and routers. Even if contracts are correct, the infrastructure that feeds them can be abused. Unordered or duplicated messages, race conditions in fee accounting, and failure to isolate hot keys from orchestration nodes can cascade into loss events.

Token design missteps. Bridges frequently wrap tokens. If the wrapped token uses a flawed decimal configuration, or an upgrade path allows the admin to change the underlying token address without checks, a phisher can pair a small social-engineering win with a catastrophic contract-level change.

Operational debt. Monitoring gaps, stale oracles or pricing data, and unclear incident response can magnify a small hole into a large loss. The most expensive bridges I have reviewed had no single catastrophic bug. They had a chain of minor issues that aligned because alerts were missing and rotations were ad hoc.

When you map these against real incidents, the mechanics vary, but the moral repeats: define the trust boundary, then build fences at each crossing.

What “trust” really means on a bridge

Security conversations fall apart when teams talk past each other. It helps to diagram the minimum trust you must grant to move a dollar from chain A to chain B.

    If external validators attest to events, you trust them not to collude and not to lose keys. You also trust the process that chooses, rotates, and evaluates them. The number of signers is almost never the interesting part. The selection and replacement mechanisms are.
    If a light client verifies proofs, you trust the correctness of the proof verifier and the finality assumptions of the source chain. You also trust the upgrade process that can change verifier logic.
    If a liquidity network moves inventory, you trust the solvency and behavior of routers, and you accept inventory caps as part of your risk budget. Routers will owe each other across chains. Without robust dispute and settlement, the system devolves to trust in an operator.

AnySwap leaned into a hybrid approach. Liquidity pools reduced the need to mint wrapped assets for every hop, lowering the blast radius if a minting path breaks. MPC signer quorums added depth over single-signature multisigs, reducing single-key risk. Neither removes trust entirely. What changes is where you place it and how you make it observable.

The AnySwap pattern

In its active period, AnySwap provided two primary flows: swap native assets across chains using liquidity pools, and mint or redeem wrapped assets where natives could not move directly. The architecture typically included:

    MPC key shards held by a set of nodes, with a threshold required to sign release or mint events. No individual node held a complete key, and the system allowed resharing to rotate signers without reissuing public keys on-chain.
    Per-chain router contracts that enforced token allowlists, fee rules, and accounting for pooled liquidity. These contracts served as choke points for mapping token IDs and gatekeeping privileged functions.
    A relayer layer that observed events on a source chain, constructed messages or proofs, and submitted them to the destination chain for execution if policy checks passed.
    Operational guardrails like per-asset limits, per-transaction caps, and pausable modules.

This pattern is sane. It separates concerns, and it embraces the idea that different token pairs and chains deserve different limits. The hardening comes in how you stitch those pieces together.

Where implementations drift into danger

Several edge cases crop up in reviews of bridge systems that resemble AnySwap.

Upgradable routers without strict admin controls. If the proxy pattern isn’t locked down with a timelock and emergency veto, a compromised admin can swap logic in minutes. On a live bridge, those minutes are costly. You want two gates: a timelocked schedule for non-critical upgrades and a different, rate-limited path for hotfixes with narrowly scoped powers.
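A minimal sketch of the two-gate idea, written in Python as an off-chain policy model rather than a contract (the class, constants, and action names are all illustrative): scheduled upgrades wait out a timelock, while hotfixes skip the delay but are rate-limited and restricted to a narrow scope.

```python
import time

class UpgradeGate:
    """Two admin paths: scheduled upgrades wait out a timelock; hotfixes skip the
    delay but are rate-limited and restricted to a narrow set of actions."""

    TIMELOCK_SECONDS = 48 * 3600                    # non-critical upgrades wait two days
    HOTFIX_MIN_INTERVAL = 24 * 3600                 # at most one hotfix per day
    HOTFIX_SCOPE = {"pause_module", "lower_cap"}    # narrowly scoped powers only

    def __init__(self) -> None:
        self.scheduled: dict = {}                   # action -> earliest execution timestamp
        self.last_hotfix = 0.0

    def schedule_upgrade(self, action: str) -> None:
        self.scheduled[action] = time.time() + self.TIMELOCK_SECONDS

    def can_execute_upgrade(self, action: str) -> bool:
        return time.time() >= self.scheduled.get(action, float("inf"))

    def execute_hotfix(self, action: str) -> bool:
        now = time.time()
        if action not in self.HOTFIX_SCOPE:
            return False
        if now - self.last_hotfix < self.HOTFIX_MIN_INTERVAL:
            return False
        self.last_hotfix = now
        return True
```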

Unbounded mint paths. When wrapped assets are minted on the destination chain, the mint function should check both the message authenticity and the mint capacity for that asset. Capacity should be dynamic and tie back to observed locks on the source chain or a hard ceiling for exposure.
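Here is a hedged sketch of that capacity rule as a relayer-side policy check; all names are hypothetical, and in production the same invariant would also live in the destination contract.

```python
from dataclasses import dataclass

@dataclass
class AssetExposure:
    observed_locks: int       # value locked on the source chain, per an indexer
    minted_outstanding: int   # wrapped supply already minted on the destination
    hard_ceiling: int         # absolute exposure cap set by governance

def mint_allowed(exposure: AssetExposure, amount: int, message_valid: bool) -> bool:
    """Mint only if the message verifies AND the asset stays within both the
    observed source-chain locks and the hard ceiling."""
    if not message_valid:
        return False
    capacity = min(exposure.observed_locks, exposure.hard_ceiling)
    return exposure.minted_outstanding + amount <= capacity
```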

Signer resharing without live verification. MPC protocols allow resharing of key shards. That feature becomes a liability if you don’t verify post-reshare quorum behavior. I have seen resharing complete with one shard misconfigured, silently downgrading the effective threshold.

Fee accounting that tolerates negative balances. Bridges commonly collect fees on destination chains. If the fee module allows temporary deficits for routing convenience, an attacker can try to exploit race conditions to force a deficit into a permanent shortfall. Fee movements should be monotonic in the absence of a privileged event.
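One way to encode that rule is a fee ledger that refuses to go into deficit unless the caller explicitly carries a privileged flag. A small illustrative sketch, not any particular bridge's implementation:

```python
class FeeLedger:
    """Fee balance that cannot go negative without an explicit privileged event."""

    def __init__(self) -> None:
        self.balance = 0

    def credit(self, amount: int) -> None:
        if amount < 0:
            raise ValueError("credits must be non-negative")
        self.balance += amount

    def debit(self, amount: int, privileged: bool = False) -> None:
        if amount < 0:
            raise ValueError("debits must be non-negative")
        if self.balance - amount < 0 and not privileged:
            raise RuntimeError("debit would create a deficit without a privileged event")
        self.balance -= amount
```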

Replay and domain separation gaps. Messages should include chain IDs, router addresses, and nonces that map uniquely to call sessions. Reusing a message format across multiple chains without hard domain separation invites cross-domain replays.
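A sketch of what hard domain separation can look like, with illustrative field names: the message identifier commits to both chain IDs, both router addresses, and a nonce, so the same payload cannot be replayed on another chain, router, or session.

```python
import hashlib

def message_id(src_chain_id: int, dst_chain_id: int,
               src_router: str, dst_router: str,
               nonce: int, payload: bytes) -> bytes:
    """Unique ID bound to exactly one chain pair, router pair, and nonce."""
    domain = f"{src_chain_id}:{dst_chain_id}:{src_router.lower()}:{dst_router.lower()}:{nonce}"
    return hashlib.sha256(domain.encode() + payload).digest()

# The destination side records each ID as consumed; a second submission with
# the same ID is rejected as a replay.
consumed_ids = set()

def accept(msg_id: bytes) -> bool:
    if msg_id in consumed_ids:
        return False
    consumed_ids.add(msg_id)
    return True
```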

These are all fixable with design discipline and tests that simulate the ugly parts of production: signer loss, bifurcated networks, delayed blocks, and odd token behavior.

Hardening the surface: controls that actually help

I keep a mental checklist for bridges in this family. None of these items is novel on its own. Together they push the likelihood and impact of failure down to tolerable levels.

    Enforce multi-layer authorization on mint and release. A valid message is necessary, not sufficient. Contracts should also confirm that the token is on an allowlist, the destination chain matches a configured pair, the per-asset daily mint budget has headroom, and the transaction nonce has not been consumed. Removing any one of those checks increases the pressure elsewhere. (A sketch of the layered check follows this list.)
    Cap exposure with dynamic circuit breakers. Daily and per-transaction caps, ratcheted by volatility and liquidity depth, stop bleed-outs. A cap that looks conservative on a calm day can be too high during high volatility. The breaker should adjust using inputs like realized volatility and on-chain depth, not just fixed constants.
    Split authorities for upgrades and pauses. The operator who schedules an upgrade should not be the same role that can pause transfers. Pausing needs high availability and quick decision paths. Upgrades should be cumbersome by design.
    Instrument for intent, not just outcomes. Monitor policies in real time: how close is each asset to its daily limit, how many relayer submissions were rejected by on-chain policy, which signers have been silent this epoch. These metrics catch drift before loss.
    Practice signer loss and resharing drills. MPC key loss is a when, not an if, in multi-year operations. Run quarterly drills that simulate a signer going offline, complete a resharing ceremony, and verify that the threshold holds and the new shares participate correctly.
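The layered check in the first item might look something like this sketch; the types and field names are hypothetical, and the point is that signature validity is only one conjunct among several.

```python
from dataclasses import dataclass, field

@dataclass
class MintMessage:
    token: str
    src_chain: int
    dst_chain: int
    amount: int
    nonce: int
    signature_valid: bool     # result of threshold-signature verification, done elsewhere

@dataclass
class BridgeState:
    allowlist: set            # tokens explicitly approved for minting
    pairs: set                # configured (src_chain, dst_chain) tuples
    daily_headroom: dict      # token -> remaining per-asset budget today
    consumed_nonces: set = field(default_factory=set)

def authorize_mint(msg: MintMessage, state: BridgeState) -> bool:
    """A valid message is necessary, not sufficient: every layer must pass."""
    return (
        msg.signature_valid
        and msg.token in state.allowlist
        and (msg.src_chain, msg.dst_chain) in state.pairs
        and msg.amount <= state.daily_headroom.get(msg.token, 0)
        and msg.nonce not in state.consumed_nonces
    )
```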

Operational habits and code meet in the same place here. The better your ability to see and practice the edge cases, the less you rely on hope when they appear in production.

Token mapping and metadata, the subtle footguns

Bridge contracts often need to map source tokens to destination representations. The mapping seems straightforward until a token changes its decimals or migrates to a new contract.

A safe mapping system does a few concrete things. It stores the source chain, source token address, decimals, symbol, name, and destination token address as an immutable tuple once the pair is active, and it adds a new tuple for migrations rather than overwriting. A special-case migrator can then swap balances from the old destination token to the new one with a public schedule and timelock. Any function that reads token metadata must avoid trusting the token contract for dynamic fields like symbol and name. This sounds fussy. It avoids a class of social-engineering problems where a token masquerades as a different asset during an upgrade, and it gives auditors a clear lineage to inspect.
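A toy version of that append-only registry, assuming the metadata has already been snapshotted at listing time rather than re-read from the token contract:

```python
from dataclasses import dataclass

@dataclass(frozen=True)    # frozen: a mapping tuple is never edited once active
class TokenMapping:
    src_chain: int
    src_token: str
    decimals: int
    symbol: str            # snapshotted at listing, never re-read from the token
    name: str
    dst_token: str

class MappingRegistry:
    """Append-only registry: a migration adds a new tuple instead of overwriting,
    so the full lineage stays visible to auditors."""

    def __init__(self) -> None:
        self._history: list = []
        self._active: dict = {}    # (src_chain, src_token) -> latest active mapping

    def activate(self, mapping: TokenMapping) -> None:
        self._history.append(mapping)
        self._active[(mapping.src_chain, mapping.src_token)] = mapping

    def active(self, src_chain: int, src_token: str) -> TokenMapping:
        return self._active[(src_chain, src_token)]

    def lineage(self, src_chain: int, src_token: str) -> list:
        return [m for m in self._history
                if (m.src_chain, m.src_token) == (src_chain, src_token)]
```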

AnySwap handled this through allowlists. The lesson is to make the allowlist both strict and visible, with events on every modification that external indexers can consume.

Liquidity routers and solvency

When a bridge uses pooled liquidity instead of minting, solvency becomes the core risk. Routers need enough inventory to fill user swaps but not so much that a compromise drains a large balance. Few teams pick the right equilibrium on day one. They set static ceilings, then adjust manually after market conditions change.

A better approach uses a risk budget per asset pair. If you can tolerate a certain worst-case loss, you divide that budget across routers and time. Router caps adjust based on volatility, chain congestion, recent slippage, and arbitrage latency. If the destination chain is congested and arbitrage loops slow, you lower caps automatically to reduce exposure to mispricing and failed settlements. You also incentivize routers with a fee curve that rewards resilience in stress, not just raw volume.
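As a sketch, a per-router cap under a fixed risk budget might be computed like this; the discount functions are placeholders for whatever volatility and congestion signals you actually trust.

```python
def router_cap(risk_budget: float, n_routers: int,
               realized_vol: float, baseline_vol: float,
               congestion: float) -> float:
    """Hypothetical per-router inventory cap: start from an even split of the
    pair's risk budget, then shrink it as volatility or destination-chain
    congestion rises above normal."""
    base_cap = risk_budget / max(n_routers, 1)
    vol_discount = min(1.0, baseline_vol / max(realized_vol, 1e-9))  # higher vol -> smaller cap
    congestion_discount = 1.0 / (1.0 + max(congestion, 0.0))         # congestion score >= 0
    return base_cap * vol_discount * congestion_discount
```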

On the accounting side, routers should maintain auditable liabilities. A router that owes across chains after temporary imbalances needs an on-chain representation of that debt, visible to other participants and to risk monitors. Hidden side ledgers have sunk more than one network.

Governance that does not become a vulnerability

Bridges rely heavily on configuration. Token pairs, fee levels, caps, signer sets, and upgrade targets all change over time. Every configuration lever is a potential exploit path.

A durable governance design assigns different action types to different roles with different reaction times. For example, token listing requires a timelock, a review window, and a quorum among guardians who do not operate relayers. Emergency parameter changes like lowering a cap can be fast-tracked with a smaller quorum, but the effect is bounded and expires automatically after a period. Upgrades require a longer window, with on-chain diffs published and a dry-run on a shadow fork or canary deployment.
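One way to make those differences explicit is a policy table keyed by action type; the numbers below are placeholders, not recommendations.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class ActionPolicy:
    timelock_hours: int                     # delay before the action can execute
    quorum: int                             # distinct approvals required
    expires_hours: Optional[int] = None     # emergency effects auto-revert after this

# Illustrative policy table: listings and upgrades are slow and broad,
# emergency cap reductions are fast but bounded and self-expiring.
GOVERNANCE_POLICIES = {
    "list_token":     ActionPolicy(timelock_hours=72,  quorum=5),
    "lower_cap":      ActionPolicy(timelock_hours=0,   quorum=2, expires_hours=24),
    "upgrade_router": ActionPolicy(timelock_hours=168, quorum=6),
}
```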

In the AnySwap ecosystem, the practical challenge was keeping operational speed without giving a single operator too much unilateral power. The best implementations I have seen split duties across teams with overlapping but not identical privileges. That creates friction at the right moments.

Testing for the failures you do not want to see

It is easy to write tests that cover the happy path: lock on chain A, prove on chain B, release. The exploits show up one level deeper, in timing and domain separation.

Build your test plan around adversarial timing. Send two releases that reference the same source event, one slightly delayed to simulate reorgs, and see whether the second is rejected reliably. Simulate an MPC signer going offline midway through a batch, and ensure that the system neither lurches into single-signer mode nor deadlocks. Inject malformed metadata for a token, such as a symbol that changes after listing, and verify that the mapping checks ignore the token's dynamic fields.
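The first of those checks is easy to express as a test against an in-memory stand-in for the destination-side release path; the FakeBridge here is purely illustrative.

```python
import pytest

class ReplayError(Exception):
    pass

class FakeBridge:
    """Minimal in-memory stand-in for the destination-side release path."""

    def __init__(self) -> None:
        self.consumed = set()

    def release(self, src_chain: int, nonce: int, token: str, amount: int) -> None:
        key = (src_chain, nonce)
        if key in self.consumed:
            raise ReplayError(f"source event {key} already released")
        self.consumed.add(key)

def test_second_release_for_same_source_event_is_rejected():
    bridge = FakeBridge()
    event = dict(src_chain=1, nonce=42, token="USDC", amount=1_000)
    bridge.release(**event)                  # first release settles normally
    with pytest.raises(ReplayError):         # delayed duplicate, e.g. after a reorg
        bridge.release(**event)
```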

Then go broader. Run a shadow fork of your destination chain, pipe real production events through a copy of your relayer, and replay them into the fork after artificially lagging the source chain headers. Watch how quickly caps fill during bursts. If you cannot simulate the full network, at least capture distribution tails: the 99th percentile relayer lag, the slowest MPC aggregation round, the worst block congestion. The tail is where bad days live.
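Capturing the tail can be as simple as reporting the 99th percentile of observed relayer lag against a budget; a minimal sketch:

```python
import statistics

def tail_report(lag_seconds: list, p99_budget: float) -> dict:
    """Summarize the tail of observed relayer lag; the 99th percentile and the
    worst case, not the mean, should drive cap and timeout choices."""
    p99 = statistics.quantiles(lag_seconds, n=100)[98]
    return {"p99": p99, "worst": max(lag_seconds), "within_budget": p99 <= p99_budget}
```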

The social and operational side of hardening

From the outside, bridge exploits look like code problems. Inside the teams that run these systems, the moments that matter look like phone calls at 3 a.m., a stale runbook, and one person who knows the signer resharing script. Treat those human factors as part of the system.

    Rotate responsibilities. If only one engineer can run the emergency pause or decode the MPC logs, you have a hidden single point of failure.
    Publish an incident framework. Define severities that map to actions, who decides, and what gets paused.
    Run blameless postmortems. If you cannot say exactly what went wrong within 24 hours, your observability is insufficient.
    Expose public risk dials. Users are adults. Show the daily mint caps, per-asset utilization, signer quorum health, and router solvency metrics. Surprises breed runs.

Bridges will always balance user experience and safety. The teams that last err on safety during stress, then claw back throughput after they have measured the impact.

Context from AnySwap’s history

AnySwap’s evolution offered a few concrete lessons I still use when advising teams.

First, modularity helps. Separating routers, token mapping, and governance logic made audits easier and created natural blast radii. When a module needed a fix, it could be paused or replaced without freezing the entire system.

Second, MPC reduced obvious key theft risk but raised the bar for operational hygiene. It forced teams to think about resharing and signer churn long before a breach. The teams that scripted and rehearsed those processes recovered faster from disruptions.

Third, blended routing paths lower systemic risk. Allowing native-to-native swaps through liquidity when available, and wrapped paths only when necessary, reduced exposure to mint logic. That is a design choice more bridges should embrace, even if it complicates fee accounting.

Finally, even with those strengths, central operational links remained. In later years, dependence on a small group of operators created fragility. The architecture looked fine, but the human layer narrowed. The lesson is not about AnySwap alone. It applies to most bridges that grow fast. Decentralize responsibilities early, or you will meet your bottleneck later at the worst time.

A practical blueprint for teams building or integrating a bridge

Teams that must move assets across chains often have to choose between building a dedicated path, integrating an existing bridge, or relying on a generalized messaging layer. There is no one-size answer. You can, however, insist on a short list of non-negotiables.

    Transparent trust model. The provider should document exactly who can mint or release, how many approvals are required, and how upgrades happen. If the answers are vague, treat that as a red flag.
    Caps you can verify. Per-asset and per-transaction limits must be on-chain and queryable. External dashboards help, but the contract should be the source of truth.
    Independent recovery paths. If a signer set goes stale or a relayer pool disappears, there should be a way to rotate keys, add relayers, or pause with a separate authority. Ask to see the runbook, not just the code.
    Audit trails that connect. Event logs should tie a destination release back to a specific source lock with a unique ID, allowing anyone to follow the chain of custody.
    Public monitoring. Health metrics should not live only on a private Grafana. A provider willing to show utilization, delays, and rejection rates tends to have better internal discipline as well.

These criteria will not guarantee safety, but they filter out solutions that treat security as a marketing line.

Looking ahead without hand-waving

Newer designs push verification on-chain using succinct proofs. When a destination chain can verify a proof of execution from the source chain cheaply, you remove a chunk of off-chain trust. Still, you reintroduce risk through upgradability and finality assumptions. I like these directions, but I do not treat them as silver bullets. On volatile days, congestion and liveness degrade at the wrong times. Your circuit breakers and operations matter just as much as your cryptography.

For liquidity networks, I expect more explicit risk markets. Routers will post collateral and buy protection against shortfalls, and users will see a quoted price that includes a risk premium. That is healthier than a flat fee model where risk hides until it explodes.

And on the human side, stronger habits will separate the survivors from the speed chasers. Publish cap utilization, rotate signers on a schedule, rehearse failure drills, and treat the unpleasant edge cases as regular work. That is how you run a bridge instead of hoping one runs itself.

A closing perspective for practitioners

If you spend time tracing bridge exploits, the theme is not that bridges are untenable. It is that the margin for sloppy thinking is thin. AnySwap’s design choices show that incremental hardening pays off: MPC over single keys, modular routers, visible allowlists, and liquidity paths that shrink the reliance on wrapped assets. Those ideas remain relevant, no matter which brand is on the front door.

Take a hard line on trust. Write it down. If a component can mint money, pretend it is a bank vault and treat its change process accordingly. Put numbers on your risk, cap it, and expect the worst day to arrive. When it does, your users will not thank you for fancy proofs. They will thank you for a system that fails small, pauses cleanly, and restarts with integrity intact.