Balancer Hack Explained

Ever since the first alert about the Balancer hack on Nov 3rd, our team had been waiting for a post-mortem or a deep-dive article explaining the issues in the BalancerV2 code base and the actual, low-level root cause of the exploit.

While many researchers had already published their own versions (some quite close and correct, such as Blocksec or Certora, and some simply wrong), we were still confused by how obscure most of them were, so we decided to publish this long and detailed writeup explaining precisely what went wrong.

Curve's invariant recap

To understand Balancer’s ComposableStablePool, it helps to recap the math behind Curve’s StableSwap invariant - the design Balancer's math is built on.
If you'd like to follow the exact math, a great explanation is available at RareSkills.

For now, let's focus on a single formula, the one that defines the StableSwap invariant:

\[ An^n \sum x_i + D = An^nD + \frac{D^{n+1}}{n^n \prod x_i} \]

\( n \) – number of coins in the pool
\( x_i \) – balance of coin \( i \) in the pool
\( A \) – amplification coefficient
\( D \) – pool invariant (a liquidity measure)
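
To make the formula concrete, here is a minimal floating-point sketch in Python that solves it for \(D\) with Newton's method. The function name `stableswap_invariant` and the sample balances are illustrative assumptions; on-chain implementations work in fixed-point integers with explicit rounding, which this sketch deliberately ignores.

```python
from math import prod


def stableswap_invariant(balances, amp, tol=1e-12, max_iter=255):
    """Solve A*n^n*sum(x) + D = A*n^n*D + D^(n+1) / (n^n * prod(x)) for D."""
    n = len(balances)
    s = sum(balances)
    if s == 0:
        return 0.0
    p = prod(balances)
    ann = amp * n**n
    d = float(s)  # the sum of balances is a good starting guess
    for _ in range(max_iter):
        d_p = d ** (n + 1) / (n**n * p)
        # Newton step for f(D) = d_p + (ann - 1) * D - ann * s
        d_next = (ann * s + n * d_p) * d / ((ann - 1) * d + (n + 1) * d_p)
        if abs(d_next - d) <= tol * d_next:
            return d_next
        d = d_next
    raise RuntimeError("Newton iteration did not converge")


# Hypothetical two-coin pools with A = 100
balanced = stableswap_invariant([100.0, 100.0], amp=100)    # approximately 200
imbalanced = stableswap_invariant([50.0, 150.0], amp=100)   # slightly below 200
print(balanced, imbalanced)
```

For a perfectly balanced pool \(D\) equals the sum of the balances; the more imbalanced the pool, the further \(D\) drops below that sum.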

So StableSwap is protected from draining by the invariant:
in an ideal situation (no rounding and no fees), \(D\) remains constant.
With trading fees and rounding, \(D\) increases slightly.
But it definitely should not decrease during swaps.

Right?

One more useful formula we'll need later is the virtual_price calculation. The key property is that the virtual price should never decrease, whatever operation is performed on the pool:

\[ \mathrm{virtualPrice} = \frac{D}{\mathrm{totalLPSupply}} \]

As you can see, it's tied to the \(D\) invariant as well.

How Balancer's stable pool works

Unlike Curve, where each pool contract holds its own tokens, Balancer uses a centralized Vault architecture that manages all token balances for every pool in the protocol.

source: https://docs-v2.balancer.fi/concepts/vault/#separating-token-accounting-and-pool-logic

One of the most powerful features enabled by the Vault architecture is batchSwap(), which executes all swap calculations across potentially multiple pools, accumulating net token deltas for each asset involved.
Only after all swaps are calculated does the Vault perform the actual token transfers based on these net deltas.

In addition, Balancer allows negative deltas on all tokens, a feature explicitly designed for arbitrageurs: they don't need to hold a positive balance or spend any tokens to perform a batchSwap that rewards them for identifying an imbalance.

That's how the attacker drained the pools without actually depositing any of their own tokens - cool, right?
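
The netting idea can be sketched in a few lines of Python. This is only an illustration of the accounting, not the Vault's code: the helper `net_deltas` and the swap amounts are hypothetical, and the sign convention assumed here is that a positive delta is owed to the Vault while a negative delta is paid out to the caller.

```python
from collections import defaultdict


def net_deltas(swaps):
    """Accumulate per-token net deltas for a list of already-priced swaps.

    Each swap is (token_in, amount_in, token_out, amount_out).
    """
    deltas = defaultdict(int)
    for token_in, amount_in, token_out, amount_out in swaps:
        deltas[token_in] += amount_in    # caller owes this to the Vault
        deltas[token_out] -= amount_out  # Vault owes this to the caller
    return dict(deltas)


# Hypothetical two-step batch: sell BPT for WETH, then buy the BPT back cheaper
swaps = [
    ("BPT", 100, "WETH", 95),
    ("WETH", 90, "BPT", 101),
]
deltas = net_deltas(swaps)
print(deltas)  # BPT: 100 - 101 = -1, WETH: 90 - 95 = -5
```

Because only the net result is settled, a batch whose deltas are all negative pays the caller out without requiring any tokens up front.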

Also, to better understand what follows: in ComposableStablePools, liquidity positions are represented by Balancer Pool Tokens (BPT), which are similar to classic LP tokens.
However, there is one difference: the BPT is itself included in the pool as a token. The pool pre-mints half of the maximum possible BPT supply (\(2^{111}\) tokens) at initialization and deposits it into the Vault.

This design makes it possible to treat deposits / withdrawals identically to regular swaps: when you "join" the pool by depositing a single token, you're actually executing a swap of that token for BPT that's sitting in the Vault.

What did the exploit look like

Let's start by unpacking this tx, where the attacker drained wETH / osETH and wETH / wstETH pools. All other pools were drained in a similar fashion.

Let's put aside preparation and calculation txs and jump right into the main stage of the drain: batchSwap() with 121 swaps inside.

The batchSwap resulted in the following state change of the pool contract (p.s. - you can simulate this exact transaction on a fork using our PoC script):

Initial state:
  Pool token balances:
    WETH: 4922356564867078856521 # 4922.3 WETH
    BPT: 2596148429267421974637745197985291
    osETH: 6851581236039298760900 # 6851.5 osETH
  Pool invariant: 12170931467218127689582
  Pool rate (virtual_price): 1027334468911956918
  Attacker internal balances:
    WETH: 0
    BPT: 0
    osETH: 0
  
Executing batchSwap with 121 swaps:
  Asset Deltas:
  WETH: -4623601508853283068500
  BPT: -44154666372672521145
  osETH: -6851122954230748094260
  
Final state:
  Pool token balances:
    WETH: 298755056013795788021 # 298.8 WETH
    BPT: 2596148429267377819971372525464146
    osETH: 458281808550666640 # 0.46 osETH
  Pool invariant: 240115638684764455726
  Pool rate (virtual_price): 20189496181073356 # ~2% of the initial value
  Attacker internal balances:
    WETH: 4623601508853283068500
    BPT: 44154666372672521145
    osETH: 6851122954230748094260

As you can see, all of the deltas in this batchSwap are negative, meaning that the attacker somehow got a ton of wETH and osETH for free and withdrew them from the pool as if they were an arbitrageur collecting a reward.

Look at the virtual_price we mentioned above - how could it fall so drastically while the math suggests it's impossible?
Well, the reason for that is the invariant somehow decreasing by 98%.
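
The numbers from the state dump above can be sanity-checked with plain integer arithmetic. The short Python sketch below (variable names are ours) confirms that every final balance equals the initial balance plus the reported delta, and computes how far the invariant and the pool rate fell.

```python
initial = {
    "WETH": 4922356564867078856521,
    "BPT": 2596148429267421974637745197985291,
    "osETH": 6851581236039298760900,
}
deltas = {
    "WETH": -4623601508853283068500,
    "BPT": -44154666372672521145,
    "osETH": -6851122954230748094260,
}
final = {
    "WETH": 298755056013795788021,
    "BPT": 2596148429267377819971372525464146,
    "osETH": 458281808550666640,
}

# Pool balances move by exactly the reported deltas
balances_consistent = all(initial[t] + deltas[t] == final[t] for t in initial)

# Ratio of final to initial invariant and pool rate
invariant_ratio = 240115638684764455726 / 12170931467218127689582
rate_ratio = 20189496181073356 / 1027334468911956918

print(balances_consistent)
print(f"invariant kept: {invariant_ratio:.2%}, rate kept: {rate_ratio:.2%}")
```

Both ratios come out just under 2%, which is the ~98% drop discussed above.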

Let's figure out exactly when it went wrong - for that, we will separate the whole batchSwap() into 3 phases.

Phase 1: swaps 1-22

The underlying swaps look like this:

1 swap: BPT -> wETH, amountOut = 4873132999218408001625
2 swap: BPT -> osETH, amountOut = 6783065423678905706961
3 swap: BPT -> wETH, amountOut = 48731329992184080017
4 swap: BPT -> osETH, amountOut = 67830654236789057069
...
21 swap: BPT -> osETH, amountOut = 6783
22 swap: BPT -> wETH, amountOut = 50

Before we proceed, it's worth mentioning that all of the individual swaps are of kind=1 (GIVEN_OUT), meaning that the caller specifies amountOut, the exact amount of tokens to receive, and the pool computes the amountIn to be paid. As you'll soon find out, that's not a coincidence.

As you can see, the idea here is to drain the pool's wETH and osETH balances by swapping them (virtually) for enormous amounts of BPT.
At every step, the amountOut is slightly lower than the remaining token balance (about 99% of it in the first swaps), shrinking that balance by a factor of ~100 each time until only dust is left: 67000 wei of each token.
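
A quick check of the first four amountOut values against the balances they draw from shows the pattern. This Python sketch assumes nothing else changes the wETH and osETH balances between these swaps; the variable names are ours.

```python
remaining = {
    "wETH": 4922356564867078856521,
    "osETH": 6851581236039298760900,
}
# (token out, amountOut) for swaps 1-4 of phase 1
first_swaps = [
    ("wETH", 4873132999218408001625),
    ("osETH", 6783065423678905706961),
    ("wETH", 48731329992184080017),
    ("osETH", 67830654236789057069),
]

fractions = []
for token, amount_out in first_swaps:
    # Share of the remaining balance taken by this swap
    fractions.append(amount_out / remaining[token])
    remaining[token] -= amount_out

print(fractions)
print(remaining)
```

Each of these swaps takes roughly 99% of what is left, so the balance of each token shrinks by about two orders of magnitude per swap.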

After performing all of these swaps, we end up with negative deltas for both wETH and osETH and an equivalently large positive BPT delta. Nothing's wrong just yet: virtual_price got a bit higher, all normal.

Phase 1 (1..22 swaps):
  state before swaps:
    pool balances:
      wETH:  4922356564867078856521
      BPT:   2596148429267421974637745197985291
      osETH: 6851581236039298760900
    pool invariant: 12170931467218127689582
    pool rate (virtual_price): 1027334468911956918
  state after swaps:
    pool balances:
      wETH:  67000
      BPT:   2596148429279270468626368919359654
      osETH: 67000
    pool invariant: 137892
    pool rate (virtual_price): 1029274992349090474
  asset deltas:
    wETH: -4922356564867078789521
    BPT: 11848493988623721374363
    osETH: -6851581236039298693900

Phase 2: swaps 23-112

Let's take a look at the next batch of swaps:

Phase 2 (23..112 swaps):
  state before swaps:
    pool balances:
      wETH:  67000
      BPT:   2596148429279270468626368919359654
      osETH: 67000
    pool invariant: 137892
    pool rate (virtual_price): 1029274992349090474
  state after swaps:
    pool balances:
      wETH:  889
      BPT:   2596148429279270468626368919359654
      osETH: 1472
    pool invariant: 2444
    pool rate (virtual_price): 18250218330832792 # here
  asset deltas:
    wETH: -66111
    BPT: 0
    osETH: -65528

Notice something?

All the magic happens in this phase: we once again get negative deltas for both wETH and osETH, but the BPT balance doesn't change!
Take a look at the virtual_price - it got much, much lower somehow.
But why?

Bear with me - there's one final phase of the batchSwap() left.

Phase 3: swaps 113-121

113 swap: wETH -> BPT, amountOut = 10000
114 swap: osETH -> BPT, amountOut = 10000000
115 swap: wETH -> BPT, amountOut = 10000000000
116 swap: osETH -> BPT, amountOut = 10000000000000
117 swap: wETH -> BPT, amountOut = 10000000000000000
118 swap: osETH -> BPT, amountOut = 10000000000000000000
119 swap: wETH -> BPT, amountOut = 10000000000000000000000
120 swap: osETH -> BPT, amountOut = 941319322493191942754
121 swap: wETH -> BPT, amountOut = 941319322493191942754

The attacker's goal at this point is to buy back the BPT sold in phase 1 and end up with a negative BPT delta as well. Thanks to the manipulation in phase 2, the attacker spends far less wETH / osETH than they normally would.

As a result, we get to the final state of the pool:

Phase 3 (113..121 swaps):
  state before swaps:
    pool balances:
      wETH:  889
      BPT:   2596148429279270468626368919359654
      osETH: 1472
    pool invariant: 2444
    pool rate (virtual_price): 18250218330832792
  state after swaps:
    pool balances:
      wETH:  298755056013795788021
      BPT:   2596148429267377819971372525464146
      osETH: 458281808550666640
    pool invariant: 240115638684764455726
    pool rate (virtual_price): 20189496181073356
  asset deltas phase 3:
    wETH: 298755056013795787132
    BPT: -11892648654996393895508
    osETH: 458281808550665168
  asset deltas phase 1-3:
    WETH: -4623601508853283068500 # all deltas are negative!
    BPT: -44154666372672521145
    osETH: -6851122954230748094260
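
As a consistency check, the per-phase deltas listed in the three dumps can be summed and compared with the totals above. The Python sketch below uses only numbers from those dumps; the variable names are ours.

```python
phase_deltas = [
    # phase 1: swaps 1-22
    {"wETH": -4922356564867078789521, "BPT": 11848493988623721374363, "osETH": -6851581236039298693900},
    # phase 2: swaps 23-112
    {"wETH": -66111, "BPT": 0, "osETH": -65528},
    # phase 3: swaps 113-121
    {"wETH": 298755056013795787132, "BPT": -11892648654996393895508, "osETH": 458281808550665168},
]

total = {"wETH": 0, "BPT": 0, "osETH": 0}
for phase in phase_deltas:
    for token, delta in phase.items():
        total[token] += delta

print(total)
```

The sums reproduce the final asset deltas exactly, and all three are negative: the BPT bought back in phase 3 slightly exceeds the BPT sold in phase 1, while most of the wETH and osETH taken in phase 1 is never returned.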

But how did this happen?

Is the issue in Balancer's StableMath?

The first thing that comes to mind is that something must be wrong with the pool's math. A couple of reports from other researchers mentioned issues in Balancer's StableMath component - is that it?

This contract is actually a modified Curve StableSwap, so we decided to check whether the math in Balancer's StableMath contract and Curve's StableSwap is equivalent.
More specifically:

Does StableMath._calculateInvariant() work the same way as StableSwap.get_D() ?
Does StableMath._getTokenBalanceGivenInvariantAndAllOtherBalances() work the same way as StableSwap.get_y() ?

We wrote and ran a couple of tests on various values (you can find the code here) and got the following comparison results:

_calculateInvariant() results match the ones from StableSwap.get_D()
_getTokenBalanceGivenInvariantAndAllOtherBalances() differs from get_y(): it returns higher values!

Balancer’s _getTokenBalanceGivenInvariantAndAllOtherBalances() returns larger values due to a different order of operations and rounding up, whereas Curve rounds down.

However, this is not the source of the vulnerability!
Rounding up is actually better for the pool income and does not lead to such dramatic virtual_price changes.

The real root cause

Let's start by identifying the exact swap in Phase 2 that makes the invariant decrease when it shouldn't:

...
Swap 23
  amount out: 66982
  state before swaps:
    pool balances:
      wETH:  67000
      BPT:   2596148429279270468626368919359654
      osETH: 67000
    pool invariant: 137892
  state after swaps:
    pool balances:
      wETH:  374353
      BPT:   2596148429279270468626368919359654
      osETH: 18
    pool invariant: 138955
  
Swap 24 # this one!
  amount out: 17 
  state before swaps:
    pool balances:
      wETH:  374353
      BPT:   2596148429279270468626368919359654
      osETH: 18
    pool invariant: 138955
  state after swaps:
    pool balances:
      wETH:  999845
      BPT:   2596148429279270468626368919359654
      osETH: 1
    pool invariant: 112404 # decreased
  
Swap 25
  amount out: 891000
  state before swaps:
    pool balances:
      wETH:  999845
      BPT:   2596148429279270468626368919359654
      osETH: 1
    pool invariant: 112404
  state after swaps:
    pool balances:
      wETH:  108845
      BPT:   2596148429279270468626368919359654
      osETH: 5183
    pool invariant: 113096
...

Here it is: swap #24 with the mysterious amountOut = 17.
So why 17? What's so special about it that it breaks the math?

The reason is that both StableMath._calculateInvariant() and StableMath._getTokenBalanceGivenInvariantAndAllOtherBalances() use Newton's method for their calculations.
When working with such small balances in an imbalanced pool, the amountOut has to be chosen precisely; otherwise the calculation fails and the whole transaction reverts.

Finally, we're here

If you've read up to this point, congrats - here's the real reason for the math failure in Balancer's ComposableStablePool!

The error is hidden deep in the specific way of processing rate-based tokens, such as osETH and wstETH: they require scaling, and their scalingFactor depends on the exchange rate of the token.

Take a look at the BaseGeneralPool._swapGivenOut() function:

abstract contract BaseGeneralPool is IGeneralPool, BasePool {
    ...
    function _swapGivenOut(
        SwapRequest memory swapRequest,
        uint256[] memory balances,
        uint256 indexIn,
        uint256 indexOut,
        uint256[] memory scalingFactors
    ) internal virtual returns (uint256) {
        _upscaleArray(balances, scalingFactors);
        // amountOut is upscaled with _upscale(), which rounds down
        swapRequest.amount = _upscale(swapRequest.amount, scalingFactors[indexOut]);

        uint256 amountIn = _onSwapGivenOut(swapRequest, balances, indexIn, indexOut);

        // amountIn is downscaled with rounding up
        amountIn = _downscaleUp(amountIn, scalingFactors[indexIn]);

        return _addSwapFeeAmount(amountIn);
    }
    ...
}

// from ScalingHelpers.sol
function _upscale(uint256 amount, uint256 scalingFactor) pure returns (uint256) {
    return FixedPoint.mulDown(amount, scalingFactor); // <-- rounds down!
}

source: https://github.com/balancer/balancer-v2-monorepo/blob/88842344fb5f44d8ed6f8f944acd3be80627df87/pkg/pool-utils/contracts/BaseGeneralPool.sol#L68

To calculate the amountIn from the amountOut (swapRequest.amount in the code above), the contract first upscales the amount using the _upscale() function.

And there it is: _upscale() rounds the amount down using FixedPoint.mulDown. Then, once the amountIn calculation is complete, the pool downscales the result to tell the user how much needs to be paid in.

At first, this seems quite intuitive: rounding the amountOut down looks necessary to avoid situations where users get more than they should.
However, that's wrong.

In reality, the user still receives exactly the amountOut they specified, while the amountIn is computed from the rounded-down value and therefore ends up lower than it should be.

By repeating such swaps, the attacker accumulates this loss, systematically paying in less than required, which lowers the invariant and the value of BPT. The attacker then swaps osETH and wETH back for BPT at a small fraction of its fair price (the pool rate had dropped to roughly 2% of its initial value).
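
Why the rounding direction matters so much at dust-level amounts can be sketched in Python. The helpers `mul_down` and `mul_up` mirror our understanding of 18-decimal fixed-point multiplication with floor and ceiling rounding; the scaling factor of 1.058 is a hypothetical rate chosen for illustration, not the actual osETH rate at the time of the attack.

```python
ONE = 10**18  # 18-decimal fixed-point unit


def mul_down(a, b):
    """Fixed-point multiply, rounding towards zero."""
    return (a * b) // ONE


def mul_up(a, b):
    """Fixed-point multiply, rounding up."""
    product = a * b
    return 0 if product == 0 else (product - 1) // ONE + 1


SCALING_FACTOR = 1_058_000_000_000_000_000  # hypothetical rate of 1.058

for amount_out in (17, 10**18 + 1):
    down = mul_down(amount_out, SCALING_FACTOR)
    up = mul_up(amount_out, SCALING_FACTOR)
    # Relative size of the rounding gap compared with the rounded-up value
    print(amount_out, down, up, (up - down) / up)
```

With this assumed rate, 17 upscales to 17.986, which rounds down to 17 but up to 18, a gap of more than 5% of the amount. For an amount around 1e18 the same one-unit gap is negligible, which is why the pool had to be pushed down to dust-level balances first.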

Proof

To prove that this is the true reason, we simulated swap #24 locally using both the original contract code (upscaling with rounding down) and a fixed version.

Here's a simple fix that would prevent this attack from happening:

abstract contract BaseGeneralPool is IGeneralPool, BasePool {
    ...

    function _swapGivenOut(
        SwapRequest memory swapRequest,
        uint256[] memory balances,
        uint256 indexIn,
        uint256 indexOut,
        uint256[] memory scalingFactors
    ) internal virtual returns (uint256) {
        _upscaleArray(balances, scalingFactors);
-       swapRequest.amount = _upscale(swapRequest.amount, scalingFactors[indexOut]);
+       swapRequest.amount = _upscaleUp(swapRequest.amount, scalingFactors[indexOut]);

        uint256 amountIn = _onSwapGivenOut(swapRequest, balances, indexIn, indexOut);

        amountIn = _downscaleUp(amountIn, scalingFactors[indexIn]);

        return _addSwapFeeAmount(amountIn);
    }

    ...
}

// add to ScalingHelpers.sol
+function _upscaleUp(uint256 amount, uint256 scalingFactor) pure returns (uint256) {
+    return FixedPoint.mulUp(amount, scalingFactor); // <-- rounds up
+}

Now take a look at the simulation here.
As you can see below, the attack works against the original Balancer code but not against the fixed version.

Simulation result before the fix:

Swap24
  before swap:
    Token 0: 374353
    Token 2: 18
    invariant: 136874
  balances after swap:
    Token 0: 999845
    Token 2: 1
    invariant: 112405 # decreases

Now with the fix:

Swap24
  before swap:
    Token 0: 374353
    Token 2: 18
    invariant: 136874
  balances after swap:
    Token 0: 1414765
    Token 2: 1
    invariant: 142295 # increases

Hooray! The fixed version swap is safe.
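
To put the two simulation outputs side by side, the short Python sketch below computes how much Token 0 the pool received for the same 17 units of Token 2 in each case, and how the invariant moved. It uses only the numbers printed above; the variable names are ours.

```python
token0_before = 374353
invariant_before = 136874

original = {"token0_after": 999845, "invariant_after": 112405}
fixed = {"token0_after": 1414765, "invariant_after": 142295}

# Token 0 received by the pool in exchange for the same amountOut
paid_original = original["token0_after"] - token0_before
paid_fixed = fixed["token0_after"] - token0_before

# Invariant after the swap relative to the invariant before it
invariant_change_original = original["invariant_after"] / invariant_before
invariant_change_fixed = fixed["invariant_after"] / invariant_before

print(paid_original, paid_fixed, paid_original / paid_fixed)
print(invariant_change_original, invariant_change_fixed)
```

In this simulation the original code charges only about 60% of what the fixed code charges for the same output, which is exactly the kind of underpayment that drags the invariant down.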

Conclusion

This is it. Hope you enjoyed the storytelling and now have a clear picture of the attack cause.
Kinda shocking that such a small detail resulted in such a tremendous loss of funds, right?

This is why it's extremely important to review how critical invariants behave during an audit: analyze the math in detail, and use fuzz testing and formal verification.

At Unvariant, we take a careful approach when it comes to math — we analyze everything in detail and back our conclusions with solid test cases.
Want to make your code safer and sleep better at night?

Get a quote via Telegram or email.