Ethereum went ~30 minutes today without finalizing blocks. How bad is this? Was Ethereum down? Well, not really. Ethereum PoS favors liveness over safety. This means that instead of halting the chain, which is what would happen with BFT algorithms like those used in Cosmos or Solana, the chain keeps producing blocks but does not finalize them. We got close to 4 epochs without finality, at which point an inactivity leak, something extremely rare, would have kicked in.
The inactivity leak would have started to penalize validators that had dropped off (didn’t vote). All other validators who were voting correctly would have stopped receiving rewards for voting but wouldn’t have been penalized, and proposers would still get paid. The penalty for non-voting validators grows quadratically: the longer they go without voting, the larger their total penalty becomes. The goal of this design is essentially to kick out the non-participating validators without penalizing the participating ones. The more the non-voting validators are penalized, the lower their stake, and the less that stake counts against the 2/3 of votes needed to finalize a block. Eventually the participating validators hold 2/3 of the remaining stake and finality resumes. The chain keeps producing blocks the whole time.
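To make the quadratic leak concrete, here is a minimal sketch of the idea. It is a toy model, not the consensus-spec implementation: the function names, the `PENALTY_QUOTIENT` value, and the rule "per-epoch penalty is proportional to epochs spent offline" are simplifying assumptions chosen for illustration.

```python
# Toy model of an inactivity leak. NOT the consensus-spec implementation.
# PENALTY_QUOTIENT is an illustrative constant (assumption), not a value to rely on.
PENALTY_QUOTIENT = 2**24


def cumulative_loss_fraction(epochs_offline):
    """Fraction of stake a non-voting validator has lost after N epochs of leak."""
    balance = 1.0
    for t in range(1, epochs_offline + 1):
        # The per-epoch penalty grows linearly with time offline,
        # so the cumulative loss grows roughly quadratically.
        balance -= balance * t / PENALTY_QUOTIENT
    return 1.0 - balance


def simulate_leak(online_stake, offline_stake, max_epochs=10_000):
    """Leak offline stake until online validators hold 2/3 of the total.

    Returns (epochs_elapsed, remaining_offline_stake).
    Online validators are never penalized in this model.
    """
    epochs = 0
    while 3 * online_stake < 2 * (online_stake + offline_stake) and epochs < max_epochs:
        epochs += 1
        offline_stake -= offline_stake * epochs / PENALTY_QUOTIENT
    return epochs, offline_stake


if __name__ == "__main__":
    # Doubling the time offline roughly quadruples the loss.
    print(cumulative_loss_fraction(100), cumulative_loss_fraction(200))
    # 60% online, 40% offline: how long until finality can resume?
    print(simulate_leak(60.0, 40.0))
```

In this toy model, a 60/40 split takes on the order of a few thousand epochs to recover, and a chain where 2/3 or more of the stake is already voting never enters the loop at all. The real numbers depend on the actual protocol constants and on how many validators come back online in the meantime.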
What’s the worst case? Block reorgs. In BFT PoS, you don’t have reorgs because you have single slot finality (or ~32 slots/12 seconds for Solana), but if the above had happened on a Cosmos chain or Solana, it would have gone offline. In Ethereum’s case, if the inactivity leak is triggered by a client bug (which is what is suspected to have happened), then the validators running that client would be penalized. A block reorg here would be quite rare and unlikely if this was just a client bug. This is a fairly elegant design, and today was a good test/example of what happens in these situations.
I’ll wait for a post-mortem before commenting further, as at this point it’s mostly just speculation. But this is a big deal; it is not a “minor” issue. Going 30 minutes without finalizing blocks is not ideal.