Introducing Crucible: An Invariant Fuzzing Framework for Solana

For years, a bug sat undetected in the official Solana Stake program. A specific sequence of five instructions could produce a phantom stake: delegation weight that persisted after the SOL it referenced was withdrawn. The bug has been patched, but its years-long survival points to a broader gap in our current infrastructure: neither unit tests nor integration tests are suited to the exponential space of instruction sequences, and bugs that live in that space can sit untouched for years. Finding them requires an invariant fuzzing framework that, with the aid of coverage feedback, mutates both inputs and sequences in order to systematically explore this search space.

We built Crucible to do exactly that. This post walks through the root cause, then shows Crucible rediscovering the sequence from scratch in seconds.

The Stake Program

The core Solana Stake program lets users delegate SOL to validators and earn inflation rewards proportional to the amount delegated. Stake is held in stake accounts; each tracks a Delegation which contains the amount of SOL deposited, the target validator, and the activation/deactivation epochs.

pub struct Delegation {
    pub voter_pubkey: Pubkey,
    pub stake: u64,
    pub activation_epoch: Epoch,
    pub deactivation_epoch: Epoch,
    // ...
}

The program has the following relevant instructions:

- delegate_stake — initializes the Delegation, setting delegation.stake to the account's balance and deactivation_epoch to u64::MAX (the sentinel for "active"). The delegation takes effect once the epoch advances.

- deactivate — sets deactivation_epoch to the current epoch and begins cooldown.

- withdraw — moves lamports out of the account, subject to the cooldown rules.

delegate_stake contains an interesting rescind path: if you deactivate and change your mind before the epoch ends, calling delegate_stake again in the same epoch cancels the cooldown immediately, with no waiting period.

The Bug

Here is the sequence mentioned in the opening, with the accounts each instruction takes:

delegate_stake(stake_account, vote_account, authority);   // X lamports delegated
advance_epoch();                                          // partial warmup (cluster warmup rate-limited)
deactivate(stake_account, authority);                     // begin cooldown
withdraw(stake_account, to_account, authority, amount);   // drain activating lamports
delegate_stake(stake_account, vote_account, authority);   // rescind — delegation.stake unchanged

The bug is an interaction between two pieces of code: process_withdraw's reserve calculation, and the rescind path inside the delegate_stake handler. Each behaves reasonably in isolation; together they violate an important security invariant.

process_withdraw decides how many lamports must remain reserved on the account:

let staked = if clock.epoch >= stake.delegation.deactivation_epoch {
    stake.delegation.stake(
        clock.epoch,
        stake_history,
        PERPETUAL_NEW_WARMUP_COOLDOWN_RATE_EPOCH,
    )
} else {
    stake.delegation.stake
};
let staked_and_reserve = checked_add(staked, meta.rent_exempt_reserve)?;

When the account is in or past its deactivation epoch, withdraw reserves the effective stake. Otherwise it reserves the full delegation.stake. The first branch operates correctly in normal use: at the moment of deactivation, a fully-warmed-up account has effective == delegation.stake, and effective then decreases over subsequent cooldown epochs as lamports legitimately become withdrawable. The interesting case is when the deactivation epoch lands on a partially-warmed-up account. Cluster-wide activation is rate-limited at 9% of total effective stake per epoch, and when that quota is saturated, an individual account's effective_stake lags behind its delegation.stake. If the holder calls deactivate in that state and immediately withdraws, the reserve calculation uses the small effective_stake, so the holder can pull everything above it, which includes the activating-but-not-yet-effective portion. By itself this is benign: the account is in cooldown, so divergence between on-chain stake and account lamports is expected while the cooldown unwinds.
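To make the two branches concrete, here is a small standalone model of the reserve rule. It is a sketch under stated assumptions, not the program's code: the function names are ours, `effective` stands in for the result of delegation.stake(...), the 2,282,880-lamport rent-exempt reserve is inferred from the figures later in this post, and the 1 SOL effective stake used in the example is a made-up value.

```rust
/// Simplified model of the reserve rule quoted above (illustrative only).
/// `effective` stands in for `delegation.stake(epoch, stake_history, ...)`.
pub fn staked_and_reserve(
    epoch: u64,
    deactivation_epoch: u64,
    delegated: u64,
    effective: u64,
    rent_exempt_reserve: u64,
) -> Option<u64> {
    // In or past the deactivation epoch: only the effective stake is reserved.
    // Otherwise: the full delegation.stake is reserved.
    let staked = if epoch >= deactivation_epoch {
        effective
    } else {
        delegated
    };
    staked.checked_add(rent_exempt_reserve)
}

/// Lamports the holder may withdraw once `reserve` is set aside.
pub fn withdrawable(balance: u64, reserve: u64) -> u64 {
    balance.saturating_sub(reserve)
}
```

With a 10 SOL balance and 9,997,717,120 lamports delegated, an active account (deactivation_epoch = u64::MAX) reserves the full 10,000,000,000 lamports and nothing can be withdrawn. If the same account deactivates while only 1 SOL is effective, the reserve drops to 1,002,282,880 lamports and 8,997,717,120 become withdrawable.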

The second piece is delegate_stake. When called on an already-delegated account, the handler delegates to an internal helper, redelegate_stake, which contains the relevant branch:

if stake.stake(
    epoch,
    stake_history,
    PERPETUAL_NEW_WARMUP_COOLDOWN_RATE_EPOCH,
) != 0
{
    if stake.delegation.voter_pubkey == *voter_pubkey
        && epoch == stake.delegation.deactivation_epoch
    {
        // rescind branch
        stake.delegation.deactivation_epoch = u64::MAX;
        return Ok(());
    } else {
        return Err(StakeError::TooSoonToRedelegate.into());
    }
}

The rescind branch fires when the new delegation targets the same voter and clock.epoch matches deactivation_epoch. It resets deactivation_epoch to active and leaves delegation.stake untouched. The path exists so a delegator can cancel a deactivation within the same epoch.

Walking the five-step sequence: at action 1 the user initially delegates and by action 2 the account is partially warmed up, with effective_stake some fraction of delegation.stake. Action 3 sets deactivation_epoch = clock.epoch. Action 4 withdraws while clock.epoch == deactivation_epoch, so the reserve uses the small effective_stake, and the holder drains the activating gap. Action 5 hits the rescind branch (same voter, same epoch), re-activates the stake account, and leaves delegation.stake at its pre-withdraw value. The account now carries delegation weight that no SOL backs. That gap is phantom stake: it counts toward the delegated validator's consensus weight, its likelihood of being selected as leader, and the epoch inflation rewards it earns. Each gapped account collects those rewards indefinitely, and the attack scales linearly across as many accounts as can be gapped during a single cluster-warmup window.

The fix landed in PR #198 (commit ff89b6b4). It deleted the redelegate_stake helper, inlined an equivalent if/else chain into the delegate_stake handler, and added a guard to the rescind branch:

if stake_amount < stake.delegation.stake {
    return Err(StakeError::InsufficientDelegation.into());
}
stake.delegation.deactivation_epoch = u64::MAX;

There is a one-line property that this bug violates:

fuzz_assert!(delegation.stake <= lamports);

An active stake account should never delegate more lamports than it holds. It's a short and intuitive invariant; what's interesting is finding the sequence that violates it.
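The interaction can be reproduced in a toy model of a single stake account. This is a deliberately simplified sketch, not the Stake program: ToyStake, ToyError, and five_step_sequence are hypothetical names, there is only one voter, warmup is a flat percentage per epoch, and we assume stake_amount in the guard means the account balance minus the rent-exempt reserve.

```rust
/// Toy model of one stake account (illustrative, not the program's code).
#[derive(Debug, Clone)]
pub struct ToyStake {
    pub lamports: u64,
    pub rent_exempt_reserve: u64,
    pub delegated: u64,          // models delegation.stake
    pub effective: u64,          // models the warmed-up (effective) stake
    pub deactivation_epoch: u64, // u64::MAX means "active"
    pub epoch: u64,              // toy clock
}

#[derive(Debug, PartialEq)]
pub enum ToyError {
    InsufficientFunds,
    InsufficientDelegation,
    TooSoonToRedelegate,
}

impl ToyStake {
    pub fn new(lamports: u64, rent_exempt_reserve: u64) -> Self {
        Self { lamports, rent_exempt_reserve, delegated: 0, effective: 0,
               deactivation_epoch: u64::MAX, epoch: 0 }
    }

    /// `guarded` toggles the check that the fix added to the rescind branch.
    pub fn delegate_stake(&mut self, guarded: bool) -> Result<(), ToyError> {
        // Assumption: stake_amount is the balance above the rent-exempt reserve.
        let stake_amount = self.lamports.saturating_sub(self.rent_exempt_reserve);
        if self.effective != 0 {
            if self.epoch != self.deactivation_epoch {
                return Err(ToyError::TooSoonToRedelegate);
            }
            // Rescind branch: delegation.stake is left untouched.
            if guarded && stake_amount < self.delegated {
                return Err(ToyError::InsufficientDelegation);
            }
            self.deactivation_epoch = u64::MAX;
            return Ok(());
        }
        // Fresh delegation.
        self.delegated = stake_amount;
        self.deactivation_epoch = u64::MAX;
        Ok(())
    }

    /// Advances one epoch; an active account warms up by a flat percentage.
    pub fn advance_epoch(&mut self, warmup_percent: u64) {
        self.epoch += 1;
        if self.deactivation_epoch == u64::MAX {
            let pct = warmup_percent.min(100) as u128;
            let step = (self.delegated as u128 * pct / 100) as u64;
            self.effective = self.effective.saturating_add(step).min(self.delegated);
        }
    }

    pub fn deactivate(&mut self) {
        self.deactivation_epoch = self.epoch;
    }

    /// Same reserve rule as process_withdraw, in simplified form.
    pub fn max_withdrawable(&self) -> u64 {
        let staked = if self.epoch >= self.deactivation_epoch {
            self.effective
        } else {
            self.delegated
        };
        self.lamports
            .saturating_sub(staked.saturating_add(self.rent_exempt_reserve))
    }

    pub fn withdraw(&mut self, amount: u64) -> Result<(), ToyError> {
        if amount > self.max_withdrawable() {
            return Err(ToyError::InsufficientFunds);
        }
        self.lamports -= amount;
        Ok(())
    }

    /// The invariant: an active account never delegates more than it holds.
    pub fn invariant_holds(&self) -> bool {
        let active = self.deactivation_epoch > self.epoch;
        !active || self.delegated <= self.lamports
    }
}

/// Runs the five-step sequence; returns the final state and the last result.
pub fn five_step_sequence(guarded: bool) -> (ToyStake, Result<(), ToyError>) {
    let mut s = ToyStake::new(10_000_000_000, 2_282_880);
    s.delegate_stake(guarded).expect("initial delegation succeeds in the toy model");
    s.advance_epoch(9); // partial warmup
    s.deactivate();
    let amount = s.max_withdrawable();
    s.withdraw(amount).expect("amount is within the computed limit");
    let last = s.delegate_stake(guarded);
    (s, last)
}
```

Calling five_step_sequence(false) ends with an active account whose delegated amount exceeds its lamports, so invariant_holds() is false. With five_step_sequence(true) the final call fails with InsufficientDelegation, the account stays in cooldown, and the invariant holds.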

Introducing Crucible

Finding the unknown sequence is the actual hard problem. The stake harness exposes twenty-plus actions. This bug hides behind a specific ordering of five of them. Before you factor in per-action parameters (which validator, how much to withdraw, which authority), you are searching for a specific path of length five through a graph of twenty-plus nodes. With parameters folded in, the effective space is astronomically larger, and random brute force will not converge on it.
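A back-of-the-envelope count shows the scale. The sketch below treats "twenty-plus" as exactly 20 actions, which is a lower bound, and the 100 parameter combinations per action are a made-up figure for illustration; both function names are ours.

```rust
/// Number of ordered sequences of `len` actions drawn from `actions`
/// choices (repetition allowed), or None on u64 overflow.
pub fn ordered_sequences(actions: u64, len: u32) -> Option<u64> {
    actions.checked_pow(len)
}

/// Same count when every action can also take `choices_per_action`
/// distinct parameter combinations.
pub fn with_parameters(actions: u64, choices_per_action: u64, len: u32) -> Option<u64> {
    actions.checked_mul(choices_per_action)?.checked_pow(len)
}
```

ordered_sequences(20, 5) is 3,200,000 orderings before any parameters are considered. With only 100 parameter combinations per action, with_parameters(20, 100, 5) is already 3.2 × 10^16.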

Finding bugs in this search space requires several components working together.

Writing a fuzz harness is tedious, even with frameworks like LiteSVM or Mollusk. Most of the code is account construction, signing, instruction encoding, and result parsing. Developers are reluctant to work on it, and the harnesses that do get written tend to cover a narrow slice of the program's surface, which leads to missed bugs making it to production. Crucible's TestContext collapses the plumbing so harness authors spend their time on the program semantics they actually want to test.

A fuzzer also needs to know which inputs are bringing it closer to interesting code or interesting states. Without that signal, exploration is random, and the input space of a non-trivial program is too large for random to converge. Crucible tracks sBPF edge coverage: sequences that reach previously-unseen edges survive into the corpus. In stateful mode, state coverage acts as a secondary signal, distinguishing inputs that reach the same code through different program states. Discovery compounds further with a scheduler that prioritizes rare inputs, as opposed to baseline uniform or random schedulers.

The mutator matters too. Coverage-guided fuzzing assumes small mutations produce small changes in program behavior, which is necessary for the fuzzer to incrementally discover new edges. Many harnesses bridge a byte-level mutator to typed inputs via the arbitrary crate, which decodes raw byte streams into structured values. While general and easy to use, this approach is inherently destructive to the incremental process: a one-byte flip can re-decode into a structurally different input. Crucible instead mutates where inputs actually have structure: the sequence (action ordering, length, composition) and each action's typed parameters, drawn from declared ranges and biased toward boundary values.

Finally, throughput. Reaching a buggy code path is the first condition for triggering a bug; the second is raw executions. With multi-core stateful fuzzing, Crucible runs in the tens of thousands of executions per second on real Solana programs, with peaks past 100k once coverage is saturated and register-level tracing is disabled.

How Crucible Works

TestContext API

TestContext is a LiteSVM wrapper whose job is to collapse SVM boilerplate, expose cheat codes for time and account state, and let a harness drive a program with one-line builder calls. Here's a single delegate_stake transaction, before and after:

Before - LiteSVM

let ix = Instruction {
    program_id: stake_id,
    accounts: vec![
        AccountMeta::new(stake_account, false),
        AccountMeta::new_readonly(vote_account, false),
        AccountMeta::new_readonly(authority.pubkey(), true),
        // ... sysvars
    ],
    data: serialize(&StakeInstruction::DelegateStake),
};
let mut tx = Transaction::new_unsigned(Message::new(&[ix], Some(&payer.pubkey())));
tx.sign(&[&payer, &authority], svm.latest_blockhash());
let result = svm.send_transaction(tx);
// parse result, extract error code, etc.

After - Crucible


ctx.program(stake_id)
    .call(DelegateStake { })
    .accounts(DelegateAccounts { stake_account, vote_account, authority })
    .signers(&[&payer, &authority])
    .send()?;

The builder handles signing, encoding, and result parsing. Adjacent helpers, such as ctx.create_account(), ctx.advance_epoch(), ctx.warp_to_slot(), and the account-reading and mint-setup utilities, cover the rest of what a harness needs. Across the stake harness's twenty-plus actions, this cuts boilerplate by 50–70%.

sBPF Edge Coverage

Using LiteSVM's register-level tracing, Crucible follows the sBPF execution trace and records newly-seen edges. An edge is a transition between basic blocks; any input that reaches a previously-unseen edge is added to the corpus and becomes a starting point for further mutation.
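The keep-or-discard rule can be sketched in a few lines. This is a generic illustration of edge-coverage feedback, not Crucible's internals: EdgeCoverage, Corpus, and offer are hypothetical names, and a trace is assumed to be the list of basic-block start addresses an execution visited.

```rust
use std::collections::HashSet;

/// Set of (from, to) basic-block transitions seen so far.
#[derive(Default)]
pub struct EdgeCoverage {
    seen: HashSet<(u64, u64)>,
}

impl EdgeCoverage {
    /// Records one execution trace, given as the sequence of basic-block
    /// start addresses it visited, and returns how many edges were new.
    pub fn record_trace(&mut self, blocks: &[u64]) -> usize {
        blocks
            .windows(2)
            .filter(|pair| self.seen.insert((pair[0], pair[1])))
            .count()
    }
}

/// Inputs that reached at least one previously-unseen edge.
pub struct Corpus<T> {
    pub entries: Vec<T>,
    coverage: EdgeCoverage,
}

impl<T> Corpus<T> {
    pub fn new() -> Self {
        Self { entries: Vec::new(), coverage: EdgeCoverage::default() }
    }

    /// Keeps `input` only if its trace contains a new edge.
    pub fn offer(&mut self, input: T, trace: &[u64]) -> bool {
        if self.coverage.record_trace(trace) > 0 {
            self.entries.push(input);
            true
        } else {
            false
        }
    }
}
```

An input whose trace repeats only known transitions is dropped, while one that takes a new branch out of a known block is kept and becomes a seed for later mutation.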

No program-side instrumentation is required, so Solana programs can be fuzzed regardless of the framework or language they were written in. Crucible generates the harness's typed call bindings from a standard Anchor-format IDL; the IDL is a soft requirement for producing useful call bindings. When a program is compiled with debug symbols, the recorded coverage can also be turned into LCOV reports viewable with genhtml.

Specialized Mutator

As noted earlier, coverage feedback assumes small mutations produce small behavioral changes, and this assumption breaks on Solana inputs. When arbitrary decodes a byte stream into a Vec<Action>, flipping a single byte, such as a discriminant or length field, can result in an entirely different call sequence.

Crucible's mutator skips the byte layer entirely. A corpus entry is a Vec<Action> of typed structs, and mutations operate on it directly. Parameter mutations rewrite one field of one action (for example, lamports: 1_000 → u64::MAX or vote_account: None → Some(2)), with numeric values biased toward boundaries (0, u64::MAX, power-of-two edges, declared #[range(..)] endpoints). Sequence mutations treat actions as atomic units: insert, delete, swap adjacent, splice a prefix from another corpus entry. Every mutation produces a structurally valid sequence of program calls.

The typed mutator gave roughly a 5x improvement in bug and coverage discovery rate over the arbitrary-based approach.
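A minimal sketch of a typed mutator along these lines follows. It is not Crucible's implementation: the Action enum, the constants, and the small xorshift generator are all hypothetical, and the splice mutation is left out for brevity. The point is that every mutation operates on whole actions or on one typed field, so the result is always a valid sequence.

```rust
pub const STAKE_ACCOUNTS: usize = 4; // hypothetical pool sizes
pub const VOTE_ACCOUNTS: usize = 2;
const BOUNDARIES: [u64; 5] = [0, 1, 1 << 32, u64::MAX - 1, u64::MAX];

#[derive(Debug, Clone, PartialEq)]
pub enum Action {
    Delegate { stake_account: usize, vote_account: usize },
    Withdraw { stake_account: usize, lamports: u64 },
    AdvanceSlots { slots: u16 },
}

/// Tiny xorshift64 generator so the sketch needs no external crates.
pub struct Rng(u64);

impl Rng {
    pub fn new(seed: u64) -> Self {
        Self(seed.max(1)) // xorshift state must be non-zero
    }
    pub fn next_u64(&mut self) -> u64 {
        self.0 ^= self.0 << 13;
        self.0 ^= self.0 >> 7;
        self.0 ^= self.0 << 17;
        self.0
    }
    /// Value in 0..n; callers must pass n > 0.
    pub fn below(&mut self, n: usize) -> usize {
        (self.next_u64() % n as u64) as usize
    }
}

fn random_action(rng: &mut Rng) -> Action {
    match rng.below(3) {
        0 => Action::Delegate {
            stake_account: rng.below(STAKE_ACCOUNTS),
            vote_account: rng.below(VOTE_ACCOUNTS),
        },
        1 => Action::Withdraw {
            stake_account: rng.below(STAKE_ACCOUNTS),
            lamports: rng.next_u64(),
        },
        _ => Action::AdvanceSlots { slots: rng.next_u64() as u16 },
    }
}

/// Rewrites one typed field, staying inside its declared range.
fn mutate_param(action: &mut Action, rng: &mut Rng) {
    match action {
        Action::Delegate { stake_account, vote_account } => {
            if rng.below(2) == 0 {
                *stake_account = rng.below(STAKE_ACCOUNTS);
            } else {
                *vote_account = rng.below(VOTE_ACCOUNTS);
            }
        }
        Action::Withdraw { lamports, .. } => {
            // Biased toward boundaries: half the time pick a boundary value.
            *lamports = if rng.below(2) == 0 {
                BOUNDARIES[rng.below(BOUNDARIES.len())]
            } else {
                rng.next_u64()
            };
        }
        Action::AdvanceSlots { slots } => *slots = [0, 1, u16::MAX][rng.below(3)],
    }
}

/// Applies one mutation: insert, delete, swap adjacent, or parameter rewrite.
pub fn mutate(seq: &mut Vec<Action>, rng: &mut Rng) {
    match rng.below(4) {
        0 => {
            let at = rng.below(seq.len() + 1);
            seq.insert(at, random_action(rng));
        }
        1 if seq.len() > 1 => {
            let at = rng.below(seq.len());
            seq.remove(at);
        }
        2 if seq.len() > 1 => {
            let at = rng.below(seq.len() - 1);
            seq.swap(at, at + 1);
        }
        _ if !seq.is_empty() => {
            let at = rng.below(seq.len());
            mutate_param(&mut seq[at], rng);
        }
        _ => seq.push(random_action(rng)),
    }
}

/// True if every index parameter is inside its declared range.
pub fn is_valid(action: &Action) -> bool {
    match action {
        Action::Delegate { stake_account, vote_account } => {
            *stake_account < STAKE_ACCOUNTS && *vote_account < VOTE_ACCOUNTS
        }
        Action::Withdraw { stake_account, .. } => *stake_account < STAKE_ACCOUNTS,
        Action::AdvanceSlots { .. } => true,
    }
}
```

No matter how many times mutate is applied, the sequence stays non-empty and every action passes is_valid, which is the property a byte-level decoder cannot guarantee across a single bit flip.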

Stateless and Stateful Modes

Crucible supports two fuzzing modes. They find the same class of bugs; they differ in how each iteration constructs the state under test.

Stateless mode. Each iteration clones the post-setup snapshot and executes a full mutated Vec<Action> against it. The mutator varies the sequence and each action's parameters as described above. Coverage feedback is per-sequence: sequences that reach a new edge survive into the corpus; others are discarded.

Stateful mode. Each iteration draws a live state from a coverage-indexed state pool and applies a single mutated action to it. If the resulting state reaches a new edge, it's snapshotted back into the pool for future iterations to mutate from. The fuzzer never re-executes a long chain to reach a deep state; it picks up where a previous iteration left off. State coverage distinguishes inputs that reach the same code through different program states and acts as a secondary signal in this mode. The tradeoff is memory: the pool grows with coverage, and held states must be cheap to clone.
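The stateful loop can be illustrated with a toy state pool. This is a sketch of the idea only: StatePool and step are hypothetical names, edges are plain integer ids, and the scheduling of which pooled state to pick is reduced to an index supplied by the caller.

```rust
use std::collections::HashSet;

/// Pool of snapshotted states, grown whenever an action reaches a new edge.
pub struct StatePool<S: Clone> {
    states: Vec<S>,
    seen_edges: HashSet<u64>,
}

impl<S: Clone> StatePool<S> {
    pub fn new(initial: S) -> Self {
        Self { states: vec![initial], seen_edges: HashSet::new() }
    }

    pub fn len(&self) -> usize {
        self.states.len()
    }

    /// One stateful iteration: clone a pooled state, apply a single action
    /// (which reports the edge ids it hit), and snapshot the result back
    /// into the pool only if at least one edge was new.
    pub fn step<F>(&mut self, index: usize, action: F) -> bool
    where
        F: FnOnce(&mut S) -> Vec<u64>,
    {
        let mut state = self.states[index % self.states.len()].clone();
        let edges = action(&mut state);
        let mut found_new = false;
        for edge in edges {
            found_new |= self.seen_edges.insert(edge);
        }
        if found_new {
            self.states.push(state);
        }
        found_new
    }
}
```

Because the deeper state is stored in the pool, a later iteration can start from it directly instead of replaying the chain of actions that produced it. The memory tradeoff is visible here too: every kept state is a full clone.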

Both modes scale across cores nearly linearly: workers share a coverage bitmap so edges discovered by one are immediately visible to the others. When tested on smaller programs like the official stake program on a MacBook Pro M3, we get the following performance:

Mode        Single-core     Multi-core (12 cores)   Multi-core (12 cores, no-tracing)
Stateless   1,200 exec/s    9,600 exec/s            22,000 exec/s
Stateful    8,200 exec/s    67,000 exec/s           130,000 exec/s
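One way to share a coverage bitmap between workers is a vector of atomic flags. The sketch below uses only the standard library and is an assumption about how such sharing could work, not a description of Crucible's implementation; SharedBitmap and run_workers are hypothetical names.

```rust
use std::sync::atomic::{AtomicBool, AtomicUsize, Ordering};
use std::thread;

/// Coverage bitmap that many worker threads can update without a lock.
pub struct SharedBitmap {
    slots: Vec<AtomicBool>,
}

impl SharedBitmap {
    pub fn new(size: usize) -> Self {
        assert!(size > 0, "bitmap needs at least one slot");
        Self { slots: (0..size).map(|_| AtomicBool::new(false)).collect() }
    }

    /// Marks `edge` as seen. Returns true only for the first caller across
    /// all threads, so each edge is counted as a discovery exactly once.
    pub fn mark(&self, edge: usize) -> bool {
        !self.slots[edge % self.slots.len()].swap(true, Ordering::Relaxed)
    }
}

/// Spawns `workers` threads that all report the same edges and returns the
/// total number of discoveries credited across them.
pub fn run_workers(workers: usize, edges: &[usize], bitmap: &SharedBitmap) -> usize {
    let discovered = AtomicUsize::new(0);
    thread::scope(|scope| {
        for _ in 0..workers {
            scope.spawn(|| {
                for &edge in edges {
                    if bitmap.mark(edge) {
                        discovered.fetch_add(1, Ordering::Relaxed);
                    }
                }
            });
        }
    });
    discovered.load(Ordering::Relaxed)
}
```

Four workers reporting the edges [1, 2, 3, 2, 1] credit exactly three discoveries in total, whichever thread gets to each edge first, and any worker that reaches those edges afterwards sees them as already covered.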

Building the Harness

Now that we understand the stake bug and how Crucible works, here's how to replicate this bug from scratch.

Install the CLI:

> cargo install --git https://github.com/asymmetric-research/crucible crucible-fuzz-cli

We need a vulnerable stake program binary to fuzz against. The fix landed in early December 2025 (PR #198, commit ff89b6b4). Check out the parent commit and build:

git clone https://github.com/solana-program/stake.git
cd stake/program
git checkout 16ad96b5  # last commit before the fix
cargo build-sbf

This produces the compiled sBPF binary that LiteSVM will load: target/deploy/solana_stake_program.so.

Now scaffold the harness:

> crucible init stake-demo

The scaffold creates a standalone Cargo workspace at fuzz/stake-demo/. It's standalone to stay compatible with arbitrary Solana versions.

Drop the .so we just built into fuzz/stake-demo/, and drop an Anchor-format IDL for the stake program into idls/stake_anchor.json. Any Solana program with a standard IDL can be plugged in here to generate compatible call bindings. In the harness itself, the IDL is pulled in with a macro that produces typed instruction, accounts, and types modules at compile time:

crucible_idl_gen::declare_fuzz_program!("idls/stake_anchor.json");
use stake::{accounts, instruction, types};

The first pass simply contains a fixture struct, a setup, a handful of action signatures, and an invariant block. No implementations yet:

#[derive(Clone)]
struct StakeFuzzFixture {
    ctx: TestContext,
    // protocol-specific state goes here
}

#[fuzz_fixture]
impl StakeFuzzFixture {
    pub fn setup() -> Self { todo!() }

    pub fn action_advance_slots(&mut self, slots: u16) { todo!() }
    pub fn action_delegate_stake(&mut self, /* ... */) { todo!() }
    pub fn action_deactivate(&mut self, /* ... */) { todo!() }
    pub fn action_withdraw(&mut self, /* ... */) { todo!() }
}

#[invariant_test]
fn invariant_test(fixture: &mut StakeFuzzFixture) { todo!() }

The only field the framework itself requires is ctx: TestContext which wraps LiteSVM. Everything else is protocol-specific and we'll add it as we go. The #[fuzz_fixture] attribute auto-discovers any method prefixed action_ and wires it into the fuzzer's mutator so those methods can be called in randomized sequences.

Now we fill in setup(). This runs once and is snapshotted as the initial state every iteration starts from. Here we load the program, mint an admin, spin up vote and authority accounts, and initialize a couple of stake accounts. As we add each piece, we need somewhere to hold it between actions, and that's where the rest of the fixture struct comes from:

struct StakeFuzzFixture {
    ctx:                TestContext,            // LiteSVM wrapper
    program_id:         Pubkey,                 // cached stake program id
    admin:              Rc<Keypair>,            // funder + custodian signer
    sink:               Rc<Keypair>,            // recipient for withdrawals
    stake_accounts:     Vec<FuzzStakeAccount>,  // accounts the invariant iterates over
    vote_accounts:      Vec<Rc<Keypair>>,       // delegation targets
    authority_accounts: Vec<Rc<Keypair>>,       // staker / withdrawer signers
}

impl StakeFuzzFixture {
    pub fn setup() -> Self {
        let mut ctx = TestContext::new();
        ctx.add_program(&STAKE_PROGRAM_ID, "solana_stake_program.so").unwrap();

        // fund an admin
        let admin = Rc::new(Keypair::new());
        ctx.create_account()
            .pubkey(admin.pubkey())
            .lamports(100_000_000_000_000_000)
            .owner(system_program::ID)
            .create()
            .unwrap();

        // ... vote_accounts, authority_accounts ...

        // create and initialize each stake account
        let stake_accounts = (0..STAKE_ACCOUNTS).map(|i| {
            let keypair = Rc::new(Keypair::new());
            let lamports = 10_000_000_000u64 * (i as u64 + 1);

            ctx.create_account()
                .pubkey(keypair.pubkey())
                .lamports(lamports)
                .size(StakeStateV2::size_of())
                .owner(STAKE_PROGRAM_ID)
                .create()
                .unwrap();

            ctx.program(STAKE_PROGRAM_ID)
                .call(instruction::Initialize { authorized, lockup })
                .accounts(accounts::Initialize { stake: keypair.pubkey() })
                .signers(&[&*admin])
                .send()
                .expect("setup TX must not fail");

            FuzzStakeAccount { keypair, /* ... */ }
        }).collect();

        Self { ctx, admin, stake_accounts, /* ... */ }
    }
}

create_account() and program().call().accounts().signers().send() are TestContext's two main builders: one for raw account creation, one for instruction calls. Each replaces ~10–15 lines of manual AccountMeta construction, transaction signing, and result parsing.

Actions wrap a single program call plus any harness-side bookkeeping needed to keep the fixture coherent:

pub fn action_delegate_stake(
    &mut self,
    // #[range(..)] constrains the fuzzer's randomized value to the given range
    #[range(0..STAKE_ACCOUNTS)]     stake_account: usize,
    #[range(0..VOTE_ACCOUNTS)]      vote_account:  usize,
    #[range(0..AUTHORITY_ACCOUNTS)] authority:     usize,
) {
    self.ctx
        .program(self.program_id)
        .call(instruction::DelegateStake {})
        .accounts(accounts::DelegateStake {
            stake:           self.get_stake_pubkey(stake_account),
            vote:            self.vote_accounts[vote_account].pubkey(),
            unused:          STAKE_CONFIG_ID,
            stake_authority: self.authority_accounts[authority].pubkey(),
        })
        .signers(&[&*self.authority_accounts[authority]])
        .send()
        .ok();
}

The fuzzer provides usize values it has mutated within the declared range, which we use as indices into the fixture's keypair arrays. With only a few vote and authority accounts in the pool, the fuzzer naturally re-picks the same index as a prior delegation often enough to hit the rescind branch, which is exactly the precondition this bug needs.

action_withdraw shows the same idea for numeric parameters. The fuzzer generates a raw u64 withdrawal amount, and the harness clamps it into a range the stake program will accept.

pub fn action_withdraw(
    &mut self,
    #[range(0..STAKE_ACCOUNTS)]     stake_account: usize,
    #[range(0..AUTHORITY_ACCOUNTS)] authority:     usize,
    lamports:       u64,   // fuzzer-mutated, clamped to balance below
    leave_reserve:  bool,
) {
    let stake_pubkey = self.get_stake_pubkey(stake_account);
    let balance = self.ctx.svm.get_balance(&stake_pubkey).unwrap_or(0);
    let reserve = Rent::default()
        .minimum_balance(StakeStateV2::size_of())
        .saturating_add(LAMPORTS_PER_SOL);

    let cap = if leave_reserve { balance.saturating_sub(reserve) } else { balance };
    let amount = lamports.min(cap); // clamped withdrawal amount

    // send the Withdraw instruction with the clamped amount ...
}
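The clamping logic is easy to check in isolation. The function below restates it as a pure function; clamp_withdrawal is a hypothetical name, and the 2,282,880-lamport rent minimum is inferred from the figures in the finding shown later in this post rather than read from Rent::default().

```rust
pub const LAMPORTS_PER_SOL: u64 = 1_000_000_000;

/// Clamps a fuzzer-generated withdrawal amount to what the account can pay.
/// With `leave_reserve`, the rent minimum plus one SOL stays in the account.
pub fn clamp_withdrawal(
    lamports: u64,
    balance: u64,
    rent_minimum: u64,
    leave_reserve: bool,
) -> u64 {
    let reserve = rent_minimum.saturating_add(LAMPORTS_PER_SOL);
    let cap = if leave_reserve {
        balance.saturating_sub(reserve)
    } else {
        balance
    };
    lamports.min(cap)
}
```

For a 10 SOL account, a near-u64::MAX request with leave_reserve = true is clamped to 8,997,717,120 lamports, leaving 1,002,282,880 behind. Small requests pass through unchanged, and an account that holds less than the reserve clamps to zero.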

Once the rest of our four actions are defined, the last piece is implementing the property they're tested against:

#[invariant_test]
fn invariant_test(fixture: &mut StakeFuzzFixture) {
    let clock = fixture.ctx.svm.get_sysvar::<Clock>();
    for idx in 0..STAKE_ACCOUNTS {
        let pubkey = fixture.get_stake_pubkey(idx);
        let Ok(account) = fixture.ctx.get_account(&pubkey) else { continue };
        let Ok(state) = account.deserialize_data::<StakeStateV2>() else { continue };
        let Some(stake) = state.stake() else { continue };

        // Only check fully-active stake. During cooldown, delegation.stake
        // legitimately exceeds lamports as stake drains over multiple epochs.
        let is_active = stake.delegation.deactivation_epoch == u64::MAX
            || stake.delegation.deactivation_epoch > clock.epoch;
        if !is_active { continue }

        fuzz_assert!(
            stake.delegation.stake <= account.lamports,
            "stake > lamports on account {}: stake={} lamports={}",
            idx, stake.delegation.stake, account.lamports,
        );
    }
}

The invariant only checks fully-active accounts. Accounts that don't exist, can't deserialize, or aren't in the Stake variant are skipped, as are accounts in or past their deactivation epoch. During cooldown, delegation.stake > lamports is expected behavior, since stake drains over multiple epochs while lamports become withdrawable immediately. Without that filter, every cooling-down account would trip the assertion with false positives.

Now we run it:

crucible run stake-demo invariant_test --release --stop-on-crash

On a single core, this harness runs at around 1,400 exec/s. Every sequence that reaches new sBPF edges survives into the corpus, and the frontier expands. A couple of seconds in, Crucible prints:

[FUZZ_FINDING] stake > lamports on account 0: stake=9997717120 lamports=1002282880

=== FUZZ SEQUENCE (8 executed, 0 skipped) ===
  1. delegate_stake(stake_account=0, vote_account=0, authority=0) -> OK
  2. advance_slots(slots=23498) -> OK
  3. deactivate(stake_account=0, authority=0) -> OK
  4. withdraw(stake_account=0, lamports=65535, leave_reserve=true, authority=0) -> OK
  5. withdraw(stake_account=0, lamports=18446744073709551614, leave_reserve=true, authority=0) -> OK
  6. delegate_stake(stake_account=1, vote_account=1, authority=0) -> OK
  7. deactivate(stake_account=0, authority=0) -> OK
  8. delegate_stake(stake_account=0, vote_account=0, authority=0) -> OK [VIOLATION]

delegation.stake records 9,997,717,120 lamports (≈10.00 SOL), while the account holds 1,002,282,880 lamports (≈1.00 SOL). The ~9 SOL gap is phantom stake with no SOL behind it. The fuzzer found the path but didn't minimize it: three of the eight actions are redundant.
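The size of the gap follows directly from the two figures in the finding. As a quick check, with helper names of our own choosing:

```rust
pub const LAMPORTS_PER_SOL: u64 = 1_000_000_000;

/// Delegated stake that is not backed by lamports in the account.
pub fn phantom_stake(delegated: u64, lamports: u64) -> u64 {
    delegated.saturating_sub(lamports)
}

/// Converts lamports to SOL for display purposes.
pub fn lamports_to_sol(lamports: u64) -> f64 {
    lamports as f64 / LAMPORTS_PER_SOL as f64
}
```

phantom_stake(9_997_717_120, 1_002_282_880) is 8,995,434,240 lamports, or about 8.995 SOL.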

crucible tmin minimizes the crash to the smallest reproducing sequence, and crucible show prints it:


1. delegate_stake(stake_account=0, vote_account=0, authority=0) -> OK
2. advance_slots(slots=23498) -> OK
3. deactivate(stake_account=0, authority=0) -> OK
4. withdraw(stake_account=0, lamports=18446744073709551614, leave_reserve=true, authority=0) -> OK
5. delegate_stake(stake_account=0, vote_account=0, authority=0) -> OK

These are the same five instructions from the opening of this post, rediscovered from an empty corpus in seconds.
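Sequence minimization of this kind can be sketched as a greedy loop that drops one action at a time and keeps the shorter sequence whenever the failure still reproduces. This is a generic illustration, not the algorithm crucible tmin actually uses. In a real run the predicate would replay the candidate against the setup snapshot and check the invariant; the contains_in_order helper here is only a stand-in so the example runs on its own.

```rust
/// Greedy minimization: repeatedly try removing each element and keep the
/// removal whenever `still_fails` reports that the failure reproduces.
/// Passes repeat until a full pass removes nothing.
pub fn minimize<T: Clone, F>(sequence: &[T], mut still_fails: F) -> Vec<T>
where
    F: FnMut(&[T]) -> bool,
{
    let mut current = sequence.to_vec();
    loop {
        let before = current.len();
        let mut i = 0;
        while i < current.len() {
            let mut candidate = current.clone();
            candidate.remove(i);
            if still_fails(&candidate) {
                current = candidate; // element i was redundant; retry index i
            } else {
                i += 1; // element i is needed; move on
            }
        }
        if current.len() == before {
            return current;
        }
    }
}

/// Stand-in failure predicate: true if `needle` occurs in `haystack` as an
/// ordered (not necessarily contiguous) subsequence.
pub fn contains_in_order<T: PartialEq>(haystack: &[T], needle: &[T]) -> bool {
    let mut remaining = haystack.iter();
    needle.iter().all(|n| remaining.any(|h| h == n))
}
```

Applied to an eight-step sequence shaped like the one the fuzzer reported, with the five essential steps as the stand-in failure condition, the loop removes the small withdrawal, the unrelated delegation, and the repeated deactivate, leaving five actions.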

The four-action demo is the smallest version that still finds the bug. The full stake harness has all twenty-plus instructions the stake program exposes: split, merge, authorize, initialize_checked, set_lockup, and the rest. Adding actions exponentially enlarges the search space, but it also lets the fuzzer find bugs that require any combination of them, not just the four we handpicked. For larger harnesses Crucible offers two performance flags:

crucible run stake-demo invariant_test --release --stateful --cores 12 --stop-on-crash

--stateful switches to stateful mode (described above). --cores 12 adds eleven more workers sharing the coverage bitmap. Together, throughput on this harness reaches ~67,000 exec/s, and the twenty-action production harness still finds the phantom stake bug in seconds.

--no-tracing disables sBPF register tracing entirely, which pushes throughput past 100,000 exec/s at the cost of edge coverage guidance. It is useful late in a campaign, once coverage has saturated and the fuzzer is mostly re-exploring known paths.

What This Means

We caught this bug in a manual review, years after it went live. It survived SIMD review, mainnet traffic, and Agave's regression suite because none of those enumerate the space of instruction sequences. A five-step path through a twenty-action program isn't a case anyone writes a unit test for. The invariant (delegation.stake <= lamports) is concise and intuitive. Discovering a sequence that violates it is the hard part, and searching that space is why we built Crucible.

Crucible: https://github.com/asymmetric-research/crucible
Full harness: https://github.com/asymmetric-research/stake-demo