hotstuff_rs::app

Trait App

pub trait App<K: KVStore>: Send {
    // Required methods
    fn produce_block(
        &mut self,
        request: ProduceBlockRequest<'_, K>,
    ) -> ProduceBlockResponse;
    fn validate_block(
        &mut self,
        request: ValidateBlockRequest<'_, '_, K>,
    ) -> ValidateBlockResponse;
    fn validate_block_for_sync(
        &mut self,
        request: ValidateBlockRequest<'_, '_, K>,
    ) -> ValidateBlockResponse;
}

Trait implemented by applications that are to be replicated by HotStuff-rs.

§Determinism requirements

In order for the state machine to be correctly replicated across all replicas, types that implement App must ensure that their implementations of produce_block, validate_block, and validate_block_for_sync are deterministic. That is, for any particular request (e.g., ProduceBlockRequest), a function must always produce the same response (ProduceBlockResponse), regardless of whether the machine executing the call has an Intel CPU or an AMD CPU, whether a pseudorandom number generator spits out 0 or 1, or whether it is currently raining outside, etc.

The key to making sure that these methods are deterministic is ensuring that their respective method bodies depend on and only on the information that is available through the public methods of their request type. In particular, this means not reading the block tree through a BlockTreeSnapshot, but reading it only through AppBlockTreeView.
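As a rough illustration of "depend on and only on the request", the sketch below uses hypothetical stand-in types (`ToyRequest` and `ToyResponse` are not part of HotStuff-rs) and computes its response purely from the fields of the request it is given.

```rust
/// Hypothetical stand-in for a request type; NOT the HotStuff-rs `ProduceBlockRequest`.
struct ToyRequest {
    /// Counter value as of the parent block.
    parent_counter: u64,
    /// Increments carried by the pending transactions.
    increments: Vec<u64>,
}

/// Hypothetical stand-in for a response type.
#[derive(Debug, PartialEq, Eq)]
struct ToyResponse {
    new_counter: u64,
}

/// Deterministic: the result depends only on the fields of `request`.
/// It does not read the system clock, a random number generator, or any
/// other state that could differ between replicas.
fn produce_deterministically(request: &ToyRequest) -> ToyResponse {
    let new_counter = request
        .increments
        .iter()
        // `wrapping_add` behaves identically on every platform, even on overflow.
        .fold(request.parent_counter, |acc, inc| acc.wrapping_add(*inc));
    ToyResponse { new_counter }
}
```

Calling `produce_deterministically` twice with equal requests yields equal responses on any replica. The same would not hold if the body consulted, say, `std::time::SystemTime::now()`.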

§Timing requirements

HotStuff-rs calls produce_block when a replica has to produce a block, and calls validate_block when a replica has to validate a block.

These calls are blocking (synchronous), so a replica will not make progress until they complete. Because of this, library users should implement produce_block and validate_block to satisfy a specific timing constraint, so that replicas do not spend too long in these functions, causing views to time out and progress to stall. This timing constraint is exactly:

4 * EWNL + produce_block_duration + validate_block_duration < max_view_time,

where:

  • EWNL: “Expected Worst-case Network Latency”. That is, the maximum duration of time needed to send a message from a validator to another validator under “normal” network conditions.
  • produce_block_duration: how long produce_block takes to execute.
  • validate_block_duration: how long validate_block takes to execute.
  • max_view_time: the provided Configuration::max_view_time.
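The inequality can be checked mechanically once the four quantities have been measured or estimated. The helper below is a minimal sketch; the function name `satisfies_timing_constraint` is hypothetical and not part of HotStuff-rs.

```rust
use std::time::Duration;

/// Returns `true` if
/// `4 * EWNL + produce_block_duration + validate_block_duration < max_view_time`.
///
/// Hypothetical helper for illustration; not part of the HotStuff-rs API.
fn satisfies_timing_constraint(
    ewnl: Duration,
    produce_block_duration: Duration,
    validate_block_duration: Duration,
    max_view_time: Duration,
) -> bool {
    // Left-hand side of the timing constraint.
    let worst_case_time_in_view = ewnl * 4 + produce_block_duration + validate_block_duration;
    // The inequality is strict, so equality does not satisfy it.
    worst_case_time_in_view < max_view_time
}
```

For example, with EWNL = 200 ms and both block durations at 300 ms, the left-hand side is 1400 ms, so a max_view_time of 2 s satisfies the constraint, while an EWNL of 500 ms (left-hand side 2600 ms) does not.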

§Rationale behind timing requirements

Let’s say a view begins at time T for the proposer of the view, and assume that:

  • Messages take EWNL to deliver between validators,
  • The time taken to process messages is negligible besides produce_block_duration and validate_block_duration, and
  • Every step begins after the previous step has ended. This is a simplifying assumption: in reality, for example, the Proposer does not have to wait for replicas to receive AdvanceView before it starts producing a block, so step 3 should really happen at T+max(EWNL, produce_block_duration) instead of at T+EWNL+produce_block_duration as below.

Then the view will proceed as follows:

Step No. | Time after T (cumulative) | Events
1 | +0 | Proposer enters view. Proposer broadcasts AdvanceView.
2 | +EWNL | Replicas receive AdvanceView. Replicas enter view.
3 | +produce_block_duration | Proposer broadcasts Proposal.
4 | +EWNL | Replicas receive Proposal.
5 | +validate_block_duration | Replicas send PhaseVotes.
6 | +EWNL | Next Leader collects PhaseCertificate. Next Leader leaves view. Next Leader broadcasts AdvanceView. Next Leader broadcasts Proposal.
7 | +EWNL | Replicas receive AdvanceView. Replicas leave view.

In the view above (and indeed in any view), there are two possible cases about the identities of Proposer and Next Leader:

  1. If Proposer == Next Leader, then:
    1. The Proposer/Next Leader enters the view at step no. 1 and leaves at step no. 6, spending 3EWNL + produce_block_duration + validate_block_duration in the view.
    2. Other replicas enter the view at step no. 2 and leave at step no. 7, spending 3EWNL + produce_block_duration + validate_block_duration in the view.
  2. If Proposer != Next Leader, then:
    1. Proposer enters the view at step no. 1 and leaves at step no. 7, spending 4EWNL + produce_block_duration + validate_block_duration in the view.
    2. Next Leader enters the view at step no. 2 and leaves at step no. 6, spending 2EWNL + produce_block_duration + validate_block_duration in the view.
    3. Other replicas enter the view at step no. 2 and leave at step no. 7, spending 3EWNL + produce_block_duration + validate_block_duration in the view.

This shows that the most time a replica will spend in a view under normal network conditions is 4EWNL + produce_block_duration + validate_block_duration (case 2.1), which is exactly the left-hand side of the timing constraint inequality.
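The case analysis above can be reproduced numerically. The sketch below accumulates the per-step increments from the step table and subtracts the entry time from the exit time for each role; the function names `step_times` and `time_in_view` are hypothetical and used only for illustration.

```rust
use std::time::Duration;

/// Cumulative time after T at which each of the seven steps happens
/// (index 0 corresponds to step no. 1), under the simplifying assumptions above.
fn step_times(ewnl: Duration, produce: Duration, validate: Duration) -> [Duration; 7] {
    // Per-step increments, in the order they appear in the step table.
    let increments = [Duration::ZERO, ewnl, produce, ewnl, validate, ewnl, ewnl];
    let mut times = [Duration::ZERO; 7];
    let mut elapsed = Duration::ZERO;
    for (i, increment) in increments.iter().enumerate() {
        elapsed += *increment;
        times[i] = elapsed;
    }
    times
}

/// Time spent in the view by a replica that enters at `enter_step` and
/// leaves at `leave_step` (both 1-indexed step numbers).
fn time_in_view(times: &[Duration; 7], enter_step: usize, leave_step: usize) -> Duration {
    times[leave_step - 1] - times[enter_step - 1]
}
```

With EWNL = 100 ms, produce_block_duration = 50 ms, and validate_block_duration = 30 ms, `time_in_view(&times, 1, 7)` (case 2.1) gives 480 ms, which equals 4EWNL + 80 ms, while `time_in_view(&times, 2, 6)` (case 2.2) gives 280 ms.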

Required Methods§


fn produce_block( &mut self, request: ProduceBlockRequest<'_, K>, ) -> ProduceBlockResponse

Called by HotStuff-rs when the replica becomes a leader and has to produce a new Block to be inserted into the block tree and proposed to other validators.


fn validate_block( &mut self, request: ValidateBlockRequest<'_, '_, K>, ) -> ValidateBlockResponse

Called by HotStuff-rs when the replica receives a Proposal and has to validate the Block inside it to decide whether or not it should insert it into the block tree and vote for it.


fn validate_block_for_sync( &mut self, request: ValidateBlockRequest<'_, '_, K>, ) -> ValidateBlockResponse

Called by HotStuff-rs when the replica is syncing, receives a BlockSyncResponse, and has to validate the Block inside it to decide whether or not it should insert it into the block tree.

§Purpose

There are several reasons why implementations of App may want calls to validate_block to take at least a certain amount of time.

For example, implementors may want to limit the rate at which views proceed (e.g., to prevent the block tree from growing too quickly and exhausting disk space), and may choose to do so by making their implementations of produce_block and validate_block sleep the thread until a minimum amount of time has elapsed.

Such a thread sleep slows down consensus decisions by, among other things, causing replicas to block for that amount of time when they receive a block through a Proposal, which delays the sending of a PhaseVote to the next leader.

However, such a solution will not only slow down consensus decisions, it will also slow down Block Sync. In general, implementors will want validate_block to take at least a minimum amount of time when it is called within a view, but complete as quickly as possible when called during block sync.

This is where validate_block_for_sync comes in. With validate_block_for_sync, implementors can rely on the fact that validate_block will only be called within a view, so adding a thread sleep or other delaying mechanism to it will not slow down sync, as long as the mechanism is not also added to validate_block_for_sync.
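The split might look roughly like the sketch below. It uses a simplified, hypothetical trait (`ToyValidator`) in place of App, with byte slices and booleans standing in for the real request and response types, so it shows only the pattern, not the HotStuff-rs signatures.

```rust
use std::thread;
use std::time::{Duration, Instant};

/// Hypothetical, simplified stand-in for the two validation methods of `App`.
trait ToyValidator {
    fn validate_block(&mut self, block: &[u8]) -> bool;
    fn validate_block_for_sync(&mut self, block: &[u8]) -> bool;
}

/// Hypothetical app that wants each in-view validation to take a minimum time.
struct RateLimitedApp {
    min_validate_time: Duration,
}

impl RateLimitedApp {
    /// Validation logic shared by both methods.
    /// Placeholder rule for illustration: any non-empty block is valid.
    fn check(&self, block: &[u8]) -> bool {
        !block.is_empty()
    }
}

impl ToyValidator for RateLimitedApp {
    /// Called within a view: pad the call out to `min_validate_time`.
    fn validate_block(&mut self, block: &[u8]) -> bool {
        let start = Instant::now();
        let valid = self.check(block);
        // Sleep only for whatever remains of the minimum duration, if anything.
        if let Some(remaining) = self.min_validate_time.checked_sub(start.elapsed()) {
            thread::sleep(remaining);
        }
        valid
    }

    /// Called during block sync: same verdict, no artificial delay.
    fn validate_block_for_sync(&mut self, block: &[u8]) -> bool {
        self.check(block)
    }
}
```

Both methods delegate to the same `check`, so they return the same verdict for the same block. The sleep changes only how long `validate_block` takes, not what it returns.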
