pub trait App<K: KVStore>: Send {
    // Required methods
    fn produce_block(
        &mut self,
        request: ProduceBlockRequest<'_, K>,
    ) -> ProduceBlockResponse;
    fn validate_block(
        &mut self,
        request: ValidateBlockRequest<'_, '_, K>,
    ) -> ValidateBlockResponse;
    fn validate_block_for_sync(
        &mut self,
        request: ValidateBlockRequest<'_, '_, K>,
    ) -> ValidateBlockResponse;
}
Trait implemented by applications that are to be replicated by HotStuff-rs.
§Determinism requirements
In order for the state machine to be correctly replicated across all replicas, types that implement App must ensure that their implementations of produce_block, validate_block, and validate_block_for_sync are deterministic. That is, for any particular request (e.g., ProduceBlockRequest), a function must always produce the same response (ProduceBlockResponse), regardless of whether the machine executing the call has an Intel CPU or an AMD CPU, whether a pseudorandom number generator spits out 0 or 1, or whether it is currently raining outside, etc.
The key to making sure that these methods are deterministic is ensuring that their respective method bodies depend only on the information that is available through the public methods of their request type. In particular, this means not reading the block tree through a BlockTreeSnapshot, but reading it only through AppBlockTreeView.
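One way accidental nondeterminism can creep into a Rust method body is through iteration over a std HashMap, whose iteration order is arbitrary and, with the default hasher, can differ between processes. The following is a minimal sketch of deriving a value from application state in a way that depends only on the state's contents. The Balances alias and the state_digest function are hypothetical names used for illustration; they are not part of HotStuff-rs.

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-in for application state: account name -> balance.
/// This is an illustrative type, not a HotStuff-rs type.
pub type Balances = BTreeMap<String, u64>;

/// Folds the state into a single 64-bit digest.
///
/// `BTreeMap` iterates in sorted key order, so the result depends only on the
/// contents of the map, not on insertion order or on per-process hasher seeds.
pub fn state_digest(balances: &Balances) -> u64 {
    // FNV-1a (64-bit), chosen only because it is short and fully specified.
    const OFFSET_BASIS: u64 = 0xcbf29ce484222325;
    const PRIME: u64 = 0x100000001b3;

    let mut hash = OFFSET_BASIS;
    for (account, balance) in balances {
        // Fixed-endianness encoding keeps the bytes identical on every machine.
        for byte in account.bytes().chain(balance.to_le_bytes()) {
            hash ^= byte as u64;
            hash = hash.wrapping_mul(PRIME);
        }
    }
    hash
}
```

Two replicas that build the same Balances in different insertion orders obtain the same digest from this function, which is the property the determinism requirement asks of produce_block, validate_block, and validate_block_for_sync.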
§Timing requirements
HotStuff-rs calls produce_block when a replica has to produce a block, and calls validate_block when a replica has to validate a block.
These calls are blocking (synchronous) calls, so a replica will not continue making progress until they complete. Because of this, library users should implement produce_block and validate_block to satisfy a specific timing constraint, so that replicas do not spend too long in these functions and cause views to time out and progress to stall. This timing constraint is exactly:
4 * EWNL + produce_block_duration + validate_block_duration < max_view_time
where:
- EWNL: “Expected Worst-case Network Latency”. That is, the maximum duration of time needed to send a message from a validator to another validator under “normal” network conditions.
- produce_block_duration: how long produce_block takes to execute.
- validate_block_duration: how long validate_block takes to execute.
- max_view_time: the provided Configuration::max_view_time.
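The inequality can be checked mechanically once the four durations have been measured or estimated. The following is a minimal sketch using std::time::Duration; the function name satisfies_timing_constraint is hypothetical and is not provided by HotStuff-rs.

```rust
use std::time::Duration;

/// Returns `true` if
/// `4 * EWNL + produce_block_duration + validate_block_duration < max_view_time`.
///
/// Hypothetical helper for illustration; all four inputs are assumed to be
/// measured or estimated by the library user.
pub fn satisfies_timing_constraint(
    ewnl: Duration,
    produce_block_duration: Duration,
    validate_block_duration: Duration,
    max_view_time: Duration,
) -> bool {
    // The comparison is strict: equality does not satisfy the constraint.
    ewnl * 4 + produce_block_duration + validate_block_duration < max_view_time
}
```

For instance, with an EWNL of 200 ms and 100 ms each for produce_block and validate_block, the left-hand side is 1000 ms, so a max_view_time of exactly 1000 ms fails the check while 1001 ms passes it.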
§Rationale behind timing requirements
Let’s say a view begins at time T for the proposer of the view, and assume that:
- Messages take EWNL to deliver between validators,
- The time taken to process messages is negligible besides produce_block_duration and validate_block_duration, and
- Every step begins after the previous step has ended (this is a simplifying assumption. In reality, for example, the Proposer does not have to wait for replicas to receive AdvanceView in order to start producing a block, and therefore, step 3 should really happen at T+diff(EWNL, produce_block_duration) instead of at T+EWNL+produce_block_duration as below).
Then the view will proceed as follows:
Step No. | Time after T (cumulative) | Events |
---|---|---|
1 | +0 | |
2 | +EWNL | |
3 | +produce_block_duration | |
4 | +EWNL | |
5 | +validate_block_duration | |
6 | +EWNL | |
7 | +EWNL | |
In the view above (and indeed in any view), there are two possible cases about the identities of Proposer and Next Leader:
- If Proposer == Next Leader, then:
  - The Proposer/Next Leader enters the view at step no. 1 and leaves at step no. 6, spending 3EWNL + produce_block_duration + validate_block_duration in the view.
  - Other replicas enter the view at step no. 2 and leave at step no. 7, spending 3EWNL + produce_block_duration + validate_block_duration in the view.
- If Proposer != Next Leader, then:
  - Proposer enters the view at step no. 1 and leaves at step no. 7, spending 4EWNL + produce_block_duration + validate_block_duration in the view.
  - Next Leader enters the view at step no. 2 and leaves at step no. 6, spending 2EWNL + produce_block_duration + validate_block_duration in the view.
  - Other replicas enter the view at step no. 2 and leave at step no. 7, spending 3EWNL + produce_block_duration + validate_block_duration in the view.
This shows that the most time a replica will spend in a view under normal network conditions is 4EWNL + produce_block_duration + validate_block_duration (case 2.1), which is exactly the left-hand side of the timing constraint inequality.
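The arithmetic in this rationale can be reproduced with a short sketch that accumulates the per-step increments from the table and subtracts entry time from exit time for each role. The function names step_times and time_in_view are hypothetical, and the sketch relies on the same simplifying assumptions as the text above.

```rust
use std::time::Duration;

/// Cumulative time after T at which each of the seven steps happens, assuming
/// every step begins only after the previous one has ended.
pub fn step_times(ewnl: Duration, produce: Duration, validate: Duration) -> [Duration; 7] {
    // Per-step increments, in the order listed in the table.
    let increments = [Duration::ZERO, ewnl, produce, ewnl, validate, ewnl, ewnl];
    let mut times = [Duration::ZERO; 7];
    let mut elapsed = Duration::ZERO;
    for (i, increment) in increments.iter().enumerate() {
        elapsed += *increment;
        times[i] = elapsed;
    }
    times
}

/// Time spent in the view by a replica that enters at step `enter` and leaves
/// at step `leave`. Step numbers are 1-indexed, as in the table.
pub fn time_in_view(times: &[Duration; 7], enter: usize, leave: usize) -> Duration {
    times[leave - 1] - times[enter - 1]
}
```

With these helpers, time_in_view(&times, 1, 7) corresponds to the Proposer when Proposer != Next Leader, and it is the largest of the role durations listed above.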
Required Methods§
fn produce_block(
    &mut self,
    request: ProduceBlockRequest<'_, K>,
) -> ProduceBlockResponse
Called by HotStuff-rs when the replica becomes a leader and has to produce a new Block to be inserted into the block tree and proposed to other validators.
fn validate_block(
    &mut self,
    request: ValidateBlockRequest<'_, '_, K>,
) -> ValidateBlockResponse
Called by HotStuff-rs when the replica receives a Proposal and has to validate the Block inside it to decide whether or not it should insert it into the block tree and vote for it.
fn validate_block_for_sync(
    &mut self,
    request: ValidateBlockRequest<'_, '_, K>,
) -> ValidateBlockResponse
Called when the replica is syncing and receives a BlockSyncResponse and has to validate the Block inside it to decide whether or not it should insert it into the block tree and vote for it.
§Purpose
There are several reasons why implementations of App may want calls to validate_block to take at least a certain amount of time.
For example, implementors may want to limit the rate at which views proceed (e.g., to prevent the block tree from growing too quickly and exhausting disk space), and may choose to do so by making their implementations of produce_block and validate_block sleep the thread until a minimum amount of time has elapsed.
Such a thread sleep solution works to slow down consensus decisions by, among other things, causing replicas to block for that amount of time when receiving a block through a Proposal, delaying the sending of a PhaseVote to the next leader.
However, such a solution will not only slow down consensus decisions, it will also slow down Block Sync. In general, implementors will want validate_block to take at least a minimum amount of time when it is called within a view, but complete as quickly as possible when called during block sync.
This is where validate_block_for_sync comes in. With validate_block_for_sync, implementors can rely on the fact that validate_block will only be called within a view, so adding a thread sleep or other mechanism to it will not slow down sync, as long as the mechanism is not also added to validate_block_for_sync.
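The split between the two methods can be sketched as follows. This is a simplified illustration under stated assumptions, not an implementation of the App trait: PacedApp, check_block, and the &[u8] block argument are hypothetical stand-ins for the real request and response types, used so that the example is self-contained.

```rust
use std::thread;
use std::time::{Duration, Instant};

/// Hypothetical stand-in for an application; not a HotStuff-rs type.
pub struct PacedApp {
    /// Minimum time a call to `validate_block` should take.
    pub min_validation_time: Duration,
}

impl PacedApp {
    /// Placeholder for the real, deterministic validation logic.
    fn check_block(&self, block: &[u8]) -> bool {
        !block.is_empty()
    }

    /// Called within a view: validate, then sleep out whatever remains of the
    /// minimum time. The sleep changes only how long the call takes, never
    /// the result.
    pub fn validate_block(&self, block: &[u8]) -> bool {
        let start = Instant::now();
        let valid = self.check_block(block);
        // `checked_sub` returns `None` if validation already took long enough.
        if let Some(remaining) = self.min_validation_time.checked_sub(start.elapsed()) {
            thread::sleep(remaining);
        }
        valid
    }

    /// Called during block sync: same validation logic, no pacing.
    pub fn validate_block_for_sync(&self, block: &[u8]) -> bool {
        self.check_block(block)
    }
}
```

Routing both methods through a single validation function keeps their results identical for the same input, while only the in-view path pays the pacing delay.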