Complex Golang VM
In this tutorial, we'll walk through how to build a virtual machine by referencing the BlobVM.
The BlobVM is a virtual machine that can be used to implement a decentralized key-value store. A blob (shorthand for "binary large object") is an arbitrary piece of data.
BlobVM stores a key-value pair by breaking it apart into multiple chunks stored with their hashes as their keys in the blockchain. A root key-value pair has references to these chunks for lookups. By default, the maximum chunk size is set to 200 KiB.
Components
A VM defines how a blockchain should be built. A block is populated with a set of transactions which mutate the state of the blockchain when executed. When a block with a set of transactions is applied to a given state, a state transition occurs by executing all of the transactions in the block in order and applying them to the state produced by the previous block. By executing a series of blocks chronologically, anyone can verify and reconstruct the state of the blockchain at an arbitrary point in time.
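To make the idea of a state transition concrete, here is a minimal sketch that is not taken from BlobVM. The state is a single counter, and the names State, Tx, Block, Apply, and Replay are hypothetical. It shows that transactions are executed in order and that replaying blocks from genesis reconstructs the state at any height.

```go
package main

import "fmt"

// State is a toy stand-in for the blockchain's state: a single counter.
type State int64

// Tx is a state transition: it maps the current state to the next one.
type Tx func(State) State

// Block is an ordered set of transactions.
type Block []Tx

// Apply executes every transaction in the block, in order.
func Apply(s State, b Block) State {
	for _, tx := range b {
		s = tx(s)
	}
	return s
}

// Replay rebuilds the state after the first `height` blocks of the chain.
func Replay(genesis State, chain []Block, height int) State {
	if height < 0 {
		height = 0
	}
	if height > len(chain) {
		height = len(chain)
	}
	s := genesis
	for _, b := range chain[:height] {
		s = Apply(s, b)
	}
	return s
}

func add(n int64) Tx { return func(s State) State { return s + State(n) } }
func double() Tx     { return func(s State) State { return s * 2 } }

func main() {
	chain := []Block{
		{add(1), double()}, // (0 + 1) * 2 = 2
		{double(), add(1)}, // 2 * 2 + 1 = 5
	}
	fmt.Println(Replay(0, chain, 1), Replay(0, chain, 2)) // prints "2 5"
}
```

Both blocks contain the same two operations in a different order and produce different results, which is why the order of transactions within a block is part of the block's definition.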
The BlobVM repository has a few components to handle the lifecycle of tasks from a transaction being issued to a block being accepted across the network:
- Transaction: A state transition
- Mempool: Stores pending transactions that haven't been finalized yet
- Network: Propagates transactions from the mempool to other nodes in the network
- Block: Defines the block format, how to verify it, and how it should be accepted or rejected across the network
- Block Builder: Builds blocks by including transactions from the mempool
- Virtual Machine: Application-level logic. Implements the VM interface needed to interact with Avalanche consensus and defines the blueprint for the blockchain.
- Service: Exposes APIs so users can interact with the VM
- Factory: Used to initialize the VM
Lifecycle of a Transaction
A VM will often expose a set of APIs so users can interact with it. In the blockchain, blocks can contain a set of transactions which mutate the blockchain's state. Let's dive into the lifecycle of a transaction from its issuance to its finalization on the blockchain.
- A user makes an API request to service.IssueRawTx to issue their transaction. This API will deserialize the user's transaction and forward it to the VM
- The transaction is submitted to the VM, which adds it to its mempool
- The VM asynchronously and periodically gossips new transactions in its mempool to other nodes in the network so they can learn about them
- The VM sends the Avalanche consensus engine a message to indicate that it has transactions in the mempool that are ready to be built into a block
- The VM proposes the block to consensus
- Consensus verifies that the block is valid and well-formed
- Consensus gets the network to vote on whether the block should be accepted or rejected. If a block is rejected, its transactions are reclaimed by the mempool so they can be included in a future block. If a block is accepted, it's finalized by writing it to the blockchain.
Coding the Virtual Machine
We'll dive into a few of the packages that are in the BlobVM repository to learn more about how they work:
- vm
  - block_builder.go
  - chain_vm.go
  - network.go
  - service.go
  - vm.go
- chain
  - unsigned_tx.go
  - base_tx.go
  - transfer_tx.go
  - set_tx.go
  - tx.go
  - block.go
  - mempool.go
  - storage.go
  - builder.go
- mempool
  - mempool.go
Transactions
The state of the blockchain can only be mutated by getting the network to accept a signed transaction. A signed transaction contains the transaction to be executed alongside the signature of the issuer. The signature is required to cryptographically verify the sender's identity. A VM can define an arbitrary number of unique transaction types to support different operations on the blockchain. The BlobVM implements two different transaction types:
- TransferTx - Transfers coins between accounts.
- SetTx - Stores a key-value pair on the blockchain.
UnsignedTransaction
All transactions in the BlobVM implement the common UnsignedTransaction interface, which exposes shared functionality for all transaction types.
BaseTx
Common functionality and metadata for transaction types are implemented by BaseTx.
- SetBlockID sets the transaction's block ID.
- GetBlockID returns the transaction's block ID.
- SetMagic sets the magic number. The magic number is used to differentiate chains to prevent replay attacks.
- GetMagic returns the magic number. The magic number is defined in genesis.
- SetPrice sets the price per fee unit for this transaction.
- GetPrice returns the price for this transaction.
- FeeUnits returns the fee units this transaction will consume.
- LoadUnits is identical to FeeUnits.
- ExecuteBase executes common validation checks across different transaction types. This validates that the transaction contains a valid block ID, magic number, and gas price as defined by genesis.
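The methods listed above can be sketched roughly as follows. This is a simplified illustration rather than the BlobVM source: the ID type, the Genesis fields (including the hypothetical BaseTxUnits), and the error values are assumptions made so the example is self-contained.

```go
package main

import (
	"errors"
	"fmt"
)

// ID is a simplified stand-in for the VM's block ID type.
type ID [32]byte

// Genesis holds the chain-wide parameters this sketch checks against.
// BaseTxUnits is a hypothetical flat fee-unit cost per transaction.
type Genesis struct {
	Magic       uint64
	MinPrice    uint64
	BaseTxUnits uint64
}

var (
	ErrInvalidBlockID = errors.New("invalid block ID")
	ErrInvalidMagic   = errors.New("invalid magic")
	ErrInvalidPrice   = errors.New("price below minimum")
)

// BaseTx carries the metadata shared by every transaction type.
type BaseTx struct {
	BlockID ID
	Magic   uint64
	Price   uint64
}

func (b *BaseTx) SetBlockID(id ID)  { b.BlockID = id }
func (b *BaseTx) GetBlockID() ID    { return b.BlockID }
func (b *BaseTx) SetMagic(m uint64) { b.Magic = m }
func (b *BaseTx) GetMagic() uint64  { return b.Magic }
func (b *BaseTx) SetPrice(p uint64) { b.Price = p }
func (b *BaseTx) GetPrice() uint64  { return b.Price }

// FeeUnits returns the flat cost; LoadUnits is identical to FeeUnits.
func (b *BaseTx) FeeUnits(g *Genesis) uint64  { return g.BaseTxUnits }
func (b *BaseTx) LoadUnits(g *Genesis) uint64 { return b.FeeUnits(g) }

// ExecuteBase runs the checks common to all transaction types.
func (b *BaseTx) ExecuteBase(g *Genesis) error {
	if b.BlockID == (ID{}) {
		return ErrInvalidBlockID
	}
	if b.Magic != g.Magic {
		return ErrInvalidMagic
	}
	if b.Price < g.MinPrice {
		return ErrInvalidPrice
	}
	return nil
}

func main() {
	g := &Genesis{Magic: 1, MinPrice: 1, BaseTxUnits: 1}
	tx := &BaseTx{}
	tx.SetBlockID(ID{1})
	tx.SetMagic(g.Magic)
	fmt.Println(tx.ExecuteBase(g)) // price is 0, below the minimum
	tx.SetPrice(2)
	fmt.Println(tx.ExecuteBase(g)) // passes all checks
}
```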
TransferTx
TransferTx supports the transfer of tokens from one account to another.
TransferTx embeds BaseTx to avoid re-implementing common operations with other transactions, but implements its own Execute to support token transfers.
Execute performs a few checks to ensure that the transfer is valid before transferring the tokens between the two accounts.
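As a rough illustration of embedding plus a type-specific Execute, consider the sketch below. The field names, the Balances ledger, and the three checks (non-zero amount, no self-transfer, sufficient balance) are assumptions chosen for the example; the text above does not list the exact checks BlobVM performs.

```go
package main

import (
	"errors"
	"fmt"
)

// BaseTx is a trimmed-down stand-in holding shared metadata.
type BaseTx struct{ Price uint64 }

func (b *BaseTx) GetPrice() uint64 { return b.Price }

// TransferTx embeds *BaseTx, so GetPrice is promoted to TransferTx.
type TransferTx struct {
	*BaseTx
	To    string // hypothetical: recipient account
	Units uint64 // hypothetical: amount to transfer
}

// Balances is a toy account ledger used in place of the VM's database.
type Balances map[string]uint64

var (
	ErrZeroTransfer = errors.New("transfer of zero units")
	ErrSelfTransfer = errors.New("sender and recipient are the same")
	ErrInsufficient = errors.New("insufficient balance")
)

// Execute validates the transfer (assumed checks) and then moves the tokens.
func (t *TransferTx) Execute(b Balances, sender string) error {
	switch {
	case t.Units == 0:
		return ErrZeroTransfer
	case sender == t.To:
		return ErrSelfTransfer
	case b[sender] < t.Units:
		return ErrInsufficient
	}
	b[sender] -= t.Units
	b[t.To] += t.Units
	return nil
}

func main() {
	ledger := Balances{"alice": 10}
	tx := &TransferTx{BaseTx: &BaseTx{Price: 1}, To: "bob", Units: 4}
	err := tx.Execute(ledger, "alice")
	fmt.Println(err, ledger["alice"], ledger["bob"], tx.GetPrice())
}
```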
SetTx
SetTx is used to assign a value to a key.
SetTx implements its own FeeUnits method to compensate the network according to the size of the blob being stored.
SetTx's Execute method performs a few safety checks to validate that the blob meets the size constraints enforced by genesis and doesn't overwrite an existing key before writing it to the blockchain.
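The two behaviors described above, size-based fees and the safety checks in Execute, can be sketched as follows. This is an illustration under assumptions: the Genesis field names and the fee formula are hypothetical, a plain map stands in for the database, and SHA-256 from the standard library is swapped in for the hash used to derive the key.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"errors"
	"fmt"
)

// Genesis holds hypothetical size and fee parameters.
type Genesis struct {
	BaseTxUnits   uint64 // flat cost of any transaction
	ValueUnitSize uint64 // bytes covered by one extra fee unit
	MaxValueSize  uint64 // largest blob accepted
}

var (
	ErrValueEmpty  = errors.New("value is empty")
	ErrValueTooBig = errors.New("value exceeds maximum size")
	ErrKeyExists   = errors.New("key already exists")
)

// SetTx stores Value under a key derived from its hash.
type SetTx struct{ Value []byte }

// FeeUnits charges the base cost plus one unit per started ValueUnitSize bytes.
func (s *SetTx) FeeUnits(g *Genesis) uint64 {
	if g.ValueUnitSize == 0 {
		return g.BaseTxUnits
	}
	size := uint64(len(s.Value))
	return g.BaseTxUnits + (size+g.ValueUnitSize-1)/g.ValueUnitSize
}

// Execute enforces the size limits and refuses to overwrite an existing key.
func (s *SetTx) Execute(g *Genesis, db map[string][]byte) error {
	size := uint64(len(s.Value))
	if size == 0 {
		return ErrValueEmpty
	}
	if size > g.MaxValueSize {
		return ErrValueTooBig
	}
	sum := sha256.Sum256(s.Value) // stand-in hash, see lead-in
	key := hex.EncodeToString(sum[:])
	if _, ok := db[key]; ok {
		return ErrKeyExists
	}
	db[key] = append([]byte(nil), s.Value...)
	return nil
}

func main() {
	g := &Genesis{BaseTxUnits: 1, ValueUnitSize: 1024, MaxValueSize: 200 * 1024}
	db := map[string][]byte{}
	tx := &SetTx{Value: make([]byte, 2500)}
	fmt.Println(tx.FeeUnits(g))    // 1 base unit + 3 size units
	fmt.Println(tx.Execute(g, db)) // first write succeeds
	fmt.Println(tx.Execute(g, db)) // second write is rejected
}
```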
Signed Transaction
The unsigned transactions mentioned previously can't be issued to the network without first being signed. BlobVM implements signed transactions by embedding the unsigned transaction alongside its signature in Transaction. In BlobVM, a signature is defined as the ECDSA signature, produced with the issuer's private key, over the Keccak256 hash of the unsigned transaction's data (the digest hash).
The Transaction type wraps any unsigned transaction. When a Transaction is executed, it calls the Execute method of the underlying embedded UnsignedTx and performs the following sanity checks:
- The underlying UnsignedTx must meet the requirements set by genesis. This includes checks to make sure that the transaction contains the correct magic number and meets the minimum gas price as defined by genesis
- The transaction's block ID must be a recently accepted block
- The transaction must not be a recently issued transaction
- The issuer of the transaction must have enough gas
- The transaction's gas price must meet the next expected block's minimum gas price
- The transaction must execute without error
If the transaction is successfully verified, it's submitted as a pending write to the blockchain.
Example
Let's walk through an example on how to issue a SetTx transaction to the BlobVM to write a key-value pair.
- Create the unsigned transaction for SetTx.
- Calculate the digest hash for the transaction.
- Sign the digest hash with the issuer's private key.
- Create and initialize the new signed transaction.
- Issue the request with the client.
Mempool
Overview
The mempool is a buffer of volatile memory that stores pending transactions. Transactions are stored in the mempool whenever a node learns about a new transaction either through gossip with other nodes or through an API call issued by a user.
The mempool is implemented as a min-max heap ordered by each transaction's gas price. The mempool is created during the initialization of the VM.
Whenever a transaction is submitted to the VM, it first gets initialized, verified, and executed locally. If the transaction looks valid, then it's added to the mempool.
Add Method
When a transaction is added to the mempool, Add is called. This performs the following:
- Checks whether the transaction being added already exists in the mempool
- Adds the transaction to the min-max heap
- Evicts the lowest-paying transaction if the mempool's heap size is larger than the maximum configured value
- Adds the transaction to the list of transactions that are able to be gossiped to other peers
- Sends a notification through the mempool.Pending channel to indicate that the consensus engine should build a new block
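The steps of Add can be sketched as shown below. To keep the example short, a slice kept sorted by price stands in for the min-max heap, and all type and field names other than Add and Pending are hypothetical. One design choice in this sketch is also an assumption: if the new transaction is itself the lowest paying and gets evicted immediately, Add reports false and skips the gossip list.

```go
package main

import (
	"fmt"
	"sort"
)

// Tx is a simplified transaction identified by ID and ranked by Price.
type Tx struct {
	ID    string
	Price uint64
}

// Mempool keeps pending transactions sorted by ascending price.
type Mempool struct {
	maxSize int
	txs     []*Tx           // sorted slice, stand-in for the min-max heap
	seen    map[string]bool // IDs currently in the mempool
	newTxs  []*Tx           // transactions not yet gossiped
	Pending chan struct{}   // signals that a block can be built
}

func New(maxSize int) *Mempool {
	return &Mempool{
		maxSize: maxSize,
		seen:    map[string]bool{},
		Pending: make(chan struct{}, 1),
	}
}

func (m *Mempool) Len() int { return len(m.txs) }

// Add inserts tx and reports whether it was kept in the mempool.
func (m *Mempool) Add(tx *Tx) bool {
	if m.seen[tx.ID] { // 1. reject duplicates
		return false
	}
	// 2. insert at the position that keeps the slice sorted by price
	i := sort.Search(len(m.txs), func(i int) bool { return m.txs[i].Price >= tx.Price })
	m.txs = append(m.txs, nil)
	copy(m.txs[i+1:], m.txs[i:])
	m.txs[i] = tx
	m.seen[tx.ID] = true
	// 3. evict the lowest-paying transaction when over capacity
	if len(m.txs) > m.maxSize {
		lowest := m.txs[0]
		m.txs = m.txs[1:]
		delete(m.seen, lowest.ID)
		if lowest == tx {
			return false
		}
	}
	m.newTxs = append(m.newTxs, tx) // 4. queue for gossip
	select {                        // 5. notify without blocking
	case m.Pending <- struct{}{}:
	default:
	}
	return true
}

func main() {
	m := New(2)
	fmt.Println(m.Add(&Tx{ID: "a", Price: 5})) // true
	fmt.Println(m.Add(&Tx{ID: "a", Price: 5})) // false: duplicate
	fmt.Println(m.Add(&Tx{ID: "b", Price: 7})) // true
	fmt.Println(m.Add(&Tx{ID: "c", Price: 6})) // true: evicts "a"
	fmt.Println(m.Len())                       // 2
}
```

The Pending channel is buffered with capacity one and written with a non-blocking send, so many additions in a row collapse into a single notification.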
Block Builder
Overview
The TimeBuilder implementation for BlockBuilder acts as an intermediary notification service between the mempool and the consensus engine. It serves the following functions:
- Periodically gossips new transactions to other nodes in the network
- Periodically notifies the consensus engine that new transactions from the mempool are ready to be built into blocks
TimeBuilder can exist in three states:
- dontBuild - There are no transactions in the mempool that are ready to be included in a block
- building - The consensus engine has been notified that it should build a block and there are currently transactions in the mempool that are eligible to be included in a block
- mayBuild - There are transactions in the mempool that are eligible to be included in a block, but the consensus engine has not been notified yet
Gossip Method
The Gossip method periodically initiates the gossip of new transactions from the mempool, at the interval defined by vm.config.GossipInterval.
Build Method
The Build method consumes transactions from the mempool and signals the consensus engine when it's ready to build a block.
If the mempool signals the TimeBuilder that it has available transactions, TimeBuilder will signal consensus that the VM is ready to build a block by sending the consensus engine a common.PendingTxs message.
When the consensus engine receives the common.PendingTxs message, it calls the VM's BuildBlock method. The VM will then build a block from eligible transactions in the mempool.
If there are still remaining transactions in the mempool after a block is built, then the TimeBuilder is put into the mayBuild state to indicate that there are still transactions that are eligible to be built into a block, but the consensus engine isn't aware of it yet.
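The transitions between the three states can be sketched as a small state machine. This is a simplified model, not the BlobVM code: the method names are hypothetical, a string channel stands in for the channel that carries common.PendingTxs, and the timers that drive retries are replaced by an explicit OnTimer call so the example stays deterministic.

```go
package main

import "fmt"

type status int

const (
	dontBuild status = iota // nothing to build
	mayBuild                // transactions are waiting, engine not notified
	building                // engine has been notified
)

// pendingTxs stands in for the common.PendingTxs message.
const pendingTxs = "PendingTxs"

type TimeBuilder struct {
	status   status
	toEngine chan string
}

// notifyEngine tries to tell the engine to build; if the engine's channel
// is full, the builder remembers that a block may still be built.
func (b *TimeBuilder) notifyEngine() {
	select {
	case b.toEngine <- pendingTxs:
		b.status = building
	default:
		b.status = mayBuild
	}
}

// OnMempoolPending is called when the mempool reports new transactions.
func (b *TimeBuilder) OnMempoolPending() {
	if b.status == dontBuild {
		b.notifyEngine()
	}
}

// OnBlockBuilt is called after the VM has built a block.
func (b *TimeBuilder) OnBlockBuilt(remaining int) {
	if remaining > 0 {
		b.status = mayBuild
	} else {
		b.status = dontBuild
	}
}

// OnTimer retries the notification for transactions left behind.
func (b *TimeBuilder) OnTimer() {
	if b.status == mayBuild {
		b.notifyEngine()
	}
}

func main() {
	b := &TimeBuilder{toEngine: make(chan string, 1)}
	b.OnMempoolPending()
	fmt.Println(<-b.toEngine, b.status == building)
	b.OnBlockBuilt(3) // three transactions did not fit in the block
	fmt.Println(b.status == mayBuild)
	b.OnTimer()
	fmt.Println(<-b.toEngine, b.status == building)
}
```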
Network
Network handles the workflow of gossiping transactions from a node's mempool to other nodes in the network.
GossipNewTxs Method
GossipNewTxs sends a list of transactions to other nodes in the network. TimeBuilder calls the network's GossipNewTxs function to gossip new transactions in the mempool.
Recently gossiped transactions are maintained in a cache to avoid DDoSing a node from repeated gossip failures.
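A cache of recently gossiped transactions can be sketched as a bounded set that filters out anything already sent. The names GossipCache and Filter are hypothetical, and this sketch evicts in simple first-in, first-out order, which is an assumption rather than the eviction policy the BlobVM uses.

```go
package main

import "fmt"

// Tx is a simplified transaction identified by ID.
type Tx struct{ ID string }

// GossipCache remembers up to max recently gossiped transaction IDs.
type GossipCache struct {
	max   int
	order []string            // insertion order, oldest first
	seen  map[string]struct{} // IDs currently remembered
}

func NewGossipCache(max int) *GossipCache {
	return &GossipCache{max: max, seen: map[string]struct{}{}}
}

// Filter returns the transactions that were not gossiped recently and
// records them, evicting the oldest entries once the cache is full.
func (c *GossipCache) Filter(txs []Tx) []Tx {
	out := make([]Tx, 0, len(txs))
	for _, tx := range txs {
		if _, ok := c.seen[tx.ID]; ok {
			continue // gossiped recently, skip it
		}
		c.seen[tx.ID] = struct{}{}
		c.order = append(c.order, tx.ID)
		if len(c.order) > c.max {
			oldest := c.order[0]
			c.order = c.order[1:]
			delete(c.seen, oldest)
		}
		out = append(out, tx)
	}
	return out
}

func main() {
	c := NewGossipCache(2)
	fmt.Println(len(c.Filter([]Tx{{ID: "a"}, {ID: "b"}}))) // 2: both are new
	fmt.Println(len(c.Filter([]Tx{{ID: "a"}, {ID: "c"}}))) // 1: only "c" is new
}
```

A gossip routine would call Filter on the mempool's new transactions and send only the returned slice to peers.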
Other nodes in the network will receive the gossiped transactions through their AppGossip handler. Once a gossip message is received, it's deserialized and the new transactions are submitted to the VM.
Block
Blocks go through a lifecycle of being proposed by a validator, verified, and decided by consensus. Upon acceptance, a block will be committed and will be finalized on the blockchain.
BlobVM implements two types of blocks, StatefulBlock and StatelessBlock.
StatefulBlock
A StatefulBlock contains strictly the metadata about the block that needs to be written to the database.
StatelessBlock
StatelessBlock is a superset of StatefulBlock and additionally contains fields that are needed to support block-level operations like verification and acceptance throughout its lifecycle in the VM.
Let's have a look at the fields of StatelessBlock:
- StatefulBlock: The metadata about the block that will be written to the database upon acceptance
- bytes: The serialized form of the StatefulBlock
- id: The Keccak256 hash of bytes
- st: The status of the block in consensus (i.e., Processing, Accepted, or Rejected)
- children: The children of this block
- onAcceptDB: The database this block should be written to upon acceptance
When the consensus engine tries to build a block by calling the VM's BuildBlock, the VM calls the block.NewBlock function to get a new block that is a child of the currently preferred block.
Some StatelessBlock fields like the block ID, byte representation, and timestamp aren't populated immediately. These are set during the StatelessBlock's init method, which initializes these fields once the block has been populated with transactions.
To build the block, the VM removes as many of the highest-paying transactions as it can from the mempool and includes them in the new block, until the maximum block fee set by genesis is reached.
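The selection of transactions for a new block can be sketched as a greedy pass over the mempool ordered by price. The names SelectTxs and maxBlockUnits are hypothetical, and the sketch stops at the first transaction that does not fit instead of searching for smaller ones further down, which is one reasonable reading of the description and not a statement about the BlobVM code.

```go
package main

import (
	"fmt"
	"sort"
)

// Tx is a simplified transaction with a price and a fee-unit cost.
type Tx struct {
	ID    string
	Price uint64
	Units uint64
}

// SelectTxs picks the highest-paying transactions until the next one would
// push the block over maxBlockUnits. It returns the picked transactions and
// the ones that stay in the mempool.
func SelectTxs(pool []Tx, maxBlockUnits uint64) (picked, remaining []Tx) {
	sorted := append([]Tx(nil), pool...) // leave the caller's slice untouched
	sort.SliceStable(sorted, func(i, j int) bool {
		return sorted[i].Price > sorted[j].Price
	})
	var used uint64
	for i, tx := range sorted {
		if used+tx.Units > maxBlockUnits {
			return picked, sorted[i:]
		}
		used += tx.Units
		picked = append(picked, tx)
	}
	return picked, nil
}

func main() {
	pool := []Tx{
		{ID: "a", Price: 1, Units: 4},
		{ID: "b", Price: 9, Units: 5},
		{ID: "c", Price: 5, Units: 5},
	}
	picked, remaining := SelectTxs(pool, 10)
	fmt.Println(picked, remaining)
}
```

With a limit of 10 units, the two highest-paying transactions ("b" and "c") fill the block and "a" stays in the mempool for a later block.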
Once built, a block ends up in one of two states:
- Rejected: The block was not accepted by consensus. In this case, the mempool will reclaim the rejected block's transactions so they can be included in a future block.
- Accepted: The block was accepted by consensus. In this case, we write the block to the blockchain by committing it to the database.
When the consensus engine receives the built block, it calls the block's Verify method to validate that the block is well-formed. In BlobVM, the following constraints are placed on valid blocks:
- A block must contain at least one transaction and the block's timestamp must be no more than 10 seconds in the future.
- The sum of the gas units consumed by the transactions in the block must not exceed the gas limit defined by genesis.
- The parent block of the proposed block must exist and have an earlier timestamp.
- The target block price and minimum gas price must meet the minimum enforced by the VM.
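The first three constraints above can be sketched as a single Verify function. The Block fields, the error values, and the parameters now and maxBlockUnits are hypothetical, and the price check from the fourth constraint is left out because the text does not say how the target price is computed. Rejecting a block whose timestamp equals its parent's is also an assumption, following the word "earlier" literally.

```go
package main

import (
	"errors"
	"fmt"
)

// Block is a simplified block; TxUnits holds the units of each transaction.
type Block struct {
	ID       string
	ParentID string
	Tmstmp   int64 // Unix seconds
	TxUnits  []uint64
}

// futureBound is how far ahead of local time a block may be, in seconds.
const futureBound = 10

var (
	ErrNoTxs             = errors.New("block has no transactions")
	ErrTimestampTooLate  = errors.New("timestamp too far in the future")
	ErrBlockTooBig       = errors.New("block exceeds unit limit")
	ErrParentMissing     = errors.New("parent block not found")
	ErrTimestampTooEarly = errors.New("timestamp not after parent")
)

// Verify checks b against the known blocks, the local time, and the limit.
func Verify(b *Block, known map[string]*Block, now int64, maxBlockUnits uint64) error {
	if len(b.TxUnits) == 0 {
		return ErrNoTxs
	}
	if b.Tmstmp > now+futureBound {
		return ErrTimestampTooLate
	}
	var total uint64
	for _, u := range b.TxUnits {
		total += u
	}
	if total > maxBlockUnits {
		return ErrBlockTooBig
	}
	parent, ok := known[b.ParentID]
	if !ok {
		return ErrParentMissing
	}
	if parent.Tmstmp >= b.Tmstmp {
		return ErrTimestampTooEarly
	}
	return nil
}

func main() {
	genesis := &Block{ID: "g", Tmstmp: 100}
	known := map[string]*Block{"g": genesis}
	ok := &Block{ID: "b1", ParentID: "g", Tmstmp: 105, TxUnits: []uint64{3, 4}}
	late := &Block{ID: "b2", ParentID: "g", Tmstmp: 111, TxUnits: []uint64{1}}
	fmt.Println(Verify(ok, known, 100, 10))
	fmt.Println(Verify(late, known, 100, 10))
}
```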
After the results of consensus are complete, the block is either accepted by committing the block to the database or rejected by returning the block's transactions into the mempool.
API
Service implements an API server so users can interact with the VM. The VM implements the interface method CreateHandlers that exposes the VM's RPC API.
One API that's exposed is IssueRawTx to allow users to issue transactions to the BlobVM. It accepts an IssueRawTxArgs that contains the transaction the user wants to issue and forwards it to the VM.
Virtual Machine
We have learned about all the components used in the BlobVM. Most of these components are referenced in the vm.go file, which acts as the entry point for the consensus engine as well as users interacting with the blockchain.
For example, the engine calls vm.BuildBlock(), which in turn calls chain.BuildBlock(). Another example is when a user issues a raw transaction through service APIs, the vm.Submit() method is called.
Let's look at some of the important methods of vm.go that must be implemented:
Initialize Method
Initialize is invoked by avalanchego when creating the blockchain. avalanchego passes some parameters to the implementing VM.
- ctx - Metadata about the VM's execution
- dbManager - The database that the VM can write to
- genesisBytes - The serialized representation of the genesis state of this VM
- upgradeBytes - The serialized representation of network upgrades
- configBytes - The serialized VM-specific configuration
- toEngine - The channel used to send messages to the consensus engine
- fxs - Feature extensions that attach to this VM
- appSender - Used to send messages to other nodes in the network
Upon initialization, BlobVM persists these fields in its own state to use them throughout the lifetime of its execution.
After initializing its own state, BlobVM also starts asynchronous workers to build blocks and gossip transactions to the rest of the network.
GetBlock Method
GetBlock returns the block with the provided ID. GetBlock will attempt to fetch the given block from the database, and return a non-nil error if it wasn't able to get it.
ParseBlock Method
ParseBlock deserializes a block.
BuildBlock Method
Avalanche consensus calls BuildBlock when it receives a notification from the VM that it has pending transactions that are ready to be issued into a block.
SetPreference Method
SetPreference sets the block ID preferred by this node. A node votes to accept or reject a block based on its current preference in consensus.
LastAccepted Method
LastAccepted returns the block ID of the block that was most recently accepted by this node.
CLI
BlobVM implements a generic key-value store, but support for reading and writing arbitrary files into the BlobVM blockchain is implemented in the blob-cli.
To write a file, blob-cli breaks apart an arbitrarily large file into many small chunks. Each chunk is submitted to the VM in a SetTx. A root key is generated which contains all of the hashes of the chunks.
Example 1
Given the root hash, a file can be looked up by deserializing all of its children chunk values and reconstructing the original file.
Example 2
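Reading a file back is the reverse of writing it: fetch the root value, then fetch and concatenate every chunk it references. The sketch below assumes a root value made of chunk keys joined by newlines and uses SHA-256 as a stand-in hash; both are hypothetical choices for illustration, and the put helper only exists to populate the toy store.

```go
package main

import (
	"bytes"
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"strings"
)

// hashKey derives a store key from content (SHA-256 as a stand-in hash).
func hashKey(b []byte) string {
	sum := sha256.Sum256(b)
	return hex.EncodeToString(sum[:])
}

// put stores a value under its hash and returns the key.
func put(store map[string][]byte, value []byte) string {
	key := hashKey(value)
	store[key] = value
	return key
}

// ReadFile looks up the root value and concatenates the referenced chunks,
// checking that every chunk still hashes to the key it was stored under.
func ReadFile(store map[string][]byte, rootKey string) ([]byte, error) {
	root, ok := store[rootKey]
	if !ok {
		return nil, fmt.Errorf("root %s not found", rootKey)
	}
	if len(root) == 0 {
		return nil, nil // empty file: no chunks to fetch
	}
	var out bytes.Buffer
	for _, key := range strings.Split(string(root), "\n") {
		chunk, ok := store[key]
		if !ok {
			return nil, fmt.Errorf("chunk %s not found", key)
		}
		if hashKey(chunk) != key {
			return nil, fmt.Errorf("chunk %s is corrupted", key)
		}
		out.Write(chunk)
	}
	return out.Bytes(), nil
}

func main() {
	store := map[string][]byte{}
	k1 := put(store, []byte("hello, "))
	k2 := put(store, []byte("world"))
	rootKey := put(store, []byte(k1+"\n"+k2))
	data, err := ReadFile(store, rootKey)
	fmt.Printf("%s %v\n", data, err)
}
```

Because every key is the hash of its value, a reader can detect a corrupted or substituted chunk without trusting the node that served it.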
Conclusion
This documentation covers Virtual Machine concepts by walking through a VM that implements a decentralized key-value store.
You can learn more about the BlobVM by referencing the README in the GitHub repository.