Developer Resources
Peer-to-Peer Layer
Find out how the Internet Computer’s peer-to-peer layer enables reliable, scalable, and secure communication between the nodes of a subnet
Secure Scalability
The Internet Computer’s Peer-to-Peer Layer
Scalability and the Efficiency of Message Distribution
The Internet Computer network is built for scalability. Its architecture allows Web3 dApps and services to infinitely scale.
The scalability of a blockchain predominantly relies on message distribution efficiency in the network. The larger a network grows, the more messages need to be distributed and the more complex the distribution process becomes. To solve this challenge, the Internet Computer is divided into subnets.
Each subnet in the network can be thought of as a smaller Internet Computer blockchain that is running canisters on its associated nodes. Subnets are split by the Network Nervous System to enable the growth of the network with rising demand.
The Four Layers of the Internet Computer Protocol
The IC protocol consists of four key layers:
Traditional peer-to-peer layers are confronted with the trilemma of having to make trade-offs between security, performance, and scalability. The peer-to-peer layer of the Internet Computer achieves a high level of security and enables subnets to scale with minimal performance trade-offs.
Additionally, a prioritization mechanism for messages enables important messages to be delivered faster and filters unwanted messages to save bandwidth.
These are the requirements under which aforementioned guarantee is provided:
- Application components/peers have reserved resources
- Artifact prioritization
- Bounded-time/eventual delivery unaffected by Byzantine faults
- Bounded resources
- DOS/SPAM resilience
- Authenticity, encryption, and integrity
Gossip mechanism
Messages in the subnet are distributed by a gossip mechanism that peer-to-peer uses. The gossip protocol sends messages a node has received or created to its peers in the subnet. This process can be likened to the spreading of rumors and an overlay network topology is utilized to determine the peers.
Byzantine fault-tolerance
The fault-tolerance of the peer-to-peer layer is guaranteed even if Byzantine nodes are present. In this context, a node is considered Byzantine if it exhibits faulty behavior (e.g. is unresponsive, suffers from network delays) or behaves maliciously (e.g. the protocol is not followed).
Preventing eclipse attacks
In an eclipse attack, all the peers of a given node are faulty or malicious. This allows malicious nodes to collude and thereby disconnect the node from the remaining network to influence which artifacts the node sees.
To prevent connectivity problems, overlays are being used to guarantee a node’s connectivity with its peers. As a result, a connection with high probability is formed by all honest nodes. In small subnets, nodes form a complete graph by connecting to all other nodes in the subnet which prohibits eclipse attacks. Sparser overlays are being used for subnets that are larger in size such as the Network Nervous System.
The gossip protocol ensures artifacts are delivered despite faulty node/links to guarantee all nodes receive the required artifacts within a certain time.
Using adverts to prevent the duplicate problem
Artifacts can be large in size and are often sent multiple times from multiple peers. Because of this, a network can experience a severe waste of bandwidth. To prevent this, adverts (i.e. small messages that only contain meta-data of artifacts) are sent first. Nodes process these adverts and then request an required artifact from at least one peer. If a node encounters a problem during this process, it will ask another peer for the artifact. This process is repeated until an honest peer is found.
Choosing what artifact request
If a node receives multiple adverts, it will have to choose which one to request first. A client that creates an advert provides given attributes such as block height. The gossip protocol is also being provided a priority function that returns a priority value for the corresponding artifact. The lowest value is “drop,” indicating that the artifact is not needed. The highest value assigns a top priority to the artifact via the value “fetch now.”
Received artifacts are stored by peer-to-peer in the artifact pool. It also informs consensus and client components if there are changes in the pool. Based on the information, the next actions to the pool’s content are determined by the application component.
Artifacts are either categorized as “validated” or “unvalidated.” An artifact that has been validated was checked by the client component and has its signatures verified.
Gossip data structures (at the node level)
The following graphic depicts the data structures nodes maintain for the gossip protocol.
The artifact pool is separated into two sections that differentiate between validated and unvalidated artifacts. To prevent DOS attacks and resource leaks, there is an upper bound for the size of the sections with the unvalidated artifacts. This helps to prevent malicious actors to fill up the artifact pool. The size of the sections, however, is still large enough to make sure the protocol is operating correctly.
For each peer, the following data structures are maintained:
Handling main events
The four main events handled by the gossip protocol are:
- Addition of new artifact into the pool (locally added by client component)
- Processing of new advert (received from a peer)
- Processing of new artifact (received from a peer)
- Issues with recovery and reconnection
Let’s have a look at each event in more detail.
New artifact is locally added to the pool by client component
The gossip protocol creates an advert when a node in the subnet receives a new artifact from a client component. The advert is then sent to all of the node’s peers.
Processing of new advert (received from a peer)
The following describes the process of what happens when a node receives an advert:
- Node receives advert from a peer
- Node checks if corresponding artifact was created by itself or if it was already downloaded by another node
- Node checks the priority of the advert and adds it to the adverts queue of the sending peer if the priority is higher than “drop”
- The
download_next(i)
function is called specifically for the sending peer if the space of the artifact pool’s unvalidated section is lrge enough for the peer - The function fetches the advert with the highest priority
- The function requests the corresponding artifact from the peer. This step helps the gossip protocol to avoid requesting the artifact for which it just received an advert. Instead, it will simply request the highest priority artifact
- Once the artifact has been requested, the advert gets moved from the adverts queue to the peer’s requested set
Processing of new artifact (received from a peer)
The following describes the process of what happens when a node receives an artifact:
- Node checks if artifact was requested by checking the peer’s requested set for the corresponding advert
- Corresponding integrity hash is used to verify integrity of artifact
- The peer is misbehaving if any of the checks fail
- Advert is eventually removed from advert queues and requested sets
- Advert is added to the unvalidated pool of the sending peer
- It remains there until client component checks the advert and validates it
- A hash of the artifact is added to a small cache (called received check) so that further adverts for the same artifact are ignored
Transport and Connection Management
Network connections between peers are maintained by a transport component situated below the gossip protocol.
The transport component is tasked with keeping the connection between peers stable. Buffers are in place should connectivity congestion problems arise. Internal mechanisms ensure that longer-than-usual delays are detected and to avoid hanging connections.
Gossip messages are framed by the transport component with its own layer 7 header containing metadata fields used to maintain message flows and to report errors. Multiple TCP streams are currently used between peers by the component.
The transport component periodically attempts to reconnect should a TCP connection be interrupted. This is only done as long as the node’s assigned overlay includes the corresponding peer.
Maxed out buffers
The transport data structures are bounded in size like all other data structures in the network. If a buffer reaches maximum capacity or if the component is waiting for a reconnection, the receiver gossip component will eventually be notified by transport about a potential message drop. The receiving peer can request retransmission as a response to the notification.
When the retransmission request has been received, all the relevant adverts are sent by the sender node based on the filter included in the request. The transport component will then forward these adverts to the receiver.