Arweave is a global network for storing data and information. Its approach to decentralized storage is designed to ensure the integrity and permanence of the data stored on it. But what makes Arweave's approach different? Let’s explore how network participants verify information and store it on the network.
Although this blog is structured in an FAQ format, the questions are organized in a logical progression, starting from a user making a new transaction request to its eventual upload on the network.
The Arweave Mining Process
What is mining?
Mining is like solving a complex cryptographic puzzle by trial and error until a unique solution (a hash) is found. The node (or miner) that solves the puzzle gets to add new information to the network and, in return, receives rewards in the form of AR tokens (Arweave’s native currency).
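To make the trial-and-error idea concrete, here is a toy sketch in Python. It is not Arweave's actual puzzle: the hash function (SHA-256), the nonce encoding, and the `difficulty_bits` parameter are all assumptions chosen for illustration.

```python
import hashlib

def toy_mine(block_data: bytes, difficulty_bits: int, max_tries: int = 1_000_000):
    """Try nonces one by one until the hash falls below a target value.

    Toy illustration only: SHA-256 and this nonce scheme are assumptions,
    not Arweave's real mining algorithm.
    """
    target = 1 << (256 - difficulty_bits)  # a smaller target means a harder puzzle
    for nonce in range(max_tries):
        digest = hashlib.sha256(block_data + nonce.to_bytes(8, "big")).digest()
        if int.from_bytes(digest, "big") < target:
            return nonce, digest.hex()  # a winning attempt
    return None  # no solution within max_tries

result = toy_mine(b"example block", difficulty_bits=12)
```

Each extra bit of difficulty doubles the expected number of attempts, which is why finding a solution is hard while checking one takes a single hash.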
What is the current mining mechanism used by Arweave?
The current mechanism used is Succinct Proof of Random Access (SPoRA).
How do users interact with the network?
Users typically use gateways, such as g8way.io, to submit transaction requests. The gateways forward the requests to a node that communicates with the network on behalf of the user, so users do not need specialized software or in-depth knowledge to access the network's advantages. Requests are of two types: fund transfers or data uploads. Each requires a one-time, low-cost fee in AR tokens. The miner that successfully adds the transaction to the network receives 5% of this fee as a reward, while the rest goes into a storage endowment for future use. In this article, we will focus on the data upload process.
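The fee split described above is simple arithmetic. The sketch below assumes fees are tracked as whole numbers of the smallest AR unit (winston, where 1 AR = 10^12 winston) and that any rounding remainder goes to the endowment; the function name and the rounding rule are illustrative assumptions.

```python
WINSTON_PER_AR = 10**12  # smallest unit of AR

def split_fee(fee_winston: int, miner_share_percent: int = 5):
    """Split a transaction fee between the miner and the storage endowment.

    Assumption: integer division rounds the miner's share down, and the
    remainder is credited to the endowment.
    """
    miner_reward = fee_winston * miner_share_percent // 100
    endowment = fee_winston - miner_reward
    return miner_reward, endowment

# Example: a hypothetical fee of 0.002 AR
miner, endowment = split_fee(2 * WINSTON_PER_AR // 1000)
```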
How can Arweave fit vast amounts of data in a single block?
When a request is made, it is divided into two parts: transaction headers (metadata) and data. The headers are added to the current block, while the data can trickle into future blocks and still maintain a verifiable link to the headers. This split allows an arbitrary amount of information to be added to Arweave through a single block. Data is cached until it is fully uploaded, which may span several blocks for large datasets.
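One way to picture the header/data split is a small header that commits to data arriving later. In the sketch below, the class names, the fields, and the use of a plain SHA-256 digest as the "verifiable link" are all simplifying assumptions, not Arweave's real transaction format.

```python
import hashlib
from dataclasses import dataclass, field

@dataclass
class TxHeader:
    tx_id: str
    data_size: int  # total bytes the data part will contain
    data_root: str  # commitment to the data (plain SHA-256 here, an assumption)

@dataclass
class PendingUpload:
    header: TxHeader
    cached: bytearray = field(default_factory=bytearray)

    def add_data(self, piece: bytes) -> None:
        """Cache a piece of data that arrived after the header."""
        self.cached.extend(piece)

    def is_complete(self) -> bool:
        """True once all bytes are present and match the header's commitment."""
        if len(self.cached) != self.header.data_size:
            return False
        return hashlib.sha256(bytes(self.cached)).hexdigest() == self.header.data_root

def make_header(tx_id: str, data: bytes) -> TxHeader:
    return TxHeader(tx_id, len(data), hashlib.sha256(data).hexdigest())
```

The header stays small regardless of how large the data is, which is what lets a single block reference very large uploads.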
Why do nodes on Arweave share new transaction headers with each other?
Nodes are incentivized to share new transaction headers with other nodes because, in return, they receive headers from those nodes. This mutual exchange means each node knows about more pending transactions, so whichever node mines the next block can include more data in it and earn a larger reward.
What is a Succinct Proof on Arweave?
Each hash computation (an attempt at solving the cryptographic puzzle) requires a corresponding storage proof drawn from random historical data (data stored on Arweave in previous blocks). This proof is called a succinct proof. It comprises a random 256 kB recall chunk and a Merkle path (similar to a file path for data chunks) pointing to that specific chunk in the network.
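The chunk-plus-Merkle-path idea can be sketched with a generic binary Merkle tree. This is a simplified illustration: the SHA-256 hash, the duplicate-last-node padding rule, and the function names are assumptions, and Arweave's real Merkle format differs in its details.

```python
import hashlib

def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def build_levels(chunks):
    """Return every level of the tree, leaves first, root level last."""
    levels = []
    level = [h(c) for c in chunks]
    while True:
        if len(level) > 1 and len(level) % 2 == 1:
            level = level + [level[-1]]  # pad odd levels by duplicating the last node
        levels.append(level)
        if len(level) == 1:
            return levels
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]

def merkle_path(levels, index):
    """Collect the sibling hashes needed to climb from one leaf to the root."""
    path = []
    for level in levels[:-1]:
        sibling = index ^ 1
        path.append((level[sibling], sibling < index))  # (hash, sibling_is_left)
        index //= 2
    return path

def verify(chunk, path, root):
    """Recompute the root from a chunk and its path, then compare."""
    node = h(chunk)
    for sibling, sibling_is_left in path:
        node = h(sibling + node) if sibling_is_left else h(node + sibling)
    return node == root

chunks = [bytes([i]) * 32 for i in range(5)]  # tiny stand-ins for 256 kB chunks
levels = build_levels(chunks)
root = levels[-1][0]
proof = merkle_path(levels, 4)
```

The proof stays small because the path grows with the logarithm of the number of chunks, not with the size of the dataset.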
How is data managed (stored) on Arweave?
All data on Arweave is stored in sequential 256 kB chunks. Additionally, the full dataset is partitioned into 3.6 TB intervals. Nodes can begin mining with only a selection of partitions but are incentivized to store the full dataset. Storing everything ensures that a node can readily produce succinct proofs for any random historical data requested by the network, boosting its chances of successfully mining a block and adding new information to the network.
Data from new requests is chunked at the time of transaction creation, before being posted to the network. The node receiving the request chunks the data and packs each chunk to its mining address (a node’s unique identifier on the network).
Once the transaction request has been accepted by the network, this node will act as a seed to share this new information with other nodes.
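The chunk and partition layout described above can be sketched as follows. The exact byte values are assumptions: the sketch treats 256 kB as 256 × 1024 bytes and 3.6 TB as 3.6 × 10^12 bytes, and the helper names are hypothetical.

```python
CHUNK_SIZE = 256 * 1024              # assumed: 256 kB read as 256 KiB
PARTITION_SIZE = 3_600_000_000_000   # assumed: 3.6 TB read as decimal terabytes

def split_into_chunks(data: bytes):
    """Cut data into sequential fixed-size chunks; the last one may be shorter."""
    return [data[i:i + CHUNK_SIZE] for i in range(0, len(data), CHUNK_SIZE)]

def partition_index(global_offset: int) -> int:
    """Which partition a byte offset in the full dataset falls into."""
    return global_offset // PARTITION_SIZE

chunks = split_into_chunks(b"\x00" * (2 * CHUNK_SIZE + 1))
```

Here `chunks` holds three pieces: two full chunks and a final one-byte chunk.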
What is the Verifiable Delay Function (VDF) on Arweave?
The Verifiable Delay Function (VDF) is a cryptographic clock that provides a seed (a unique identifier) to every node each second.
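A common way to build such a clock is a chain of hashes in which every step depends on the previous one, so the work cannot be parallelized. The sketch below is a toy under that assumption: the choice of SHA-256 and the iteration count are illustrative, and a real deployment would calibrate the count so that one step takes about one second.

```python
import hashlib

def vdf_advance(seed: bytes, iterations: int) -> bytes:
    """Hash the seed repeatedly; each round needs the previous round's output."""
    out = seed
    for _ in range(iterations):
        out = hashlib.sha256(out).digest()
    return out

def vdf_verify(seed: bytes, iterations: int, claimed: bytes) -> bool:
    """Naive verification: redo the chain and compare the result."""
    return vdf_advance(seed, iterations) == claimed

# Hypothetical iteration count, far too small for a real one-second step
next_seed = vdf_advance(b"previous seed", iterations=10_000)
```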
How do nodes on Arweave identify the random recall proof?
Nodes use the seed received from the VDF to locate a 100 MB recall range in each partition they store; this range may contain the proof needed for the correct hash computation. Each 100 MB range consists of 400 chunks (of 256 kB each), giving miners 400 tries per second for each stored partition to solve the cryptographic puzzle.
However, it is incredibly rare to find a solution in this first set of attempts. Hence, the provided seed also points to another 100 MB range in a random partition, which nodes may or may not have stored. Storing the entire dataset guarantees nodes access to this second set of 400 attempts per stored partition at computing the winning hash (solving the cryptographic puzzle).
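The mapping from a seed to a recall range can be sketched as below. The derivation itself (hashing the seed together with the partition number) is a hypothetical stand-in for whatever the protocol actually does. The sizes are assumptions too; note that 400 chunks per range works out exactly when 100 MB and 256 kB are read as binary units (100 × 1024 × 1024 and 256 × 1024 bytes).

```python
import hashlib

CHUNK_SIZE = 256 * 1024                # assumed binary units
RECALL_RANGE = 100 * 1024 * 1024       # assumed binary units
PARTITION_SIZE = 3_600_000_000_000     # assumed decimal terabytes

ATTEMPTS_PER_RANGE = RECALL_RANGE // CHUNK_SIZE  # one attempt per chunk

def recall_range_start(seed: bytes, partition: int) -> int:
    """Pick a chunk-aligned start offset inside a partition from the seed.

    Hypothetical derivation for illustration only.
    """
    digest = hashlib.sha256(seed + partition.to_bytes(8, "big")).digest()
    # Number of chunk-aligned positions where a full range still fits
    slots = (PARTITION_SIZE - RECALL_RANGE) // CHUNK_SIZE + 1
    offset_in_partition = (int.from_bytes(digest, "big") % slots) * CHUNK_SIZE
    return partition * PARTITION_SIZE + offset_in_partition

start = recall_range_start(b"vdf seed", partition=3)
```

Because every node derives the range from the same seed, no one can choose which data to be tested on; only nodes that actually store the range can attempt those hashes.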
How is Arweave incentivizing storage-based mining?
As part of Arweave's move away from energy-inefficient compute-based mining, the VDF limits nodes to a maximum of 800 attempts (200 MB of recall ranges) per second for each stored partition, as seen earlier. This makes expensive storage devices with read speeds above 200 MB/s per partition redundant.
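The cap translates directly into a read rate that a miner's storage needs to sustain. The short calculation below assumes 256 kB means 256 × 1024 bytes; the function name is hypothetical.

```python
CHUNK_SIZE = 256 * 1024        # assumed binary units
ATTEMPTS_PER_SECOND = 800      # per stored partition, from the text above

def useful_read_rate(partitions_stored: int) -> int:
    """Bytes per second a miner can put to use; faster reads add no attempts."""
    return partitions_stored * ATTEMPTS_PER_SECOND * CHUNK_SIZE

rate_mib = useful_read_rate(1) / (1024 * 1024)  # read rate for one partition, in MiB/s
```

For a single partition this gives 200 MiB/s, and the figure scales linearly with the number of partitions a node stores.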
Taking another step away from compute-based mining, computation on Arweave only requires general purpose hardware. Avoiding costly, energy-inefficient rigs offers all nodes an equal opportunity, paving the way for a more decentralized network.
What happens after new data has been accepted by the Arweave network?
Once a successful hash is computed, the corresponding proof is added to the block alongside the new information. Because the data was packed to the mining address of the node that received the request, other nodes can use that address to unpack the data and validate both the proof and the new information.
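Packing to a mining address and unpacking for validation can be pictured with a toy reversible transform. This is only a stand-in: the XOR keystream, the use of SHAKE-256, and the function names are assumptions, and Arweave's real packing scheme is different and deliberately more expensive to compute.

```python
import hashlib

def _keystream(mining_address: bytes, chunk_offset: int, length: int) -> bytes:
    # Toy keystream tied to one address and one chunk position (assumption)
    return hashlib.shake_256(mining_address + chunk_offset.to_bytes(8, "big")).digest(length)

def pack(chunk: bytes, mining_address: bytes, chunk_offset: int) -> bytes:
    """Encode a chunk so it is specific to one miner's address."""
    ks = _keystream(mining_address, chunk_offset, len(chunk))
    return bytes(a ^ b for a, b in zip(chunk, ks))

def unpack(packed: bytes, mining_address: bytes, chunk_offset: int) -> bytes:
    """XOR is its own inverse, so unpacking repeats the same operation."""
    return pack(packed, mining_address, chunk_offset)

def validate(packed: bytes, mining_address: bytes, chunk_offset: int, expected_hash: bytes) -> bool:
    """Unpack with the miner's address, then compare against the known chunk hash."""
    return hashlib.sha256(unpack(packed, mining_address, chunk_offset)).digest() == expected_hash

chunk = b"hello arweave"
packed = pack(chunk, b"miner-A", 7)
```

A validator that knows the miner's address and the expected chunk hash can call `validate` without ever having seen the unpacked data from that miner.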
Because mining depends on recalling stored data, nodes are incentivized to store new data as well. The gateways used to make transaction requests also serve as seeds for several nodes, sharing data from their cache and speeding up replication so that data redundancy is achieved promptly.
Arweave’s Sustainable Approach to Data Storage
Arweave's approach to data storage stands apart from compute-based mining. By shifting towards storage-incentivized mining mechanisms, Arweave not only ensures decentralized, permanent data storage but also promotes an eco-friendly approach to data preservation. In today's data-driven environment, Arweave provides a consistent and dependable storage solution.
Have more questions on Arweave? Our blogs on “What is Arweave?” and “How to interact with Arweave?” are a great starting point!
And make sure to follow Community Labs on X for more insights into this evolving ecosystem. Interested in exploring more or contributing an idea? Join us at the Community Labs Discord community!