Redundancy

I. Introduction

Arweave is engineered for long-term data storage: storage that can outlast our current understanding of data preservation. To achieve this, the data must be secure and redundant, with no single point of failure. This is accomplished through a unique mining algorithm that requires each storage node to prove it is storing as much of the blockweave as it can. Every block confirms transactions while proving that data is being stored. This is where SPoRA, or Succinct Proofs of Random Access, comes in.

II. Arweave’s protocol design

Arweave is a decentralized network of nodes that store data permanently. When a user uploads data, they pay a one-time fee. That file is then split into chunks, transmitted across the network, and replicated by nodes that voluntarily pull it after it’s confirmed in a block.

Miners are paid for hashing and proving they store real data. Mining is designed to create a natural incentive to hold and replicate as much of the dataset as possible.

III. Global merkle structure

All data uploaded to Arweave is broken into 256 KiB chunks and organized into Merkle trees. Each transaction includes the Merkle root of the upload. These transactions are included in blocks, which are also Merkleized. The entire history of the chain is committed to a structure called the Block Index, a top-level Merkle tree that allows any node to verify the existence and position of any byte in the weave.
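The chunking and tree-building described above can be sketched in a few lines. This is a toy model only: it uses plain SHA-256 and a simple binary tree, whereas Arweave's actual Merkle tree also labels nodes with byte offsets so that position proofs are possible.

```python
import hashlib

CHUNK_SIZE = 256 * 1024  # Arweave splits uploads into 256 KiB chunks


def sha256(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def chunk(data: bytes) -> list[bytes]:
    """Split an upload into 256 KiB chunks (the last one may be shorter)."""
    return [data[i:i + CHUNK_SIZE] for i in range(0, len(data), CHUNK_SIZE)]


def merkle_root(leaves: list[bytes]) -> bytes:
    """Fold leaf hashes pairwise up to a single root (toy binary tree)."""
    level = [sha256(leaf) for leaf in leaves]
    while len(level) > 1:
        if len(level) % 2:                 # odd count: carry the last node up
            level.append(level[-1])
        level = [sha256(level[i] + level[i + 1])
                 for i in range(0, len(level), 2)]
    return level[0]


upload = b"x" * (600 * 1024)               # a 600 KiB upload -> 3 chunks
root = merkle_root(chunk(upload))
print(len(chunk(upload)), root.hex()[:16])
```

The root is what a transaction commits to; any single chunk can later be proven against it with a logarithmic-size path of sibling hashes.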

Block Index

IV. SPoRA mining

Instead of traditional Proof-of-Work, Arweave uses Succinct Proofs of Random Access (SPoRA), a design where miners must prove they can access random parts of the dataset.

How SPoRA Works

SPoRA ties mining power directly to storage by challenging miners to prove they hold real data from the network.

Here’s how it works:

  1. Challenge byte unlocking: Every second, each miner unlocks 2 challenge bytes for every 3.6 TB partition of the weave they store. The rate of unlocking is controlled by a cryptographic time mechanism called a Verifiable Delay Function (VDF).

  2. Challenge byte generation: These 2 bytes are generated by hashing the VDF output and block metadata using the RandomX algorithm. One byte is selected from the miner’s current partition, and one is picked from anywhere in the weave.

  3. Data scanning: The miner reads 2.5 MiB of data starting at each challenge byte. They scan that segment in 8 KiB steps, generating 320 hashes per challenge byte. The figure below shows a Merkle proof generated for a challenge byte to prove data inclusion and position.

Merkle proof for a challenge byte

  4. Hash comparison: Each of those 320 hashes is checked against the network’s current difficulty. If any hash is higher than the target, the miner becomes eligible to build a block.

  5. Block validation: Winning the block isn’t automatic. The proposed block must be accepted by the majority of peers in the network. If enough nodes adopt it into the main chain, the miner wins the round.

This process repeats every second and ensures miners are continuously proving access to stored data.
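The steps above can be sketched as a single simplified mining round. This is a rough illustration, not the protocol: plain SHA-256 stands in for RandomX, a small in-memory byte string stands in for a 3.6 TB partition, and the VDF, second challenge byte, and block assembly are all omitted.

```python
import hashlib
import random

RECALL_RANGE = 2_560 * 1024   # 2.5 MiB read per challenge byte
STEP = 8 * 1024               # scanned in 8 KiB steps -> 320 hashes


def h(*parts: bytes) -> int:
    """Stand-in for RandomX: SHA-256, interpreted as a big integer."""
    return int.from_bytes(hashlib.sha256(b"".join(parts)).digest(), "big")


def spora_round(vdf_output: bytes, partition: bytes, difficulty: int) -> bool:
    """One simplified round over a single in-partition challenge byte."""
    # Steps 1-2: derive the challenge offset from the VDF output.
    offset = h(vdf_output) % (len(partition) - RECALL_RANGE)
    segment = partition[offset:offset + RECALL_RANGE]
    # Steps 3-4: hash the 2.5 MiB segment in 8 KiB steps,
    # comparing each of the 320 hashes against the difficulty target.
    for i in range(0, len(segment), STEP):
        if h(vdf_output, segment[i:i + STEP]) > difficulty:
            return True           # eligible to build a block
    return False


random.seed(0)
partition = random.randbytes(8 * RECALL_RANGE)  # tiny stand-in "partition"
print(spora_round(b"vdf-step-1", partition, 2**255))
```

Note that the fast part of the round is hashing; the rate-limiting step in practice is reading 2.5 MiB from disk, which is exactly what ties mining power to physical storage.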

V. Measuring redundancy

Mining power on Arweave is directly tied to how much of the dataset a miner can access. The more of the weave you store, the more chances you have to win block rewards. If you only store a portion of the weave, your chances drop off quickly.

To calculate replication across the network:

Average Replication Rate = Network Size / Weave Size

As of July 2025:

  • Network size: ~117 PB

  • Weave size: ~349 TB

  • Result: ~335 copies of each byte, on average
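The arithmetic behind that estimate, using the July 2025 figures quoted above (and decimal units, an assumption on my part):

```python
# Average replication = network size / weave size.
PB = 1000 ** 5
TB = 1000 ** 4

network_size = 117 * PB   # total storage pledged by miners
weave_size = 349 * TB     # total unique data in the weave

replicas = network_size / weave_size
print(round(replicas))
```

This prints 335: on average, every byte in the weave exists on roughly 335 independent disks.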

Miners are exponentially penalized for not storing full replicas. This happens in two key ways:

  1. Reduced access to challenge bytes: Every second, miners unlock two challenge bytes per 3.6 TB partition they store. The first challenge byte is always within the partition they hold. The second is drawn randomly from anywhere in the entire weave. If a miner only stores part of the weave, they will miss many second challenge bytes, losing those mining opportunities entirely.

  2. Difficulty scaling: The second challenge byte has a much lower difficulty target. It is 100 times easier to generate a valid block from this byte than from the first. That means miners who store full replicas have a significant advantage.
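Those two penalties can be combined into a back-of-the-envelope model. The assumptions are mine, not the protocol's exact reward function: a miner storing fraction `f` of the weave unlocks challenge pairs in proportion to `f`, always finds the first byte, finds the second byte with probability `f`, and the second byte counts 100x because its difficulty target is 100x easier.

```python
def relative_hashrate(f: float) -> float:
    """Toy model of expected mining rate for a miner storing fraction f
    of the weave, normalized so a full replica (f = 1.0) scores 1.0.
    Assumes challenge pairs unlock in proportion to partitions held (f),
    the first byte is always readable, and the second is readable with
    probability f but weighted 100x (its target is 100x easier)."""
    return f * (1 + 100 * f) / 101


for f in (0.25, 0.5, 1.0):
    print(f, round(relative_hashrate(f), 3))
```

Even in this crude model the penalty is superlinear: a miner holding half the weave earns roughly a quarter of the rewards, not half, which is why full replicas dominate.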

While the protocol penalizes miners who skip chunks of the weave, it also rewards those who store diverse, unique data across many partitions. This chart from the Arweave Lightpaper illustrates how storing unique data leads to exponential performance gains compared to storing repeated or copied data. The more partitions of the weave you store uniquely, the higher your mining hashrate scales.

Mining Hashrate

To stay competitive, some miners use pool software to collaborate and combine storage. Others operate coordinated mining clusters to ensure full weave coverage. Either way, the system strongly incentivizes storing as much of the dataset as possible.

VI. Mining strategies

Coordinated mining and pooling

Mining large portions of the dataset requires bandwidth, storage, and coordination. As the weave grows, miners have shifted from single-node operations to coordinated setups.

Coordinated mining allows multiple servers to mine using the same wallet address:

  • Regular nodes handle chunk storage and perform SPoRA challenges

  • One exit node submits winning block solutions

Only the exit node broadcasts blocks; this avoids double-signing, which would get the miner’s address slashed. Miners also use pooling software to share access to the weave and increase their chances of solving SPoRA challenges. This makes mining accessible to smaller participants and increases overall replication. Learn more about coordinated mining here.

Sacrifice mining

Sacrifice mining is another strategy where a miner pays to upload data but does not seed it to the public network. They post a transaction paying for their data but then keep the data private and use it to gain an advantage in SPoRA challenges, since no other miner has access to that data.

In practice, sacrifice mining has proven unprofitable:

  • The cost of upload fees for enough sacrificed data to meaningfully affect hashrate is high

  • Gains in mining rewards are marginal

  • Probability of the data being selected declines as the weave grows

Despite this, sacrifice mining is not harmful. The upload fees still support the endowment, and the data contributes to redundancy and economic security. No miners currently use this strategy, but it will be interesting to watch if it ever returns. Learn more about sacrifice mining here.

This thread from Sam also explains the sacrifice mining strategy in depth.

While there are multiple strategies for mining on Arweave and maintaining a secure, redundant network, there must also be economic incentives to preserve this data. That’s where the storage endowment model plays a critical role.

VII. Economic model that supports redundant data storage

Arweave storage endowment flow

Arweave ensures sustainability through its storage endowment.

When a user uploads data:

  • 5 percent of the payment is paid to miners immediately.

  • 90 percent is locked in the endowment and distributed slowly over time based on data availability.
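As a sketch of the split described above, using the article's 5/90 figures (the exact protocol split and the fate of the remainder vary by protocol version, so this is illustrative only):

```python
def split_fee(fee_winston: int) -> tuple[int, int]:
    """Split an upload fee per the figures above: 5% paid to miners
    immediately, 90% locked in the storage endowment for slow release.
    The remaining 5% is deliberately left unallocated in this sketch.
    (Winston is Arweave's smallest unit: 1 AR = 10**12 winston.)"""
    to_miner = fee_winston * 5 // 100
    to_endowment = fee_winston * 90 // 100
    return to_miner, to_endowment


print(split_fee(1_000_000_000_000))  # splitting a 1 AR fee
```

The key point is the ratio: the vast majority of every fee is deferred, funding storage rewards for decades rather than being spent at upload time.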

This creates a self-adjusting incentive model:

  • If fewer replicas are online, miners receive more from the endowment.

  • If more than 20 replicas exist, rewards stabilize or decrease.

The endowment is designed to last for hundreds of years, supported by the assumption that storage costs will continue to fall over time. Even during the current emission phase, miners are already storing well above the minimum redundancy threshold.

More on Arweave’s storage endowment.

VIII. Conclusion

As an everyday user, you might not think about the network stats behind data replication. But having your files spread across hundreds of nodes is key to long-term permanence. As the protocol matures, expect more innovations that cut costs, attract new nodes, and push the system toward greater decentralization.

If you want to go deeper, the Arweave Lightpaper is a great place to start. It breaks down the protocol’s architecture and how it all works under the hood. You can also see more network statistics here.

For more info on how to upload data to Arweave, check out these tools:

Further Reading


Screenshots taken from the Arweave Lightpaper.