Development of Blockchain-Based Content Caching System
The typical request sounds like: "we want to store content decentrally, but it needs to load fast." The contradiction is baked into the request itself: IPFS is slow, Arweave is slow, Filecoin is slow, and blockchains are not designed for storing large volumes of data. A blockchain-based content caching system is not "store content in the blockchain"; it is "use the blockchain to manage a distributed cache": the content itself lives in decentralized storage, while access rights, cache state, and caching economics live on-chain.
Architectural Model: What Stores Where
Correct separation of concerns:
Content (files, video, data) → IPFS / Arweave / Filecoin
Metadata and CID → On-chain or calldata
Access Rights and DRM → Smart contracts
Caching Economics → Token incentives (on-chain)
CDN / edge cache → Traditional infrastructure or decentralized (Fleek, Spheron)
Blockchain is not a database for content. Putting 1 MB into Ethereum mainnet calldata burns ~16.8M gas (16 gas per non-zero byte), roughly 0.5 ETH at 30 gwei: over $1,500 at a $3,000 ETH price, and more when gas spikes. EIP-4844 blobs are far cheaper (on the order of $0.01 per 128 KB blob at low blob-gas prices), but blob data is only guaranteed to be available for ~18 days.
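A back-of-the-envelope version of that calldata estimate. All parameters here are assumptions for illustration (16 gas per non-zero calldata byte per EIP-2028, 30 gwei gas price, $3,000 per ETH); the dollar figure scales linearly with gas price and ETH price:

```typescript
// Cost of 1 MiB of Ethereum calldata, under the stated assumptions.
const BYTES = 1_048_576;      // 1 MiB
const GAS_PER_BYTE = 16;      // non-zero calldata byte (EIP-2028)
const GAS_PRICE_GWEI = 30;
const ETH_USD = 3_000;        // assumed ETH price

const gas = BYTES * GAS_PER_BYTE;          // 16,777,216 gas
const eth = (gas * GAS_PRICE_GWEI) / 1e9;  // gwei -> ETH
const usd = eth * ETH_USD;

console.log(`${gas} gas ≈ ${eth.toFixed(3)} ETH ≈ $${usd.toFixed(0)}`);
```

In practice a transaction this size also collides with block gas and transaction size limits, which is another reason content never goes into calldata wholesale.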
Content Registry: Management Contract
The central element is a content registry that tracks what is stored where and who has rights to it:
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.20;

contract ContentRegistry {
    struct ContentItem {
        bytes32 contentId;       // keccak256(original URL or UUID)
        string ipfsCid;          // IPFS CID (CIDv1, base32)
        string arweaveTxId;      // optional: Arweave copy for permanent storage
        address publisher;
        uint256 publishedAt;
        uint256 size;            // in bytes
        ContentType contentType;
        AccessModel accessModel;
        bool active;
    }

    enum ContentType { Image, Video, Document, Data, Code }
    enum AccessModel { Public, TokenGated, Subscription, PaidPerView }

    mapping(bytes32 => ContentItem) public content;
    mapping(bytes32 => mapping(address => bool)) public accessGrants;

    event ContentPublished(bytes32 indexed contentId, string ipfsCid, address publisher);
    event ContentAccessed(bytes32 indexed contentId, address user, uint256 timestamp);

    function publishContent(
        bytes32 contentId,
        string calldata ipfsCid,
        string calldata arweaveTxId,
        uint256 size,
        ContentType contentType,
        AccessModel accessModel
    ) external {
        require(content[contentId].publisher == address(0), "Already exists");
        content[contentId] = ContentItem({
            contentId: contentId,
            ipfsCid: ipfsCid,
            arweaveTxId: arweaveTxId,
            publisher: msg.sender,
            publishedAt: block.timestamp,
            size: size,
            contentType: contentType,
            accessModel: accessModel,
            active: true
        });
        emit ContentPublished(contentId, ipfsCid, msg.sender);
    }
}
Token-Gated Access
Using ERC-721 or ERC-1155 tokens as passes is the standard pattern for NFT-gated content:
interface IAccessController {
    function hasAccess(bytes32 contentId, address user) external view returns (bool);
}

// Minimal ERC-721 interface (or import from OpenZeppelin)
interface IERC721 {
    function balanceOf(address owner) external view returns (uint256);
    function ownerOf(uint256 tokenId) external view returns (address);
}

contract NFTGatedAccess is IAccessController {
    ContentRegistry public registry;

    mapping(bytes32 => address) public contentGates;    // contentId => NFT contract
    mapping(bytes32 => uint256) public requiredTokenId; // 0 = any token from the collection
                                                        // (caveat: a real token with id 0 cannot be required under this convention)

    constructor(ContentRegistry _registry) {
        registry = _registry;
    }

    function hasAccess(bytes32 contentId, address user) external view override returns (bool) {
        // The auto-generated getter for a public mapping of structs returns a tuple,
        // not the struct, so destructure only the fields we need.
        (, , , address publisher, , , , ContentRegistry.AccessModel accessModel, ) = registry.content(contentId);
        if (accessModel == ContentRegistry.AccessModel.Public) return true;

        address gateContract = contentGates[contentId];
        if (gateContract == address(0)) return publisher == user;

        IERC721 nft = IERC721(gateContract);
        uint256 tokenId = requiredTokenId[contentId];
        if (tokenId == 0) {
            return nft.balanceOf(user) > 0; // any token from the collection grants access
        } else {
            return nft.ownerOf(tokenId) == user;
        }
    }
}
Decentralized CDN with Token Incentives
The idea: cache nodes earn rewards for storing and serving content. This is Filecoin's model, but applied to a hot cache (fast access) rather than cold storage.
Cache Node Registry
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.20;

contract CacheNetwork {
    struct CacheNode {
        address operator;
        string endpoint;         // URL of the node's API
        uint256 stake;           // stake required for participation
        uint256 bandwidthServed; // bytes served by the node
        uint256 reputationScore;
        bool active;
    }

    struct CacheJob {
        bytes32 contentId;
        address requester;
        uint256 rewardPerGB;
        uint256 duration; // seconds
        uint256 deadline;
        uint256 escrow;   // funds locked for cache node payouts
    }

    mapping(address => CacheNode) public nodes;
    mapping(bytes32 => CacheJob) public jobs;

    uint256 public constant MIN_STAKE = 0.1 ether;

    function registerNode(string calldata endpoint) external payable {
        require(msg.value >= MIN_STAKE, "Insufficient stake");
        nodes[msg.sender] = CacheNode({
            operator: msg.sender,
            endpoint: endpoint,
            stake: msg.value,
            bandwidthServed: 0,
            reputationScore: 100,
            active: true
        });
    }

    function postCacheJob(
        bytes32 contentId,
        uint256 rewardPerGB,
        uint256 duration
    ) external payable {
        require(msg.value > 0, "No escrow");
        jobs[contentId] = CacheJob({
            contentId: contentId,
            requester: msg.sender,
            rewardPerGB: rewardPerGB,
            duration: duration,
            deadline: block.timestamp + duration,
            escrow: msg.value // escrow for cache node payments
        });
    }
}
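The payout rule the contract implies (rewardPerGB pro rata to bytes served) can be sketched in integer arithmetic, mirroring how a hypothetical settlement function would compute it on-chain:

```typescript
const GB = 1_000_000_000n; // 1 GB in bytes, the contract's pricing unit

// floor(bytesServed * rewardPerGB / GB), all amounts in wei
function bandwidthReward(bytesServed: bigint, rewardPerGBWei: bigint): bigint {
  return (bytesServed * rewardPerGBWei) / GB;
}

// Example: 2.5 GB served at 0.0001 ETH per GB
const reward = bandwidthReward(2_500_000_000n, 100_000_000_000_000n);
// reward = 250_000_000_000_000n wei = 0.00025 ETH
```

Doing the division last, as here, avoids rounding bytes down to whole gigabytes before multiplying.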
Proof of Bandwidth: How to Prove Node Served Content
Main challenge: verifying off-chain work on-chain. Two common approaches:
Challenge-response (optimistic): the node claims it delivered X GB. A challenger can dispute the claim and demand proof; the node must then produce a log of signed user requests (a merkle tree of request receipts). If it cannot, its stake is slashed.
Signed receipts: each user, upon receiving a file, signs a receipt (timestamp + content hash + user address). The node accumulates receipts and periodically publishes a merkle root on-chain, which allows bandwidth to be attributed fairly.
// Off-chain logic of a cache node
import { ethers } from "ethers";
import { MerkleTree } from "merkletreejs";
import keccak256 from "keccak256";

// ethers.Contract instance bound to the CacheNetwork contract
declare const cacheContract: ethers.Contract;

interface ServingReceipt {
  contentId: string;     // bytes32 hex
  userAddress: string;
  bytesServed: number;
  timestamp: number;
  userSignature: string; // user's signature over the receipt
}

async function claimBandwidthReward(receipts: ServingReceipt[]) {
  // Assemble a merkle tree from the receipts
  const leaves = receipts.map(r =>
    ethers.keccak256(ethers.AbiCoder.defaultAbiCoder().encode(
      ["bytes32", "address", "uint256", "uint256"],
      [r.contentId, r.userAddress, r.bytesServed, r.timestamp]
    ))
  );
  const tree = new MerkleTree(leaves, keccak256, { sortPairs: true });
  const root = tree.getHexRoot();
  const totalBytesServed = receipts.reduce((sum, r) => sum + r.bytesServed, 0);

  // Publish the root on-chain
  await cacheContract.submitBandwidthClaim(root, totalBytesServed, receipts.length);
}
IPFS Pinning and Content Availability
IPFS without a pinning service is unreliable: content is evicted from nodes' local caches by garbage collection. A production system needs explicit pinning.
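Pinning on your own node is a single HTTP call against Kubo's RPC API; a sketch assuming a local node on the default port:

```typescript
// Kubo RPC endpoint for pinning: POST /api/v0/pin/add?arg=<cid>
function pinUrl(apiBase: string, cid: string): string {
  return `${apiBase}/api/v0/pin/add?arg=${encodeURIComponent(cid)}`;
}

async function pinCid(cid: string, apiBase = "http://127.0.0.1:5001"): Promise<void> {
  const res = await fetch(pinUrl(apiBase, cid), { method: "POST" });
  if (!res.ok) throw new Error(`pin failed: ${res.status}`);
}
```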
Decentralized Pinning via Smart Contract
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.20;

contract PinningMarket {
    struct PinRequest {
        string cid;
        address requester;
        uint256 payment;           // total payment held in escrow
        uint256 replicationFactor; // how many nodes should store the file
        uint256 duration;
        uint256 activeUntil;
        address[] pinners;         // who is pinning
    }

    mapping(bytes32 => PinRequest) public requests;

    function requestPin(
        string calldata cid,
        uint256 replicationFactor,
        uint256 duration
    ) external payable {
        bytes32 requestId = keccak256(abi.encodePacked(cid, msg.sender, block.timestamp));
        requests[requestId] = PinRequest({
            cid: cid,
            requester: msg.sender,
            payment: msg.value,
            replicationFactor: replicationFactor,
            duration: duration,
            activeUntil: block.timestamp + duration,
            pinners: new address[](0)
        });
    }

    function acceptPin(bytes32 requestId) external {
        PinRequest storage req = requests[requestId];
        require(req.pinners.length < req.replicationFactor, "Fully replicated");
        // The node takes on the pinning task
        req.pinners.push(msg.sender);
    }

    // Periodically each node proves the file is still available (proof of storage)
    function submitStorageProof(bytes32 requestId, bytes calldata proof) external {
        // e.g. challenge-response: a merkle proof over a randomly selected chunk
        _verifyStorageProof(requestId, proof);
        // Unlock a portion of the escrowed payment
        _releasePartialPayment(requestId, msg.sender);
    }

    function _verifyStorageProof(bytes32 requestId, bytes calldata proof) internal view {
        // verification logic elided in this sketch
    }

    function _releasePartialPayment(bytes32 requestId, address pinner) internal {
        // pro-rata payout from the request's escrow, elided in this sketch
    }
}
Off-the-shelf options: Filecoin/Estuary for long-term storage, web3.storage (now Storacha), Pinata, or NFT.Storage with an API. A custom implementation is justified only when you need fine-grained control over the economics.
Content Addressing and Deduplication
IPFS CIDv1 is content-addressed: identical data yields an identical CID, so deduplication happens automatically at the storage level. But chunking needs to be done right:
import { create } from "ipfs-http-client";

// Hosted endpoints (e.g. Infura) require auth headers;
// a local Kubo node at http://127.0.0.1:5001 works without them.
const ipfs = create({ url: "https://ipfs.infura.io:5001" });

async function uploadWithChunking(data: Buffer): Promise<string> {
  const result = await ipfs.add(data, {
    chunker: "rabin-262144-524288-1048576", // rabin chunking (min-avg-max sizes) for better dedup
    cidVersion: 1,
    hashAlg: "sha2-256",
  });
  return result.cid.toString();
}
Rabin chunking splits files at content-defined boundaries, so a single change to a file alters only a few chunks rather than re-chunking the entire file. This matters for large files with incremental updates.
Performance: Hybrid Architecture
A real application needs a fast access layer on top of the decentralized storage:
User
↓
Edge CDN (Cloudflare / Akamai) — hot cache, <100ms
↓ cache miss
IPFS Gateway cluster (own nodes) — warm cache, <1s
↓ cache miss
IPFS Network / Arweave — cold storage, 2-30s
The on-chain component: the user requests access → the smart contract verifies their rights → the backend issues a signed URL or access token → the client hits the CDN with that token.
Development Stack
| Component | Technology |
|---|---|
| Content storage | IPFS (Kubo) + Arweave for permanence |
| Pinning | web3.storage API or Estuary |
| Registry contract | Solidity + Foundry |
| Access control | ERC-721 gating + Lit Protocol for encryption |
| Edge cache | Cloudflare Workers + R2 |
| Bandwidth proof | Merkle receipts + optimistic verification |
| Node SDK | TypeScript + helia (new IPFS JS) |
When This Makes Sense
The system is justified when:
- You need censorship resistance (no centralized party can take the content down)
- Rights holders want on-chain verifiable rights and automatic royalties
- You need transparent economics for cache node operators
If the task is just "serve files fast," Cloudflare + S3 suffices.
Timeline
An MVP with the registry, IPFS storage, and token-gated access takes 4–6 weeks. A full system with incentivized cache nodes, bandwidth proofs, and governance takes 3–4 months.