Decentralized Storage: Blockchain Storage Economics

Published on: 24 April, 2021

Data security and integrity are a central pain point for institutions under current market obligations. Users once kept data in cold storage such as hard drives, USB drives or compact discs, but the advent of cloud storage has changed things considerably. A straightforward incentive structure and a simple supply-demand model have driven the growth of a multi-billion dollar data storage industry.

Centralized cloud storage platforms store data for governments, corporations, intelligence agencies and others. Server space is leased out in smaller units by organizations like Dropbox, which lease servers from the likes of AWS and provide storage space to end-users. This method of cloud storage requires users to place their trust in central data hubs, leaving room for malfeasance such as censorship and leaks.

The Bitcoin whitepaper by Satoshi Nakamoto proposed a decentralized ledger stored simultaneously on nodes dispersed across the internet. Decentralized data storage is an extrapolation of this idea: instead of storing transaction records, the blocks can contain other forms of content. Storage on most user devices is seldom at capacity, and in many cases around half of the space is left empty. Blockchain technology can be leveraged to use this spare space as a decentralized storage system.

What is Blockchain Storage Economics?

Data stored on a blockchain can be divided into two parts. The first is the historical transaction record, a public ledger that is stored on every full node of the blockchain. It is typically kept in hash tables using the key-value database paradigm. The second is state data, which refers to the combination of account balances, the smart contract code on which decentralized applications (dApps) run, and the data stored by these dApps. The current state is stored in tree-shaped data structures known as tries (digital trees).
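The split between history and state can be sketched in a few lines. This is a simplified illustration, not how any particular client stores data: the names `history`, `record_transaction` and `state_root` are hypothetical, SHA-256 stands in for the chain's real hash function, and a plain binary Merkle tree stands in for the trie a network like Ethereum uses.

```python
import hashlib
import json

# Historical record: a key-value store mapping transaction hash -> transaction.
history = {}

def record_transaction(tx: dict) -> str:
    """Store a transaction under the hash of its canonical JSON encoding."""
    encoded = json.dumps(tx, sort_keys=True).encode()
    tx_hash = hashlib.sha256(encoded).hexdigest()
    history[tx_hash] = tx
    return tx_hash

def merkle_root(leaves: list) -> str:
    """Fold a list of byte strings into a single root hash."""
    if not leaves:
        return hashlib.sha256(b"").hexdigest()
    level = [hashlib.sha256(leaf).digest() for leaf in leaves]
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # duplicate the last node on odd levels
        level = [
            hashlib.sha256(level[i] + level[i + 1]).digest()
            for i in range(0, len(level), 2)
        ]
    return level[0].hex()

def state_root(balances: dict) -> str:
    """Summarise the current state (here only balances) as one hash."""
    leaves = [f"{acct}:{amount}".encode() for acct, amount in sorted(balances.items())]
    return merkle_root(leaves)

# State data changes with every transaction; history only grows.
balances = {"alice": 10, "bob": 5}
tx = {"from": "alice", "to": "bob", "amount": 3}
tx_hash = record_transaction(tx)
root_before = state_root(balances)
balances["alice"] -= 3
balances["bob"] += 3
root_after = state_root(balances)
```

The transaction stays retrievable by its hash, while the state root changes whenever a balance changes, which is why the two kinds of data are stored and paid for differently.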

When a user generates a new transaction, they have to attach a transaction fee to incentivize the miners to include it in a new block. Miners compete to mine the new block, and the winner adds it to the chain and collects the transaction fees paid by the users. In addition to that, they receive newly issued tokens as a mining bonus. A node joining the network has to download all the historical transaction data from a peer, and as the blockchain grows in market capitalization and attracts more miners, solving the hash puzzle requires more processing power.
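The mining competition can be illustrated with a toy proof-of-work loop. This is a minimal sketch under stated assumptions: the function names `mine` and `miner_revenue` are hypothetical, the difficulty is expressed as leading zero hex digits of a SHA-256 hash, and real networks use more elaborate block headers and difficulty targets.

```python
import hashlib

def mine(block_data: str, difficulty: int):
    """Try nonces until the block hash starts with `difficulty` zero hex digits."""
    prefix = "0" * difficulty
    nonce = 0
    while True:
        digest = hashlib.sha256(f"{block_data}{nonce}".encode()).hexdigest()
        if digest.startswith(prefix):
            return nonce, digest
        nonce += 1

def miner_revenue(fees: list, block_reward: float) -> float:
    """The winning miner collects all transaction fees plus the block reward."""
    return sum(fees) + block_reward

# Each extra zero digit multiplies the expected number of attempts by 16.
nonce, digest = mine("alice->bob:3", difficulty=3)
revenue = miner_revenue([0.1, 0.2], block_reward=6.25)
```

Raising `difficulty` by one makes the search roughly sixteen times longer on average, which is the sense in which mining demands more processing power as competition grows.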

A commercially viable blockchain needs transaction fees low enough that users keep transacting on the network, while miners need to be incentivized enough, through fees and token bonuses, to keep mining new blocks. Under these constraints, storing large amounts of data on-chain with current technology is economically unfeasible. However, more powerful processors are introduced regularly and operating costs keep falling, so decentralized storage may become technologically and economically feasible in the near future.

Different Types of Storage Costs

Using dApps means deploying smart contracts and producing new blocks. The user initiates a transaction and the data is sent to a smart contract, which is identified by a cryptographically derived address. Running the deployed code incurs a transaction fee on the user end. The Ethereum protocol runs smart contracts in the Ethereum Virtual Machine (EVM). Contracts are compiled to bytecode, and the individual instructions in that bytecode are called EVM opcodes. Executing each opcode consumes a set amount of gas, the unit in which transaction fees are measured. This is the explicit cost of decentralized storage. The gas limit is set by the user as the maximum amount of gas they are willing to spend on a transaction. The infrastructure cost of storage space on the nodes of the blockchain has to be offset by these transaction fees.
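A rough calculation shows why this explicit cost matters. The sketch below is illustrative only: the constants are approximate figures from Ethereum's gas schedule (about 21,000 gas for a basic transaction and about 20,000 gas to write one new 32-byte storage slot), they change with network upgrades, and the calculation ignores calldata and contract execution overhead. The 50 gwei gas price and the function names are assumptions chosen for the example.

```python
GWEI_PER_ETH = 10**9

BASE_TX_GAS = 21_000          # approximate intrinsic gas of a basic transaction
SSTORE_NEW_SLOT_GAS = 20_000  # approximate gas to write one new storage slot
SLOT_BYTES = 32               # size of one EVM storage slot

def storage_gas(num_bytes: int) -> int:
    """Estimate the gas needed to write `num_bytes` into contract storage."""
    slots = -(-num_bytes // SLOT_BYTES)  # ceiling division
    return BASE_TX_GAS + slots * SSTORE_NEW_SLOT_GAS

def fee_in_eth(gas_used: int, gas_price_gwei: float) -> float:
    """Fee = gas used x gas price, converted from gwei to ETH."""
    return gas_used * gas_price_gwei / GWEI_PER_ETH

def within_gas_limit(gas_needed: int, gas_limit: int) -> bool:
    """A transaction needing more gas than the user's limit runs out of gas."""
    return gas_needed <= gas_limit

gas_for_1kb = storage_gas(1024)                       # 21,000 + 32 * 20,000
fee_for_1kb = fee_in_eth(gas_for_1kb, gas_price_gwei=50)
```

Under these assumptions, a single kilobyte costs 661,000 gas, or about 0.033 ETH at 50 gwei, which makes clear why bulk data is not stored directly in contract storage.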

Miners are compensated for producing new blocks through the transaction costs charged on the user end of the contract. These costs are charged mainly for storing and manipulating state data rather than for storing the historical record. A full node has to keep a record of all historical transactions to validate even a basic transaction. The blockchain therefore keeps growing, and holding the current state data in RAM becomes an IO latency problem. For example, as the chain gains accounts and more complex contracts are deployed, the state data grows in size. A node operator can allocate only a certain percentage of RAM to state data, yet validating a new transaction on a protocol like Ethereum requires the state data to be accessible. Most of the data therefore sits in disk storage, and loading it takes a lot of time and processing power. This is the implicit cost of data storage in decentralized systems.
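The effect of state growth on IO can be sketched with a small cache simulation. Everything here is hypothetical: the `StateCache` class, the uniform random access pattern and the sizes are chosen only to show the trend, and real clients use far more sophisticated caching.

```python
import random
from collections import OrderedDict

class StateCache:
    """A fixed-size LRU cache standing in for the RAM reserved for state data."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self.entries = OrderedDict()
        self.disk_reads = 0

    def read(self, account: int) -> None:
        if account in self.entries:
            self.entries.move_to_end(account)     # cache hit, served from RAM
            return
        self.disk_reads += 1                      # cache miss, load from disk
        self.entries[account] = True
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)      # evict least recently used

def disk_read_ratio(num_accounts: int, cache_capacity: int,
                    num_reads: int = 10_000, seed: int = 0) -> float:
    """Fraction of state reads that have to go to disk."""
    rng = random.Random(seed)
    cache = StateCache(cache_capacity)
    for _ in range(num_reads):
        cache.read(rng.randrange(num_accounts))
    return cache.disk_reads / num_reads

small_chain = disk_read_ratio(num_accounts=1_000, cache_capacity=1_000)
large_chain = disk_read_ratio(num_accounts=100_000, cache_capacity=1_000)
```

With the cache size held fixed, the larger state sends a much bigger share of reads to disk, which is the latency cost described above.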

The Need for Decentralized Storage and Communications

Data storage in today’s market structure is centralized in nature. The client-server architecture of modern internet applications makes centralized data storage a very lucrative business. Companies set up server space up front and then rent it out to prospective internet applications. This centralized structure has a lot of drawbacks for the end-user. The primary one is the ability of the server host to control the fate of the data stored. The host can decide to censor the data, and the user will have no recourse. Maintaining the security of digital identities is important in today’s era of data capitalization. However, data stored on central servers leaves room for hacks and subsequent misuse of private information.

Another problem with data centralization is scaling and latency. Suppose someone has uploaded popular content that generates a lot of traffic. Hosting the content on a single server node will not suffice in this situation, because servers need to be physically close to end-users to alleviate IO latency issues. Hosting content on multiple servers all over the world creates a bottleneck for mid-level content creators: serving content simultaneously from different geographic locations requires a lot of money, and only a handful of projects have the capital to set up such a system.

Decentralized storage applications can be the solution to many of these issues through a trustless, peer-to-peer transfer of resources. In a decentralized system, there is no central authority ratifying the transaction, which removes the single point of attack behind censorship, de-platforming and espionage problems.

RIF’s Approach to Data Storage: RIF Storage & Pinning

RIF Storage is a unified protocol for a decentralized storage solution. It is developed on the RSK smart contract protocol, which is built on top of the Bitcoin blockchain. The goal is to answer the problems of a centralized system. It needs to be censorship-resistant and capable of peer-to-peer, permissionless transfer. Maintaining anonymity is also very important in a decentralized trustless transfer protocol. Nodes in the network are bound to fail sometimes, but the integrity of the data has to be maintained nonetheless. Lastly, the economics have to be profitable in a market setup: uploading data needs to be easy, and storing it needs to be cost-effective for the user and profitable for the small home provider. The RIF Storage protocol is built to be a replacement for centralized storage solutions.

The Architecture of RIF Storage

The guiding principle behind the architecture of RIF Storage is a protocol that brings together the supply of and demand for storage space in a secure and trustless system. Users who want to provide storage have to register their offer, their payment structure and the storage system they use. Customers browse through these storage plans and choose the one that is right for them. The exchange runs through a smart contract layer, which makes the transaction a decentralized process with no intermediary. The provider presents an offer and the consumer enters into an agreement based on the contract requirements. The offer includes parameters for payment plans, and at the end of any payment period the provider of the storage space may reclaim the space.
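The offer and agreement flow described above can be sketched as a small data model. This is a hypothetical illustration, not the RIF Storage smart contracts: the class names, fields and methods are invented for the example, and an in-memory object stands in for the smart contract layer.

```python
from dataclasses import dataclass

@dataclass
class Offer:
    provider: str
    free_gb: int
    price_per_gb_per_period: float

@dataclass
class Agreement:
    consumer: str
    offer: Offer
    size_gb: int
    periods_paid: int = 0
    active: bool = True

class StorageMarket:
    """In-memory stand-in for the contract layer that matches offers and consumers."""

    def __init__(self) -> None:
        self.offers = []

    def register_offer(self, offer: Offer) -> None:
        self.offers.append(offer)

    def browse(self, size_gb: int) -> list:
        """Return offers with enough free space, cheapest first."""
        fitting = [o for o in self.offers if o.free_gb >= size_gb]
        return sorted(fitting, key=lambda o: o.price_per_gb_per_period)

    def agree(self, consumer: str, offer: Offer, size_gb: int) -> Agreement:
        if offer.free_gb < size_gb:
            raise ValueError("offer does not have enough free space")
        offer.free_gb -= size_gb
        return Agreement(consumer, offer, size_gb)

    def pay_period(self, agreement: Agreement) -> float:
        """Record one paid period and return the amount due for it."""
        if not agreement.active:
            raise ValueError("agreement has ended")
        agreement.periods_paid += 1
        return agreement.size_gb * agreement.offer.price_per_gb_per_period

    def reclaim(self, agreement: Agreement) -> None:
        """At the end of a payment period the provider may take the space back."""
        if agreement.active:
            agreement.active = False
            agreement.offer.free_gb += agreement.size_gb

market = StorageMarket()
market.register_offer(Offer("provider-a", free_gb=100, price_per_gb_per_period=0.5))
market.register_offer(Offer("provider-b", free_gb=20, price_per_gb_per_period=0.25))
best = market.browse(size_gb=10)[0]
agreement = market.agree("consumer-1", best, size_gb=10)
payment = market.pay_period(agreement)
```

In this sketch the consumer picks the cheapest fitting offer, pays per period, and the space returns to the provider's pool once the agreement is reclaimed.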

The storage protocol can be broadly divided into the following sub-layers:

1 – The outer layer or the user side of the protocol

This layer consists of services that use the RIF Storage protocol and libraries. Its main aim is to build infrastructure that eases access to RIF Storage. The user-facing layer consists mostly of third-party solutions. Nonetheless, the developers have built three services that are essential for the efficacy of the service. First, the storage gateway allows users to interact with RIF Storage without hosting nodes, for a small price. Second, the pinning service allows data to persist in the system even when the initial node storing the data goes offline. Third, the node manager allows node operators to maintain and check on their storage in a seamless manner.

2 – API layer or the layer facing the developers

There are multiple storage providers, and RIF Storage integrates with them seamlessly. The API layer is for developers, who can use any of the storage providers and even switch between them along the way. The RIF Storage JavaScript library can be integrated easily into any dApp or Node.js project.
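The idea of one API over interchangeable providers can be sketched as follows. This is not the RIF Storage JavaScript library or its API: the `StorageProvider`, `InMemoryProvider` and `StorageClient` names are hypothetical, written in Python for illustration, and the in-memory providers merely stand in for backends such as Swarm or IPFS.

```python
import hashlib
from abc import ABC, abstractmethod

class StorageProvider(ABC):
    """Common interface that every storage backend is assumed to implement."""

    @abstractmethod
    def put(self, data: bytes) -> str: ...

    @abstractmethod
    def get(self, key: str) -> bytes: ...

class InMemoryProvider(StorageProvider):
    """Stand-in backend that keeps blobs in a dict, keyed by content hash."""

    def __init__(self, name: str) -> None:
        self.name = name
        self._blobs = {}

    def put(self, data: bytes) -> str:
        key = hashlib.sha256(data).hexdigest()
        self._blobs[key] = data
        return key

    def get(self, key: str) -> bytes:
        return self._blobs[key]

class StorageClient:
    """Application code talks to this client, not to a specific provider."""

    def __init__(self, provider: StorageProvider) -> None:
        self.provider = provider

    def save(self, data: bytes) -> str:
        return self.provider.put(data)

    def load(self, key: str) -> bytes:
        return self.provider.get(key)

    def switch_provider(self, new_provider: StorageProvider, keys: list) -> None:
        """Copy the listed blobs to the new backend, then route calls to it."""
        for key in keys:
            new_provider.put(self.provider.get(key))
        self.provider = new_provider

swarm_like = InMemoryProvider("swarm-like")
ipfs_like = InMemoryProvider("ipfs-like")
client = StorageClient(swarm_like)
key = client.save(b"hello")
client.switch_provider(ipfs_like, [key])
restored = client.load(key)
```

Because the application only depends on the shared interface, moving from one backend to another does not change the calling code.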

3 – Storage layer

Storage providers like Swarm and IPFS are integrated with RIF Storage so that clients can store data with either and even move it between them. The original code of these platforms is being strengthened, and the integration with RIF Storage aims to make them more robust and secure.

RIF has also recently introduced RIF Pinning, which enables a user to pay other computers on IPFS to pin their files. It also allows providers of IPFS pinning services to list their services and set their price per GB per month. Simply put, it is a file hosting solution built on decentralized tech.
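A small model shows what pinning buys the user. The sketch is hypothetical: `PinningNode`, `is_available` and the price of 0.10 per GB per month are invented for illustration and do not reflect actual RIF Pinning contracts or prices.

```python
class PinningNode:
    """A node that keeps (pins) content and charges per GB per month."""

    def __init__(self, name: str, price_per_gb_month: float) -> None:
        self.name = name
        self.price_per_gb_month = price_per_gb_month
        self.online = True
        self.pinned = set()

    def pin(self, content_id: str, size_gb: float) -> float:
        """Pin the content and return the monthly fee for holding it."""
        self.pinned.add(content_id)
        return size_gb * self.price_per_gb_month

def is_available(content_id: str, nodes: list) -> bool:
    """Content is retrievable while at least one online node has it pinned."""
    return any(node.online and content_id in node.pinned for node in nodes)

uploader = PinningNode("uploader", price_per_gb_month=0.0)
pinner = PinningNode("pinning-provider", price_per_gb_month=0.10)

uploader.pinned.add("file-1")                  # the original node holds the file
monthly_fee = pinner.pin("file-1", size_gb=5)  # the user pays a provider to pin it
uploader.online = False                        # the original node goes offline
still_available = is_available("file-1", [uploader, pinner])
```

The file stays reachable after the uploader disconnects because a paid provider still pins it, and the monthly fee scales with the file size and the provider's listed price.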

RIF Storage Providers

Swarm

The primary objective of developing Swarm was to provide a storage layer for Ethereum records. Built as a storage and distribution layer for content on the Ethereum blockchain, it allows participants to pool their network and storage resources and use them as a secure network for storage and communication between nodes. From the perspective of the user, it is not very different from the present-day internet. It is a peer-to-peer storage service that acts as a censorship-resistant, DDoS-resistant layer alongside the existing internet. Trading resources in this network is seamless, as it tracks and maintains all the transaction records in a P2P way. A file uploaded to the Swarm network is very hard to delete. Even after the originating node has gone offline, the integrity of the data is maintained. Sensitive data therefore needs to be thoroughly encrypted before it is uploaded to a Swarm network.

IPFS

The InterPlanetary File System (IPFS) is a decentralized file storage and networking protocol. Using content addressing, IPFS can locate any file present on the distributed nodes. It works in a way very similar to BitTorrent. Once a file has been uploaded, it can be accessed from any node connected to the network. As the number of downloads increases, the file is replicated across more nodes. As in a torrent file-sharing system, download speed tends to increase with the number of nodes seeding the file. The file is available for download from the network as long as nodes holding it are online. Nodes are sustained in the network through a tit-for-tat system: a node is allowed to operate on the network as long as it positively impacts the efficacy of the network. Filecoin is a decentralized storage project based on the IPFS file-sharing protocol. It raised upwards of $250 million through its ICO.
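Content addressing can be illustrated with a short sketch. It is a simplification: real IPFS content identifiers (CIDs) use multihash encoding and split files into chunks, whereas here a plain SHA-256 hex digest stands in for the identifier, and the `Peer` class and `fetch` function are hypothetical.

```python
import hashlib

def content_id(data: bytes) -> str:
    """Derive the address of a file from its content, not its location."""
    return hashlib.sha256(data).hexdigest()

class Peer:
    """A node holding some blocks, keyed by their content identifier."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.online = True
        self.blocks = {}

    def add(self, data: bytes) -> str:
        cid = content_id(data)
        self.blocks[cid] = data
        return cid

def fetch(cid: str, requester: Peer, peers: list):
    """Ask online peers for the content, verify it, and keep a local copy."""
    for peer in peers:
        data = peer.blocks.get(cid) if peer.online else None
        # The hash check means a peer cannot pass off altered data.
        if data is not None and content_id(data) == cid:
            requester.blocks[cid] = data  # the downloader now seeds the file too
            return data
    return None

seeder = Peer("seeder")
downloader = Peer("downloader")
latecomer = Peer("latecomer")

cid = seeder.add(b"hello decentralized world")
first_copy = fetch(cid, downloader, [seeder])
seeder.online = False                                       # original node leaves
second_copy = fetch(cid, latecomer, [seeder, downloader])   # served by the replica
```

Because the identifier is the hash of the content, any node holding a valid copy can serve it, and each download adds one more replica to the network.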

Summary

Ensuring data integrity is becoming a very important issue in the post-Covid-19 age of 2021. Centralized data storage solutions are economically profitable for the server host, but fully securing the data stored in them is very difficult. As repeated breaches have shown, attackers can break into even well-secured servers. Storing data in a decentralized network of storage spaces is a promising solution to this problem, and RIF is leading the way.