Dynamic Data Sharding

DataTie employs a dynamic data sharding approach to achieve decentralized storage for large dynamic datasets, as described in the whitepaper. This approach enables the storage of key-value (KV) pairs while providing nodes with the flexibility to choose which data to download and store, ensuring scalability and adaptability to changing storage needs.

In the DataTie network, the storage problem is divided into three sub-problems: Proof of Publication, Proof of Storage, and Proof of Retrievability. Proof of Publication ensures that the data is initially shared on the network, so that nodes can either download and store the data they are interested in or ignore it. Proof of Storage shows that the data remains stored somewhere in the network, guarding against data loss. Proof of Retrievability ensures that anyone can retrieve the data, even when malicious nodes withhold it.

To address the decentralized storage challenge for large dynamic datasets, DataTie implements dynamic data sharding. This involves partitioning the values of KV pairs into multiple fixed-size shards. Each node may host zero or multiple shards and claim rewards for each shard by providing proof of storage periodically. As the number of KV entries increases, new shards are dynamically created to accommodate the growing dataset.
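The mapping from KV entries to shards can be illustrated with a small sketch. It assumes that each KV entry receives a sequential index and that every shard holds a fixed number of entries; the names `ENTRIES_PER_SHARD`, `shard_of`, and `shard_count` are hypothetical and are not taken from the DataTie whitepaper.

```python
# Hypothetical shard capacity, kept tiny for illustration only.
ENTRIES_PER_SHARD = 4


def shard_of(kv_index: int, entries_per_shard: int = ENTRIES_PER_SHARD) -> int:
    """Return the shard that holds the KV entry with the given sequential index."""
    if kv_index < 0:
        raise ValueError("kv_index must be non-negative")
    return kv_index // entries_per_shard


def shard_count(total_entries: int, entries_per_shard: int = ENTRIES_PER_SHARD) -> int:
    """Return how many shards are needed to hold total_entries KV entries."""
    if total_entries < 0:
        raise ValueError("total_entries must be non-negative")
    # Ceiling division: a partially filled shard still counts as a shard.
    return -(-total_entries // entries_per_shard)


if __name__ == "__main__":
    for n in (0, 4, 5, 9):
        print(f"{n} entries -> {shard_count(n)} shard(s)")
```

Under these assumptions, a new shard comes into existence as soon as the entry count passes a multiple of the shard capacity, which is one simple way to model shards being created as the dataset grows.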

The key to achieving decentralized storage on large dynamic datasets lies in building an on-chain oracle that estimates the number of replicas for each shard and rewarding nodes that prove the replication of the shard over time. Upon node launch, the operator can choose which shards to host based on the expected rewards, typically favoring shards with a lower number of replicas. This approach ensures the efficient distribution of data across the network while incentivizing replication for redundancy.
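A node operator's shard choice can be sketched as a simple ranking. The sketch assumes, purely for illustration, that a fixed reward per shard is split equally among all replicas, so joining a shard that the oracle estimates at `r` replicas yields `reward / (r + 1)`. The reward model and the names `expected_reward` and `choose_shards` are hypothetical; the actual DataTie reward rule may differ.

```python
def expected_reward(replicas: int, reward_per_shard: float = 1.0) -> float:
    """Reward share for one more replica, assuming an equal split (an assumption)."""
    return reward_per_shard / (replicas + 1)


def choose_shards(replica_estimates: dict[int, int], capacity: int) -> list[int]:
    """Pick up to `capacity` shard IDs with the highest expected reward.

    replica_estimates maps shard ID -> replica count estimated by the oracle.
    Ties are broken by the lower shard ID so the result is deterministic.
    """
    ranked = sorted(
        replica_estimates,
        key=lambda shard: (-expected_reward(replica_estimates[shard]), shard),
    )
    return ranked[:capacity]


if __name__ == "__main__":
    estimates = {0: 5, 1: 2, 2: 9, 3: 2}  # made-up oracle output
    print(choose_shards(estimates, capacity=2))
```

With any reward rule that decreases as the replica count grows, this ranking steers new nodes toward under-replicated shards, which matches the incentive described above.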

With dynamic data sharding, DataTie offers a scalable and flexible approach to decentralized storage of large dynamic datasets. Because each node stores only the shards it chooses rather than the full dataset, storing large values costs less than in fully replicated storage models, and the network's total capacity can grow to hold substantial amounts of data. Developers can build on programmable key-value storage backed by this decentralized storage layer, supporting long-lived decentralized applications in domains such as gaming, social networks, and AI.

DataTie's dynamic data sharding approach enables decentralized storage for large dynamic datasets by partitioning data into shards and incentivizing replication. This approach ensures flexibility and scalability, allowing nodes to choose which data to store while maintaining data integrity and availability.
