Merkle trees
Last updated
Last updated
Repairing stale data for nodes that were down and now are back online but the nodes are far behind and a read repair would take too long.
Naively splitting up the entire range to calculate checksums for comparison, is not very feasible.
We can use merkle trees = binary tree of hashes, where each internal node is the hash of its two children, and each leaf node is a hash of a portion of the original data
We can use merkle trees by:
comparing the root hashes of both trees.
if they are equal, stop.
recursively checking the left and right children.
This way the amount of data exchanged is minimized. The principal advantage of a Merkle tree is that each branch of the tree can be checked independently without requiring nodes to download the entire tree or the entire data set.