Segmented Log
Motivation
A single log can become difficult to manage:
it can become a performance bottleneck, especially when it is read at the startup
older logs need to be cleaned up periodically or, in some cases, merged
doing these operations on a single large file is difficult to implement.
Solution
Break down the log into smaller segments for easier management, such that the log data is divided into equal-sized log segments. The system can roll the log based on a rolling policy (by time e.g. every 4 hours or size every 1GB)
Applications
Cassandra uses the segmented log strategy to split its commit log into multiple smaller files instead of a single large file for easier operations. As we know, when a node receives a write operation, it immediately writes the data to a commit log. As the Commit Log grows in size and reaches its threshold in size, a new commit log is created. Commit log segments are truncated when Cassandra has flushed corresponding data to SSTables. A commit log segment can be archived, deleted, or recycled once all its data has been flushed to SSTables.
Kafka uses log segmentation to implement storage for its partitions.
Last updated
Was this helpful?