Logging
- what: recording events as they occur.
- why: the log is needed to restore state after a failure.
- how: combine with checkpoints. Periodically checkpoint to save the complete state, and keep an incremental log of the updates made since that state.
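- synchronous logging: each entry is forced to stable storage before continuing, so nothing is missing at recovery, but it costs a lot of time.
- asynchronous logging: entries are buffered in memory, which reduces overhead but leaves unflushed entries vulnerable to loss.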
- GDV: global dependency vector
- an interval counts how many messages a process has received since the beginning
- if someone else's vector holds a bigger value for a process than that process's own vector does, the state is inconsistent
- problem: playing back a long log is slow; we can prune the log to reduce playback time
- playback may resend messages, which can have negative effects on the system. Use incarnation numbers so that recipients can recognize and discard such messages.
- expensive in time
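
The checkpoint-plus-log recovery described in this section can be sketched roughly as follows. This is a minimal in-memory sketch under stated assumptions, not a definitive implementation: the class `LoggedStore`, its method names, and the `sync` flag are hypothetical, and plain Python containers stand in for stable storage.

```python
import copy


class LoggedStore:
    """Toy key-value store that recovers from a checkpoint plus a log.

    Hypothetical sketch: lists and dicts stand in for stable storage.
    """

    def __init__(self):
        self.state = {}        # live state, lost on a crash
        self.checkpoint = {}   # last complete saved state ("stable")
        self.stable_log = []   # log entries that reached stable storage
        self.buffer = []       # log entries still buffered in memory

    def write(self, key, value, sync=True):
        self.state[key] = value
        self.buffer.append((key, value))
        if sync:
            self.flush()       # synchronous: pay the flush cost on every write

    def flush(self):
        self.stable_log.extend(self.buffer)
        self.buffer.clear()

    def take_checkpoint(self):
        self.checkpoint = copy.deepcopy(self.state)
        # Prune: every logged update so far is covered by the checkpoint.
        self.stable_log.clear()
        self.buffer.clear()

    def crash_and_recover(self):
        self.buffer.clear()    # entries that were only in memory are lost
        self.state = copy.deepcopy(self.checkpoint)
        for key, value in self.stable_log:   # play the log back in order
            self.state[key] = value


store = LoggedStore()
store.write("a", 1)
store.take_checkpoint()
store.write("b", 2)               # flushed immediately
store.write("c", 3, sync=False)   # buffered only
store.crash_and_recover()
print(store.state)                # {'a': 1, 'b': 2}
```

In this sketch the asynchronous write of `c` never reached the stable log, so it is gone after the crash, while the synchronous write of `b` survives playback. Taking a checkpoint also empties the log, which is what keeps playback short.
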
Checkpoint
- what: used to restore state.
- why: to get a globally consistent copy of the state.
- how: set a checkpoint, freeze the system, and allow no writes so that the state stays consistent. Inbound messages are not frozen out; they are written into a buffer.
- uncoordinated checkpoints:
- each process periodically records its own state
- a recovery line is needed to recover
- coordinated checkpoints:
- record message sequences: know who has sent us messages since the last checkpoint
- synchronized clocks: add a timestamp to every message, which also acts as a sequence number
- uncoordinated checkpoint:
- problem: freeze
- incarnation number: incremented by 1 on every restart; a recipient uses it to decide whether to accept a message.
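
The incarnation-number idea above can be sketched as follows. This is only an illustration under assumptions: the `Process` and `Receiver` classes are hypothetical, and the acceptance rule used here (reject anything stamped with an older incarnation than the newest one seen from that sender) is one plausible policy, not necessarily the one the notes have in mind.

```python
class Process:
    """Sender that stamps each message with its incarnation number."""

    def __init__(self, name):
        self.name = name
        self.incarnation = 0

    def restart(self):
        self.incarnation += 1          # +1 on every restart

    def send(self, payload):
        return (self.name, self.incarnation, payload)


class Receiver:
    """Remembers the newest incarnation number seen from each sender."""

    def __init__(self):
        self.latest = {}               # sender name -> newest incarnation seen

    def accept(self, message):
        sender, incarnation, _payload = message
        if incarnation < self.latest.get(sender, 0):
            return False               # stamped by a pre-restart incarnation
        self.latest[sender] = incarnation
        return True


receiver = Receiver()
p = Process("p1")
old = p.send("debit 10")               # sent before the crash
p.restart()
new = p.send("debit 10")               # resent after recovery
print(receiver.accept(new))            # True
print(receiver.accept(old))            # False
```

Once the receiver has seen a message from the restarted process, a late copy from the earlier incarnation is turned away instead of being applied a second time.
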