Paper Note: Epoxy: ACID Transactions Across Diverse Data Stores

Summary In one sentence: Re-implement the multi-version concurrency control mechanism of Postgres on shim layers. Since I presented this paper at a group meeting, I will just paste the slides here. Content

November 22, 2023 · 1 min · KKKZOZ

Paper Note: Zab: High-performance broadcast for primary-backup systems

FAQ What is the difference between receive and deliver? What does the paper mean by “Zab’s transaction log doubles as the database write-ahead transaction log” on page 3? ZooKeeper uses an in-memory database and stores transaction logs (a write-ahead log) and periodic snapshots on disk. Before a transaction is executed and its changes are applied to the in-memory database, it is first logged. This means that if the system crashes before the changes can be applied, the transaction can be replayed from the log to ensure data integrity. ...
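The log-before-apply rule in this excerpt can be illustrated with a minimal Go sketch. Everything here is hypothetical: `Txn`, `Store`, `Apply`, and `Recover` are illustration-only names, and an in-memory slice stands in for the on-disk log. It is not ZooKeeper's actual implementation.

```go
package main

import "fmt"

// Txn is a hypothetical state change; Zxid mimics a transaction id.
type Txn struct {
	Zxid  uint64
	Key   string
	Value string
}

// Store pairs an in-memory database with a transaction log.
// Assumption: the slice stands in for a durable on-disk log.
type Store struct {
	log  []Txn
	data map[string]string
}

func NewStore() *Store {
	return &Store{data: make(map[string]string)}
}

// Apply logs the transaction first, then mutates the in-memory state.
func (s *Store) Apply(t Txn) {
	s.log = append(s.log, t)
	s.data[t.Key] = t.Value
}

// Recover rebuilds a store by replaying a surviving log in order.
func Recover(log []Txn) *Store {
	s := NewStore()
	for _, t := range log {
		s.Apply(t)
	}
	return s
}

func main() {
	s := NewStore()
	s.Apply(Txn{Zxid: 1, Key: "/config", Value: "v1"})
	s.Apply(Txn{Zxid: 2, Key: "/config", Value: "v2"})

	// Simulate a crash: the in-memory map is lost, the log survives.
	recovered := Recover(s.log)
	fmt.Println(recovered.data["/config"])
}
```

Replaying in log order matters here: the second write to `/config` must win, which is the same order the log recorded.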

November 15, 2023 · 4 min · KKKZOZ

DDIA: Chapter 9 Consistency and Consensus

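This is the most exhilarating chapter of the book: it covers every aspect of consistency and consensus, dense with material yet well organized. The first part covers Linearizability, laying the groundwork for what follows. The “Ordering Guarantees” section starts from the “Happened Before” relation induced by causality, moves on to sequence numbers and Lamport Timestamps, and points out a weakness of Lamport Timestamps: at the moment an event occurs, you cannot tell whether it conflicts with another one. That leads to total order broadcast, where the equivalence with linearizable storage is discussed, and finally to consensus algorithms. Brilliant, and worth rereading! Consistency Guarantees Most replicated databases provide at least eventual consistency, which means that if you stop writing to the database and wait for some unspecified length of time, then eventually all read requests will return the same value. A better name for eventual consistency may be convergence, as we expect all replicas to eventually converge to the same value. ...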

November 13, 2023 · 28 min · KKKZOZ

Paper Note: CAP Twelve Years Later: How the "Rules" have Changed

FAQ What is a version vector? A version vector is a construct used in distributed systems to track the version of data across different nodes in a network, ensuring consistency and helping to resolve conflicts. Version vectors are particularly useful in systems where multiple nodes may independently modify data and then need to synchronize with each other without relying on a central authority. This concept is fundamental in the context of eventual consistency and conflict resolution in distributed databases, file systems, and data replication scenarios. ...
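A minimal Go sketch of how two version vectors might be compared and merged. The names `VersionVector`, `Compare`, and `Merge` are hypothetical, and the representation (a map from node ID to update counter) is one common choice rather than anything prescribed by the post.

```go
package main

import "fmt"

// VersionVector maps a node ID to the number of updates seen from that node.
// A missing node is treated as counter 0.
type VersionVector map[string]uint64

type Ordering int

const (
	Equal Ordering = iota
	Before
	After
	Concurrent
)

// Compare reports whether a happened before b, after b, equals b,
// or is concurrent with b (neither dominates, so there is a conflict).
func Compare(a, b VersionVector) Ordering {
	aLess, bLess := false, false
	for node, av := range a {
		bv := b[node]
		if av < bv {
			aLess = true
		}
		if av > bv {
			bLess = true
		}
	}
	// Nodes that appear only in b count as 0 in a.
	for node, bv := range b {
		if _, ok := a[node]; !ok && bv > 0 {
			aLess = true
		}
	}
	switch {
	case aLess && bLess:
		return Concurrent
	case aLess:
		return Before
	case bLess:
		return After
	default:
		return Equal
	}
}

// Merge returns the element-wise maximum, used after replicas synchronize.
func Merge(a, b VersionVector) VersionVector {
	out := make(VersionVector, len(a)+len(b))
	for node, v := range a {
		out[node] = v
	}
	for node, v := range b {
		if v > out[node] {
			out[node] = v
		}
	}
	return out
}

func main() {
	x := VersionVector{"n1": 2, "n2": 1}
	y := VersionVector{"n1": 1, "n2": 2}
	fmt.Println(Compare(x, y) == Concurrent)
	fmt.Println(Merge(x, y))
}
```

When `Compare` returns `Concurrent`, the two replicas modified the data independently; the vector only detects that situation, and resolving it is left to the application.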

November 12, 2023 · 8 min · KKKZOZ

Research: An Analysis of Anomalous Cases in 6.824 Lab2B

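I wrote this post because a few errors kept showing up when I tested 6.824 Lab2B. I filed an issue but never got a convincing answer; I had a hunch but had not verified it, so I let the matter rest. Recently another student working on 6.824 emailed me to say he had run into the same problem. After re-analyzing it I meant to send a short reply, but the reply kept growing, so I turned it into a post for everyone's reference. The anomaly When I tested Lab2B, a few simple test cases kept failing, as if the code could still fail even when it was correct. After analysis I found that the failures all follow the same pattern: The Leader receives a request from the upper layer and, before it has committed that log entry, or when only the Leader itself has committed it (the entry has already been received by a majority), it reverts to Follower because it received a RequestVote RPC from another peer. After a new election it becomes Leader again, but since a Leader in a new term cannot directly commit log entries from an old term, the earlier entry cannot be committed. No new request arrives, so the entry stays uncommitted; the test times out after 2 seconds and fails. The error message is always one(xxx) failed to reach agreement. Why it happens In the first few Lab2B tests, to keep the tests simple, the test author submits commands with cfg.one(cmd, servers, false). The third parameter of this function is named retry and controls whether a request is resubmitted after a timeout. Here retry is set to false, which means rf.Start() is called only once during the whole run; if the anomaly described above occurs, the test gets stuck and eventually reports a timeout. In other words, the so-called anomaly is the result of unlucky timing combined with a "flaw" in the first few Lab2B tests. The no-op mechanism The Raft protocol itself does not have this problem: page 13 of the paper explains that a node appends a no-op log entry after it is elected Leader, so the new Leader can commit the no-op together with the uncommitted entries before it and will not get stuck. ...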
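The stuck-entry pattern and the no-op fix can be sketched in Go under simplifying assumptions. The `Raft` struct below is a hypothetical, stripped-down model rather than the lab skeleton: `matchIndex` includes the leader itself, `log[0]` is a sentinel, and there are no RPCs, locks, or persistence.

```go
package main

import "fmt"

// LogEntry is a simplified Raft log entry; a nil Command marks a no-op.
type LogEntry struct {
	Term    int
	Command interface{}
}

// Raft is a hypothetical, minimal leader-side model for illustration.
type Raft struct {
	currentTerm int
	log         []LogEntry // log[0] is a sentinel entry
	commitIndex int
	matchIndex  []int // highest replicated index per server, leader included
}

// appendNoOp is what a newly elected leader would do under the no-op scheme.
func (rf *Raft) appendNoOp() {
	rf.log = append(rf.log, LogEntry{Term: rf.currentTerm, Command: nil})
}

// advanceCommit applies the commit rule: index n may be committed only if
// a majority stores it AND log[n] belongs to the leader's current term.
func (rf *Raft) advanceCommit() {
	for n := len(rf.log) - 1; n > rf.commitIndex; n-- {
		if rf.log[n].Term != rf.currentTerm {
			break // terms never increase toward lower indices
		}
		count := 0
		for _, m := range rf.matchIndex {
			if m >= n {
				count++
			}
		}
		if count*2 > len(rf.matchIndex) {
			rf.commitIndex = n // earlier entries are committed implicitly
			return
		}
	}
}

func main() {
	// Re-elected leader in term 3; index 1 holds a term-1 entry that a
	// majority already stores, but no term-3 entry exists yet.
	rf := &Raft{
		currentTerm: 3,
		log:         []LogEntry{{Term: 0}, {Term: 1, Command: "x"}},
		matchIndex:  []int{1, 1, 0},
	}
	rf.advanceCommit()
	fmt.Println("without no-op:", rf.commitIndex)

	rf.appendNoOp()                // index 2, term 3
	rf.matchIndex = []int{2, 2, 1} // assume a majority replicated it
	rf.advanceCommit()
	fmt.Println("with no-op:", rf.commitIndex)
}
```

With only the old-term entry in the log, `advanceCommit` makes no progress no matter how widely that entry is replicated, which matches the timeout described above. Once a current-term entry reaches a majority, the commit index jumps past both entries at once.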

November 8, 2023 · 5 min · KKKZOZ