Paper Note

Data Management in Microservices: State of the Practice, Challenges, and Research Directions

FAQ What is polyglot persistence principle? Polyglot Persistence is a term coined by Martin Fowler and Pramod Sadalage to describe the concept of using different data storage technologies to handle varying data storage needs within a given application. The principle behind polyglot persistence is that no single database can serve all needs of a modern application. Different kinds of data are best dealt with different types of databases. For example, you might use a relational database such as MySQL for transaction data, a NoSQL document store like MongoDB for semi-structured data, and a graph database like Neo4j for relationships between entities. Instead of trying to make one data storage technology work for all types of data, you use the type of database that is best suited for each particular need. This approach can result in a system that’s more scalable, performant, and easier to work with. Polyglot Persistence represents a shift away from the traditional, one-size-fits-all database strategy, allowing for a more flexible and optimized approach to data management. ...

Paper Note: Taking Omid to the Clouds: Fast, Scalable Transactions for Real-Time Cloud Analytics

Designing a TPS for clouds will meet these challenges: Diverse functionality. The concept of translytics as “a unified and integrated data platform that supports multi-workloads such as transactional, operational, and analytical simultaneously in realtime, … and ensures full transactional integrity and data consistency”. Scale Cloud-first data platforms are designed to scale well beyond the limits of a single application. Latency With the thrust into new interactive domains like social networks, messaging and algorithmic trading,latency becomes essential. Multi-tenancy Maintaining access rights is therefore an important design consideration for TPSs. 论文的工作集中于以下几点： ...

Paper Note: Omid, Reloaded: Scalable and Highly-Available Transaction Processing

An entirely re-designed version of Omid1. Omid’s job is to manage the transaction control plane. The transaction metadata includes: A dedicated table(Commit Tabla, CT) holds a single record per committing transaction. Per-row metadata for items accessed by transactions. Feature: A middleware oriented. Snapshot isolate should provide: Visibility check rules. Write conflicts detection. Visibility Check Rules Intuitively, with SI, a transaction’s reads all appear to occur the time when it begins($ts_r$) while its writes appear to execute when it commits($ts_c$). ...

Paper Note: Towards Transaction as a Service

Background Database systems have evolved to be with a cloud-native architecture,i.e., disaggregation of compute and storage architecture, which decouples the storage from the compute nodes, then connects the compute nodes to shared storage through a high-speed network. The principle of cloud-native architecture is decoupling. (decoupled functions to make good use of disaggregated resources) Most existing cloud-native databases couple TP either with storage layer or with execution layer. Coupling TP with Storage Layer TiDB adopts a distributed transactional key-value store TiKV as the storage layer. ...

Paper Note: Scalable Distributed Transactions across Heterogeneous Stores

FAQ What is the difference between rolling backward and rolling forward in database transactions? “Rolling backward” and “rolling forward” in the context of database transactions refer to two distinct phases of the recovery process that helps maintain the integrity and consistency of the database after a system crash or failure. These concepts are tied to the idea of transaction logs that record the changes made to the database. Below are the key differences between rolling backward and rolling forward: ...

Paper Note: GRIT: Consistent Distributed Transactions across Polyglot Microservices with Multiple Databases

FAQ What is a deterministic database? A deterministic database is a system where the outcomes of any database operations are guaranteed to be the same every time they are executed, provided that the operations are started from the same database state. This concept implies a level of reliability and predictability in the behavior of the database system. Deterministic behavior is essential in many contexts, especially in distributed databases, where operations might need to be coordinated across multiple nodes, or in any system where replication, fault tolerance, and consistency are important. If a database operation is deterministic, it means the following: ...

Paper Note: How to Read a Paper

The Three-Pass Approach Each pass accomplishes specific goals and builds upon the previous pass: The first pass gives you a general idea about the paper. The second pass lets you grasp the paper’s content, but not its details. The third pass helps you understand the paper in depth. The First Pass A quick scan to get a bird’s-eye view of the paper. This pass should take about five to ten minutes and consists of the following steps: ...

Paper Note: Ad Hoc Transactions in Web Applications: The Good, the Bad, and the Ugly

这篇文章讲道理不应该写这么长的，但讲的东西比较“好玩”，于是就多记录了一些。 FAQ What is an ad hoc transaction? It refers to database operations coordinated by application code. Specifically, developers might explicitly use locking primitives and validation procedures to implement concurrency control amid the application code to coordinate critical database operations. Why do not use database transactions instead? For flexibility or efficiency. 在一般的数据库使用场景下，伴随着数据库的隔离级别提升，性能下降十分严重，为此，应用层临时事务需要做到既利用低隔离级别的数据库防止性能下降，又要实现应用层的事务机制防止数据一致性错误等问题。 What is a false conflict in database systems? In database systems, a false conflict, also known as a phantom conflict, occurs when a transaction appears to conflict with other transactions, but in reality, it does not. This can happen in situations where transactions are using optimistic concurrency control or multi-version concurrency control mechanisms. ...