Let’s study a paper that deals with a pressing concern for application developers. How do we handle distributed transactions in internet scale applications? A key constraint: as an app developer, I want to focus on business logic and less on the infrastructure.
Helland, Pat. “Life beyond distributed transactions: an apostate’s opinion.” Queue 14, no. 5 (2016): 69-98.
Distributed transactions provide a facade of global serializability. Such techniques are often fragile and impractical. What are the alternatives?
Paper outlines a few patterns for apps that can scale almost infinitely. This is necessary since growth is imminent in large systems, both on the axis of usage and supporting data/compute infrastructure. Can we build abstractions that enable to us to reason over scale and the common issues while scaling an app?
There is an aspiration to drive consistency in our approach and understanding in building scalable systems.
- Divide the app in two layers. Lower layer is scale aware, it understands the implications of adding more machines. The higher layer is scale agnostic, it provides abstractions that leverage the elasticity in lower layers without worrying about intricacies of managing more compute or data hardware. E.g. map-reduce is a higher layer abstraction. Paper covers higher layer abstractions.
- Global serializability provided by 2PC and Paxos are not widely adopted in practical scenarios. They are often used within tightly closed group of machines in a cluster. Paper assumes that such global transactional capabilities are not available. Atomicity is strictly restricted to a single machine, or a closed cluster.
- Exactly once in order delivery of messages is not available. Message delivery is atleast once, i.e., communication is retried and can lead to duplicate or out of order messages.
- App consists of
Entities. Every entity is identified by a unique id and has associated data. Data is disjoint across entities and may be stored in a durable store.
- A transaction cannot span entities.
- Higher level abstractions are scale-agnostic, i.e., they don’t worry about the partitioning used by lower layer abstractions. Message destination is the entity key.
- Both parties in the message delivery are entities (sender and receiver). Since delivery guarantee is at-least-once, the receiver must expect duplicate messages and know how to deal with them. Thus, each entity must manage a per-sender state. Paper uses the term activity for this.
- Activities manage the negotiation between two entities to reach a loosely
coupled agreement. This requires every entity to have knowledge of the other
- E.g. consider two apps: ordering and shipping. Ordering app creates an order and creates an event with shipping-id/order-id. Shipping app subscribes to this and performs its business logic. An activity encapsulates all of these.
Entity represents a single scope of transaction. Unique identifier used by an entity may be used by the lower scale-aware layer for partitioning, thus storing multiple entities in separate machines (hence no transaction spanning entities).
The identifier for an entity is the Key to locate the related data. It must be unique through the application scope.
Upper layer (scale agnostic) cannot make any assumptions on where the entity is physically stored. This allows the lower (scale aware) layer to repartition the data store to accommodate scale. E.g. split data into partitions and place them in separate machines to reduce bottlenecks.
The understanding of the existence of the entity as a programmatic abstraction, the use of the entity-key, and the clear commitment to assuming a lack of atomicity across entities are essential to providing a scale-agnostic upper layer to the application.
While we have put a constraint that an entity is addressed by a unique Key, practically, most applications are accustomed to addressing the same domain entity e.g., Customer, in multiple ways. We use the customer id, a SSN, or an email etc. A common way to solve this is by creating alternate indexes linking the various lookup filters.
Alternate indexes may not be present in a single scope of serializability. Indexes are created on the entity data store. However, they must be partitioned to achieve the same goal of infinite scale. Application cannot assume them to be located together.
Another interesting takeway is that entity and index cannot be updated atomically. There will always be a lag, and the index will be out of sync with the entity for a time window.
Keeping data in sync happens through workflow-style updates via async messaging. These are responsibilities we have taken for granted in lower layers (data store), slowly start percolating up to the application. All for the love of scale!
Entities and Objects
Objects (in the OOP sense) can be entities if their encapsulated data is strictly disjoint from other data. Second, there is no notion of updating this disjoint data along with other data. If we take away the relationships aspect from Objects, we are left with the beloved pure data objects, some of us call them POCO or POJO depending on our allegiance.
Paper also calls out the hard truth about the Objects encapsulating database data. These (ORMs?) have ambiguous encapsulation which magically update the related data. They try to make transaction scopes to span objects. Both of these fall apart as the system scales.
Objects participating in a transaction scope are forced to be collocated. The only way available for us is to use a common entity key. Thus, we end up joining the objects to form an entity.
Any updates targeting more than one entities require coordination via messaging. By definition, messages originate from one entity and are targeted to another. A message cannot be sent in between a transaction, because the transaction at sender may result in abort, and the message at receiver could have triggered another transaction. This could leave the system in inconsistent state.
A message from sender can be processed at receiver only after the sending transaction. Paper terms a message to be asynchronous with the sending transaction. Messages cause transactions.
Sender transaction only has the target entity key, not the location.
Location of entities change over time (repartitioning) to allow the scale-aware lower layer to achieve infinite scale. Such movement of state across the machines leads to duplicate messages, e.g. sent to the older location, and newer location. There is no ordering guarantee of messages.
For these reasons, we see scale-agnostic applications are evolving to support idempotent processing of all application-visible messaging. This implies reordering in message delivery, too.
As an aside, the asynchronous relationship between the sender transaction and messages reminds us of the microservices patterns: event sourcing and event streaming.
How do we deal with message retries and reordering?
Retries and Idempotence
Let the Application handle duplicate messages with idempotence. Paper defines idempotence as lack of substantive change. Substantive change is a side effect. E.g., reads are naturally idempotent since they are side effect free. But, a vast majority of messages may have side effects. Idempotence will require state on the entity, i.e., remembering the last processing of a message and ignoring it for the subsequent delivery.
State includes the processing history and last response. Last response is returned for duplicate invocations.
Scale aware lower layers can build support for duplicate messages. Keeping state of messages along with the entity will be a constraint on the lower layer in this approach. Further, such state needs to be carried along with repartition. However, paper doesn’t expound on this further.
Every entity must capture the state (message history and response) on a per partner basis. A partner is a dependent entity. This state is called an Activity.
Activities are simple. They are what an entity remembers about other entities it works with. This includes knowledge about received messages.
E.g., consider an eCommerce scenario with domain entities like Order and Item. Each Item is an independent entity with details about inventories etc. Order also comprises of included Items. Per inventory Item data included in an Order is called Activity. Transactions can never span Order and Item.
Activities link entities together through their identifiers, or Keys. Note that physically the entities may be located differently. Local information stored in the entity about other entities is an Activity.
How does such a system work without distributed transactions?
The absence of distributed transactions means we must accept uncertainty as we attempt to come to decisions across different entities.
Application logic must manage the uncertainty using workflows. E.g., domain contracts, timeout/retry policies etc. are some existing mechanisms for the business domain to deal with unknowns.
Moreover, such uncertainty must be tracked in the activity i.e. relationship between two entities, using existing and new states for the partner (entity) as an hint. Basically an entity accepts unexpected states. E.g., the ordering system may reserve items from inventory; the inventory doesn’t really know if the reserved items may be ever used. Managing uncertainty here is being aware that a divergence can happen, the order could get cancelled. Paper calls these as tentative operations. These are a negotiation between entities to converge.
Tentative Operations have two final states: cancelled or confirmed. When the originating entity has the right to cancel, the operation is called a cancelling operation. Similarly, when the right to cancel is given up, the operation is called a confirming operation.
Every transaction is a consensus problem with a decision to commit or otherwise. How does the entities and activities help reach an agreement?
Through a set of two-party agreements. Each of these agreements compose together to get large set of parties to agree without transactions. Paper cites the housing deals as an example, where two parties enter independent agreements with an escrow which faciliates an agreement between the two parties.
These two party agreements are as simple as managing states of partners in an entity (through activities).
2PC and other distributed transaction primitives impact the availability. Management of uncertainty is a valuable alternative where the developer owns the logic in scale agnostic application layer.
Historically, the evolution of patterns is always preceded by custom solutions to similar set of problems. Eventually we figure out the commonalities and build an abstraction that encapsulates the repeated aspects and lets developers focus on the business problems. This paper emerged at such a point in time where infinite scaling was a concern for most application developers, trade-offs of 2PC were not palatable, yet we didn’t have a pattern language/vocabulary to talk about the solutions. Paper presents entities and activities to express these ideas.