On the road to reliable data: What is ACID in the context of databases?

Microsoft SQL
Oracle
17/10/2024
Tomasz Chwasewicz

Atomicity, consistency, isolation and durability are the pillars that keep database transactions from descending into chaos. These properties form the basis of reliable database systems, ensuring that every operation follows the rules. Thanks to them, failures can be handled with precision, with no transaction left incomplete or in a corrupted state. If it weren't for ACID, the data world would be a mess of half-done actions, inconsistent records, and data loss that could bring even the simplest system to its knees.

Below, we will discuss each of the properties that make up this sour-sounding acronym, explaining their role in maintaining order in the data world. From the indivisibility guaranteed by atomicity to the permanence guaranteed by durability, these principles work together to ensure that every transaction is executed correctly, even in the face of system failures or unexpected errors. Along the way, we will also look at how ACID contrasts with more relaxed models, such as eventual consistency, where the trade-off between strict consistency and high availability shapes performance.

Atomicity or indivisibility

Let's start with the letter “A”. The term “atomicity”, as applied to database transactions, comes from the concept of the atom in physics: it implies indivisibility, and makes it easy to visualize the serious consequences that fragmenting a transaction can bring. In a DBMS (Database Management System), atomicity guarantees that all parts of a given transaction must succeed; otherwise, it will look as if none of them ever happened.

A simple financial transaction is the classic example. Let's say that PLN 100 goes from account A to account B. This transaction consists of two operations: debiting account A and crediting account B with the same amount. Atomicity guarantees that both of these actions will be executed completely or not at all. If, say, dark clouds gather over the world of technology and the credit to account B fails after account A has already been debited, atomicity will undo the entire operation. The PLN 100 is returned to account A as if nothing had ever happened: balance restored, financial disaster averted.

How is atomicity achieved?

  • Transaction logs record every change a transaction makes to the database before the changes are actually applied. If a transaction fails, the system uses these logs to undo everything, keeping the database free of any effects of half-finished transactions.
  • The Two-Phase Commit protocol (2PC) consists of two phases. In the “prepare phase”, each node involved in the transaction votes to commit or abort it. If every node gives a thumbs up, they move into the “commit phase”, where changes are applied locally on each node and their successful completion is confirmed. This method ensures that every part of the transaction is either fully committed or rolled back, without exception.
  • Locking mechanisms prevent other transactions from accessing the data affected by the current transaction. By locking data for exclusive access during transaction execution, the system ensures that no other transaction can interfere, thus maintaining atomicity. Locks are typically held until the transaction is committed or rolled back, ensuring data consistency.
  • Write-ahead logging is a method in which changes are first recorded in the log, before they reach the database itself. This ensures that the database can always return to a consistent state after a failure: the logs can be used to redo the transaction's operations from the beginning or to undo them completely.
  • Shadow paging, unlike logging, involves maintaining an additional copy of the database pages modified during a transaction. This copy, the “shadow page”, is updated as the transaction progresses. Only after the transaction commits successfully do the shadow pages replace the original pages in the database. If the transaction fails, the system discards these pages, ensuring that the database does not reflect any partial updates.
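To make the all-or-nothing behaviour concrete, here is a minimal sketch in Python using the standard-library sqlite3 module (the table, account names and amounts are invented for illustration). The `with conn:` block opens a transaction that commits on normal exit and rolls back on any exception, which is exactly the atomicity guarantee described above:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("A", 100), ("B", 0)])
conn.commit()

def transfer(conn, src, dst, amount, fail_midway=False):
    """Debit src and credit dst inside one transaction."""
    with conn:  # commits on success, rolls back on exception
        conn.execute("UPDATE accounts SET balance = balance - ? WHERE name = ?",
                     (amount, src))
        if fail_midway:
            raise RuntimeError("crash between debit and credit")
        conn.execute("UPDATE accounts SET balance = balance + ? WHERE name = ?",
                     (amount, dst))

try:
    transfer(conn, "A", "B", 100, fail_midway=True)
except RuntimeError:
    pass  # the simulated failure triggered an automatic rollback

balances = dict(conn.execute("SELECT name, balance FROM accounts"))
print(balances)  # {'A': 100, 'B': 0}  (the debit was rolled back)
```

If the `fail_midway` flag is dropped, both updates commit together and the PLN 100 lands on account B; with the simulated crash, neither update survives.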

Consistency

Consistency in database systems ensures that every transaction reliably follows the rules and constraints of the database, maintaining a stable and error-free environment. When a transaction starts, it is responsible for ensuring that every byte and bit it touches complies with the established standards; in this way it “protects” the correctness and reliability of the database. If the transaction dares to violate any of the rules, it is immediately branded the “culprit”, and the system quickly returns to the calm state it was in before the incident.

For example: a simple bank transaction in which funds are transferred from one account to another. A consistent transaction would mean that the total balance before and after the transaction remains the same. The transaction must subtract the funds here, and add them there, but the total is identical before and after the transaction. This reflects one of the simplest forms of integrity constraints — balance preservation.

How is consistency achieved?

  • Constraints are rules defined in a database schema that specify allowed values for specific fields. They may include:

      ◦ uniqueness constraints (UNIQUE) to prevent duplicates;

      ◦ conditional constraints (CHECK) to validate data against a specified condition;

      ◦ foreign keys that ensure relationships between tables are maintained.

The database management system (DBMS) automatically checks these constraints during the transaction. If a transaction violates any of them, the entire transaction is rolled back to its pre-start state.

  • Triggers are automated procedures that perform operations in response to specific changes in a database, such as updates or insertions. In this way, triggers enforce business rules, supporting consistency. A trigger can, for example, automatically update inventory levels when a sale is recorded.
  • Stored procedures combine complex operations into single, executable scripts, reducing the risk of manual errors and maintaining data consistency.
  • Isolation levels determine the extent to which a transaction is isolated from changes made by other transactions. A DBMS offers different isolation levels that let you choose between performance and consistency.
  • Recovery mechanisms use transaction logs and backups to ensure that the database is restored to a consistent state after a failure. These logs record the state of the database before and after a transaction is processed. In the event of a system failure, these records are used to restore the database to the last known consistent state by undoing or re-executing transactions.
  • Multiversion concurrency control (MVCC), used in some advanced databases, stores multiple versions of data. This allows read operations to access the version of the data that was consistent at the beginning of the transaction, which avoids locking data for reads and improves performance without sacrificing consistency.
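A minimal illustration of constraint checking, again using Python's built-in sqlite3 module (the schema and values are hypothetical): a CHECK constraint forbids negative balances, so an update that would violate it is rejected and the stored data stays valid:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE accounts (
        id      INTEGER PRIMARY KEY,
        balance INTEGER NOT NULL CHECK (balance >= 0)  -- no negative balances
    )""")
conn.execute("INSERT INTO accounts VALUES (1, 50)")
conn.commit()

try:
    # This would drive the balance to -50, violating the CHECK constraint.
    conn.execute("UPDATE accounts SET balance = balance - 100 WHERE id = 1")
except sqlite3.IntegrityError as e:
    print("rejected:", e)

balance = conn.execute("SELECT balance FROM accounts WHERE id = 1").fetchone()[0]
print(balance)  # 50
```

The DBMS does the policing: application code never gets a chance to persist an invalid state, which is precisely the point of declaring rules in the schema rather than re-checking them in every program.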

Isolation

The main role of isolation in database transactions boils down to answering the question: when do changes made by one transaction become visible to others? Isolation ensures that transactions do not disclose their changes to each other prematurely. Otherwise, there could be various problems related to concurrency, such as:

  • Dirty reads: occur when a transaction reads data modified by another transaction that has not yet committed. This can lead to errors if the first transaction makes decisions based on uncommitted changes that may later be rolled back.
  • Non-repeatable reads: a transaction reads a value, returns for a re-read, and gets a different result because another transaction modified the data in the meantime.
  • Phantom reads: a transaction counts rows that meet certain criteria, only to discover on the next check that the number has changed because another transaction added or removed rows.
  • Serialization anomalies: when transactions are executed without a strict order, the result is chaos, like a messy battle in which nothing ends up where it should.

The isolation levels in the database manage the degree of interaction between transactions. A higher level of isolation means that transactions operate more “in a vacuum,” protected from the activity of others, which can slow down operations because the database has to put in more effort to maintain their separability.

The final choice of isolation level depends on the specific requirements of the application, including the acceptable risk of data anomalies relative to the need for simultaneous data access.

How is isolation achieved?

  • Locking mechanisms, as with atomicity, restrict access to the data used by a given transaction. Locks can be applied at different levels of granularity, from the entire database down to specific rows or even columns.
  • Timestamp ordering assigns each transaction a unique timestamp that determines its execution order. Transactions are processed based on their timestamps, which means older transactions take priority and newer ones cannot “skip the queue” and undo previously established actions.
  • Snapshot isolation uses the MVCC concept, but ensures that all read operations in a transaction see the data as it looked at the beginning of the transaction. This isolation level supports read consistency without the need for extensive locking, which improves performance.
  • Serializability processes transactions as if sequentially, one after the other, which effectively eliminates the possibility of transactions influencing each other. This is achieved by handling transactional conflicts in such a way that transactions appear to be processed one at a time, in order.

Durability

Durability is about ensuring that once a transaction is committed, it stays that way, permanently. It should be persistently stored and resistant to subsequent system failures, such as power outages or crashes.

The main task of persistence is to place each completed transaction in reliable, non-volatile memory, so that its recording does not disappear when the system is shut down. This durability ensures that no matter what electronic cataclysms hit the system, it will be able to “wake up”, dust off the digital dust and remember everything that happened until the last approved transaction. Like an indomitable memory that, even in the face of chaos, does not forget data and promises.

For example...

Consider a banking system in which a user transfers funds from one account to another. The transaction involves debiting one account and crediting the other. After the transaction is committed and the balances are updated, the system must ensure that these changes are permanently saved. If a power or system failure occurs immediately after the transaction completes, but before it is saved to disk, durability ensures that the effects of the transaction are not lost.

When the system recovers, it uses logs or other mechanisms to restore or complete the transaction record, ensuring that account balances reflect the transfer exactly as if the disruption had never occurred.

How to achieve durability?

  • Write-ahead logging is based on a simple but crucial principle: first write down your intentions in the log, and only then apply them to the database. If reality (i.e. the system) suddenly “rebels” and crashes, this log allows you to recreate the last consistent state of the database before the chaos.
  • Redo logs are used to re-apply operations and restore the database to its current state, while undo logs can roll changes back to the previous state if a transaction requires it. These logs are crucial to recovery processes, ensuring that no committed transaction is lost, since the database can always return to a consistent state after a failure.
  • Checkpoints are created periodically as snapshots of the database at a specific point in time. They record which data has been saved and which pending changes are still in the logs. After a system restart, the database uses the latest checkpoint and the logs to quickly restore itself to the last consistent state.
  • Replication consists of maintaining copies of data on multiple machines or storage systems. This redundancy ensures that if one part of the system fails, other parts can continue to operate without losing data.
  • Protocols such as two-phase commit (for distributed databases) ensure that all participating nodes either commit or roll back changes together.
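As a small illustration of the knobs involved, SQLite exposes both write-ahead logging and the disk-sync policy through pragmas. This sketch (the file name is invented) enables them and shows that a committed row survives closing and reopening the database, much as it would survive a restart after a crash:

```python
import os
import sqlite3
import tempfile

path = os.path.join(tempfile.mkdtemp(), "durable.db")
conn = sqlite3.connect(path)

# Write-ahead logging: changes hit the log before the main database file.
mode = conn.execute("PRAGMA journal_mode=WAL").fetchone()[0]
# FULL syncs the log to disk at every commit, trading speed for durability.
conn.execute("PRAGMA synchronous=FULL")

conn.execute("CREATE TABLE events (msg TEXT)")
conn.execute("INSERT INTO events VALUES ('committed and synced')")
conn.commit()
conn.close()

# A fresh connection (e.g. after a restart) recovers the data via the WAL.
conn2 = sqlite3.connect(path)
rows = conn2.execute("SELECT msg FROM events").fetchall()
conn2.close()
print(mode, rows)
```

The same trade-off appears in every serious DBMS: relaxing the sync policy speeds up commits, at the cost of possibly losing the most recent transactions in a power failure.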

The benefits of ACID properties

The properties that make up this acronym together ensure the reliable operation of databases and the security and efficiency of transactions, bringing many benefits.

Reliability and Integrity

ACID properties ensure that transactions are processed reliably, maintaining the integrity of the database even in the face of chaos — whether during a system failure or in the event of a sudden power outage. This unwavering reliability is invaluable to anyone who manages critical data such as financial data, sensitive personal information or the details of daily business operations.

Predictability of transactions

By enforcing ACID properties, developers and database administrators can predict transaction behavior much more easily. This predictability makes it easier to debug and improve application performance, as standard database behaviors and predictable transaction results are already known.

Data consistency across applications

Under the watchful eye of consistency, only the best data moves on. Each piece of data must pass rigorous testing for compliance with rules, restrictions, and triggers before it hits the database “stage”. This prevents data anomalies and integrity errors.

Increased fault resistance

The properties of durability and indivisibility increase the resistance of the system to failures. Modifications made by successful transactions are permanent, even in the event of a system failure immediately after the transaction is completed. This means that recovery processes are simpler and more reliable, which reduces the risk of data loss.

Simplified application development

By ensuring transaction integrity at the database level, ACID properties relieve developers of the obligation to implement these controls in the application logic. This simplification allows them to focus more on the business logic of the application, rather than the complexities of data consistency and recovery processes.

Disadvantages of ACID properties

While ACID properties bring many benefits to database management systems, they also come with some challenges and trade-offs, particularly in terms of performance, scalability, and system complexity.

Performance overhead

Strict enforcement of ACID properties can create a significant strain on performance. Operations such as logging, locking, and maintaining transaction logs to ensure indivisibility and durability require additional processing time and resources. In high-transaction environments, where fast data access and updates are key, it can feel like navigating the “sea of molasses.”

Scalability issues

Scaling an ACID-compliant database can be as complicated as coordinating a group project across time zones, especially for distributed database systems. The requirement for strict consistency and isolation levels can limit the ability of a database to grow efficiently across multiple nodes. As databases grow and more nodes appear, maintaining consistency and synchronization between them becomes more complex and requires more resources.

Complexity of management and design

Implementing and maintaining ACID properties requires advanced management techniques and a complex architecture. Each piece of the puzzle must fit together perfectly.

For example, the two-phase commit protocol, necessary to ensure atomicity in distributed systems, is difficult to implement and manage. In addition, the need to handle various aspects of transactions, such as rollback mechanisms and concurrency controls, increases the complexity of database design and operation.

Concurrency Limitations

Isolation helps prevent transactions from interfering with each other, but it can also cause bottlenecks. Blocking mechanisms protect data integrity, but they can slow access, causing transactions to queue like customers before opening a store.

Resource requirements

Maintaining durability and atomicity is “resource-intensive”: it can consume system resources faster than a free buffet, affecting overall database performance and increasing operating costs, especially on transaction-heavy systems.

Reduced flexibility

The rigid structure necessary to maintain ACID properties can limit applications that require flexibility and fast response times. This rigidity can sometimes hinder performance and usability in situations where fast data processing is critical.

Difficulties in distributed environments

Maintaining ACID properties in distributed environments is particularly difficult. The need for global locks and coordinated commits can increase complexity and the risk of bottlenecks, especially in geographically dispersed databases where network latency plays an important role.

Eventual Consistency

Eventual consistency is a model most commonly found in distributed computing, which aims to achieve database consistency over time rather than instantly. In this model, the database does not have to be consistent immediately after a transaction or update. Instead, the system guarantees that if no new updates are made to the data, all reads of that data will eventually return the last updated value. This approach is fundamentally different from the ACID properties.

Eventual consistency allows a higher level of availability and can improve performance in distributed networks, where data propagation between nodes takes time. This model is particularly useful where the system can tolerate some delay in data consistency between nodes. For this reason, eventual consistency is often acceptable in systems such as social media feeds, where access to the very latest data is not critical to the operation of the application.
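To make the idea tangible, here is a toy, in-memory sketch of eventual consistency (all names are invented; no real replication library is involved): a write is acknowledged by one replica immediately, other replicas serve stale data until an anti-entropy step propagates the update:

```python
class Replica:
    """Toy replica: applies updates when they arrive, not instantly."""
    def __init__(self):
        self.data = {}

class EventualStore:
    """Writes go to one replica; the others learn about them lazily."""
    def __init__(self, n):
        self.replicas = [Replica() for _ in range(n)]
        self.pending = []  # (key, value) updates not yet propagated

    def write(self, key, value):
        self.replicas[0].data[key] = value  # acknowledged immediately
        self.pending.append((key, value))   # propagation happens later

    def read(self, replica_idx, key):
        return self.replicas[replica_idx].data.get(key)

    def propagate(self):
        """Anti-entropy step: bring every replica up to date."""
        for key, value in self.pending:
            for r in self.replicas:
                r.data[key] = value
        self.pending.clear()

store = EventualStore(3)
store.write("x", 1)
stale = store.read(2, "x")   # None: replica 2 hasn't heard about x yet
store.propagate()
fresh = store.read(2, "x")   # 1: all replicas have converged
print(stale, fresh)
```

The window between `write` and `propagate` is exactly the inconsistency that eventual consistency tolerates in exchange for availability; real systems close it with gossip protocols, read repair or background synchronization.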

Differences from ACID properties

In the debate between ACID and eventual consistency, each model offers unique advantages and challenges, especially when we compare their impact on system consistency, availability, performance and complexity.

Consistency vs. Availability

ACID values its principles. It insists that every transaction must move the database from one valid state to another without violating any of the integrity rules it scrupulously adheres to. However, this attention to detail sometimes means sacrificing availability, especially when the network decides to take a break.

In the event of a network partition, ACID may lock resources to keep the data clean. It's like closing a road because one traffic light stopped working. Eventual consistency is a more flexible approach. It propagates updates between nodes at its own pace, allowing the system to stay calm and stay online, even if that means data may be temporarily inconsistent. In this approach, a certain degree of data imperfection is accepted as long as users can keep using the service.

Impact on performance

The performance implications of ACID are significant, mainly due to the overhead of the mechanisms needed for reliable transaction processing, such as locking and logging. These processes, crucial to maintaining atomicity and durability, can slow the system down, especially under frequent write operations.

In contrast, eventual consistency typically shows better performance, especially in environments with intensive write operations. Without the need to immediately enforce consistency across all nodes, systems that rely on eventual consistency reduce transaction-processing delays. As a result, they enable faster responses and a smoother user experience.

Complexity of the system and overhead

Managing an ACID-compliant system is not easy. Such systems require advanced transaction-management protocols capable of handling rollbacks and supporting robust commit procedures, such as the two-phase commit protocol. In addition, comprehensive error handling is necessary to deal with the various situations that may compromise transaction integrity. This complexity can increase the cost and resource requirements of maintaining such systems.

On the other hand, systems implementing eventual consistency tend to have a simpler transaction-management architecture. Since fewer operations require immediate synchronization, less coordination is needed, but such systems do require careful design to effectively manage and resolve the inconsistencies and data conflicts that naturally arise without rigorous consistency controls.
