What does one think of when they hear the word ACID? Chemists may think about a solution’s pH, mechanics may jump to some type of degreasing solvent, and festival-goers might think of the mind-altering substance that changes one’s perception of reality. The ACID that I am referring to is actually quite the opposite of the latter.
ACID is an acronym that stands for atomicity, consistency, isolation and durability. These four terms are a set of properties that belong to relational databases. Together, they ensure that during a transaction, the data isn’t corrupted, even during power failure, random errors, corrupt drives, or any other complication that may arise. While one type of acid may alter how things appear, this ACID does everything it can to make sure that everything in a relational database is exactly as it should be before, during, and after a transaction.
A transaction is simply the combination of all the statements necessary to update one or more databases. For instance, let’s say a movie theater has twelve tickets remaining. You sign into your account and buy two of them. In order for the transaction to be complete the theater must decrease the tickets available by two (one statement), and your account must have the tickets owned increase by two (a second statement).
Atom in Greek means indivisible. Transactions are composed of multiple statements, but a transaction is the smallest unit of data transfer in the realm of the relational database. This means that if any single statement fails, the entire transaction fails. The database remains unchanged, and the transaction needs to be reinstantiated. Only once all statements in a transaction are complete does the transaction complete. For this reason, a transaction cannot be observed as ongoing. A transaction has either not yet occurred, or it has. Atomicity is especially helpful in avoiding errors during power failure. This failsafe is important, because it is much easier to reinstantiate a transaction that failed than it is to clean up the mess left behind from a partial transaction. If I were two buy the two movie tickets, but the theater didn’t recognize them as being sold, they will overbook their theater for that showing, and two people will be left without seats. It won’t be me, because I get there really early and crush it at movie trivia, but you get the picture.
Consistency is key! We’ve all heard this our entire lives. When it comes to a database this couldn’t possibly be more true. Consistency ensures that the data being written in a transaction does not change the database from a valid state to an invalid state. An invalid state is a state that does not adhere to the predefined rules of the database. These rules often come as constraints, cascades, or triggers. Each of these consistency properties guarantees that the transaction is valid, but they do not guarantee that the transaction is correct.
Integrity constraints come in three types; entity integrity, referential integrity, and domain integrity. Entity integrity states that every table must have an ID column, which represents the entry’s primary key, and that primary keys cannot be duplicated or be null. Referential integrity is based on the concept of foreign keys. The foreign keys can only be in one of two states; either the key references another existing entry in some table or it is null. Domain integrity is basically type integrity. Domain integrity check that the value is of valid type for its column.
Cascades are rollbacks that occur when failed transaction results in multiple other transactions needing to be rolled back. Often a failed transaction leads to subsequent failed transactions, so the cascade rollbacks help avoid this from being the case.
A trigger is a procedural code that is automatically executed in response to a transaction. For instance, a new student is being added to a school’s database. A trigger is set so that once the new student is added, other tables get updated with pertinent entries. The student’s grade book is set up, their dining card is implemented, and their bursar account is built. All of these may exist in separate tables, but a failure of any one of these triggers will result in a failed transaction.
Relational databases need to allow for transactions to take place concurrently. They would be far too slow were they to happen sequentially. Imagine going to buy your movie tickets online, but having to get in line for the transaction to take place. Not just the purchase of the tickets, but to read the database to see if the how many tickets even exist! That would be absurd. Isolation allows for reading and writing to happen at the same time, but not be aware of every other transaction that is taking place. It is the property that decides when the changes made by one transaction become visible to every other transaction. This ‘concurrency control’ is handled by a database management system, such as PostgreSQL, and can have many different mechanisms put in place to guarantee isolation. However, for the sake of this article, the important takeaway is that transactions are unaware of each other.
The transaction took place. Fantastic! Your movie tickets have been purchased. But wait! The power that houses the movie theater’s servers went out! Never fear, durability is here. Durability ensures that even if the power goes out, or if the system crashes and needs to be rebooted, that your transaction will remain in tact. The tickets are still yours, because the servers committed your transaction to non-volatile memory (NVM, memory that doesn’t require power to function, such as RAM). By saving the transaction in some form of NVM, such as flash, ROM, or magnetic hard drives, the database is durable. It can withstand system crashes and power outages and your tickets are still good to go.
Each of these properties could very easily encompass a blog article on their own. I hope this introduction gives you a good outline to begin with. Happy coding!
What is a relational database?
A relational database is a type of database that stores and provides access to data points that are related to one…
Isolation Levels in the Database Engine
Transactions specify an isolation level that defines the degree to which one transaction must be isolated from resource…