DB Isolation Levels
- MySQL – REPEATABLE READ
- PostgreSQL – READ COMMITTED
- SQL Server – READ COMMITTED
- Oracle – READ COMMITTED
- Under READ UNCOMMITTED, everything is fully transparent and everybody sees everything. If one user starts a transaction and starts editing the data, all other users will see all their changes even though they weren’t committed. This is great in cases where you know that concurrency is highly unlikely to happen (e.g. you have single writer and multiple readers or whatever).
The problem would arise if user reads something that another user decides to roll back. We call this scenario a “dirty read” and depending on the context, consequences could range from “meh” to “well, we’re in deep shit now”.
- Under READ COMMITTED, database ensures that you see only the data that has been committed to DB. This is actually pretty good for most scenarios, except, and this is amusing – except if you wanted to ensure that you don’t see different data within a same transaction.
It’s a hard to grasp scenario so I’ll give you an example – long-running analytics query; or a machine learning thingy, whatever. Under READ COMMITTED, you would see some data that has been written to DB. All good. But imagine if your transaction was a long-running one (e.g. it lasts for several minutes). What could happen is that somebody modifies the data AGAIN, and if you re-read this you’d see a different value. This problem of possibly seeing DIFFERENT data under two SELECT statements in a single transaction is something that is not ideal for all scenarios. And that’s what gets solved with REPEATABLE READ.
- Under REPEATABLE READ your transaction is guaranteed to see the EXACT SAME data all the time, for as long as transaction takes. Even if data changes in the meantime, you are still guaranteed that, in a single long-running transaction, you keep seeing the exact same values.
Frankly speaking, this is probably OK for 99% of the scenarios. Those 1% of cases would be if you want to ensure that no new data is added or removed while your transaction is running (e.g. think if your query depends on the exact number of rows). For such cases, you want to go with Seralizable Level.
- Under SERIALIZABLE level, DB ensures that concurrent transactions are executed serially (i.e. one after another). Think of it as physically locking the database while you are doing stuff in it. It’s super-slow because there’s no concurrency, but it’s effective if you need to ensure data is exactly the same.