DB Transactions Are NOT Mutexes!
Key takeaway: DB transactions might be ran concurrently. Your transaction body must take concurrency into account and handle possible race conditions in statement execution.
The Issue
Not long ago, I was debugging an issue we had in our system.
The symptom was duplicate rows inserted into the DB in a flow where we expected only a single row to be written each time.
The code producing the duplicate rows was (pseudo code):
Basically, we check if the row we’re going to insert already exists (line 3) and in case it doesn’t — we insert it (line 5), and all is done within a transaction.
So how come we ended up inserting duplicate rows?
Transaction & Concurrency
DB transactions might be mistaken to be mutexes — but they are not. In fact, DB transactions have 4 levels (called isolation levels), each with different guarantees.
I won’t dive into the details of each of these isolation levels (you can read more thoroughly about PostgreSQL isolation levels here) but there are two important facts about them:
- The default isolation level allows concurrent transaction execution.
- Only the highest isolation level (called serializable) guarantees sequential transaction execution — but has a performance penalty you would like to avoid.
the meaning of that is: you have to deal with concurrency and possible race conditions even when running within a transaction.
To demonstrate what went wrong in our code — let’s draw a timeline of two concurrent transactions, T1 and T2, running:
The sequence of events is:
- T1 starts
- T2 starts
- T1 runs the select query — and gets an empty result
- T2 runs the select query — and gets an empty result
- T1 inserts a new record since it received an empty result in step 3
- T2 inserts a new record since it received an empty result in step 4
- T1 commits
- T2 commits
We ended up with two written rows even though each select/insert combination was running within a transaction.
The point is — a transaction is not a mutex (unless running in serializable isolation level). Transactions run concurrently and are subject to race conditions the same way separate queries are.
In the specific case above, the solution we chose was:
- Add a unique index constraint to the table.
- Use PostgreSQL UPSERT (INSERT … ON CONFLICT DO NOTHING RETURNING *) statement to either insert a new row or return an existing one.
We had several benefits from this move:
- First and foremost — the issues was gone since we couldn’t have written duplicate rows with the same parameters (the unique index enforced that on the DB level)
- We didn’t need the DB transaction anymore since we reduced the two queries into a single, atomic statement (UPSERT).
- Application-side code got redundant since we delegated this logic to the DB so we ended up with less complex code.
Conclusion
Don’t be fooled by the promise of default isolation level transactions. Take into account the possibility that multiple transactions run concurrently and their statements might interleave in runtime.
Make sure your code is resilient to race conditions by handling all possible scenarios of multiple concurrent transactions.