PostgreSQL: Row Locks
Importance of Row Locks
For the past couple of years, I've been building and maintaining e-commerce web applications that serve thousands of customers. Whenever data inconsistency problems appear in the database, there's a good chance the company is losing money because multiple operations collided (e.g. overselling issues).

There are many reasons why data inconsistency issues occur, and one of them is race conditions. Race conditions are hard to debug because they usually involve multiple concurrent sessions and transactions. This post covers the use of row-level locks in ActiveRecord to prevent race conditions in the read-modify-write cycle.

Setup
Our setup for this post consists of a single table called repetitions. To better appreciate locks, we'll start with the concurrency issues that can occur when reading and updating rows without locks. After that, we'll run the same concurrent reads and writes with exclusive row locks.



      ActiveRecord::Schema.define(version: 20150510072109) do
        enable_extension "plpgsql"

        create_table "repetitions", force: :cascade do |t|
          t.integer  "count"
          t.datetime "created_at", null: false
          t.datetime "updated_at", null: false
        end
      end
      

After the creation of the table above, we'll create a record with a count of 100. We'll be using this record for the examples in this post.



      repetition = Repetition.create(count: 100)
      

Simple Update Operation without Locks
When incrementing a counter field, it is common to see the following lines of code in ActiveRecord applications. When executed in a local environment, everything may appear fine to developers. Unfortunately, a race condition can occur when these lines run in a production environment, where multiple users may read and update the same row at the same time.



      repetition = Repetition.find(2)
      repetition.count += 1
      repetition.save!      
      

How? The Ruby code above translates to the SQL statements below. Take note that we're using PostgreSQL, whose default isolation level is read committed. Under read committed, a query sees only data committed before it began; it does not see changes committed by other transactions while it runs. More importantly for our example, nothing stops two transactions from both reading the row and then updating it based on what they read: even after one transaction has committed a count of 101, another concurrent transaction that read 100 earlier will happily write 101 again, losing an increment.



      SELECT  "repetitions".* FROM "repetitions" 
          WHERE "repetitions"."id" = $1 LIMIT 1  [["id", 2]]
      
      BEGIN
      UPDATE "repetitions" SET "count" = $1, "updated_at" = $2 
          WHERE "repetitions"."id" = $3  
              [["count", 101], 
               ["updated_at", "2015-05-10 12:12:59.981593"], 
               ["id", 2]]
      COMMIT
      

Let's see in the next section how multiple concurrent transactions will cause integrity problems with the read-modify-write cycle example above.
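Before wiring up threads and the database, the lost update can be sketched in a few lines of plain Ruby. This is a toy model, not ActiveRecord: two interleaved "transactions" each read the same snapshot of the counter and write back snapshot + 1.

```ruby
# A toy model of the lost update, with no database involved. Two
# "transactions" read the same value of count, then each writes back
# its own snapshot + 1; the second write silently overwrites the first.
count = 100

snapshot_a = count      # transaction A reads 100
snapshot_b = count      # transaction B reads 100 (A hasn't written yet)

count = snapshot_a + 1  # A writes 101
count = snapshot_b + 1  # B also writes 101, clobbering A's increment

puts count              # 101 -- two increments ran, but only one survived
```

The database version of this interleaving is exactly what the next section provokes with threads and sleep calls.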

Testing concurrency without locks
We'll write a simple class that spawns 5 threads, each performing the read-modify-write increment concurrently without locks. The sleep calls are placed intentionally to make the race condition easy to trigger.



      class ConcurrencyTester
        def initialize(resource:)
          @resource = resource
        end

        def process!
          threads = []

          5.times do
            threads << Thread.new do
              increment_without_locking
            end
          end

          threads.each do |thread|
            thread.join
          end
        end

        def increment_without_locking
          ActiveRecord::Base.connection_pool.with_connection do
            sleep(rand)
            # Each thread loads its own copy of the row; without a lock,
            # several threads can read the same count before any of them saves.
            resource = Repetition.find(@resource.id)
            sleep(rand)
            resource.count += 1
            sleep(rand)
            resource.save!
          end
        end
      end
      

We'll then set the value of the count to 100 and run the ConcurrencyTester class.



      repetition = Repetition.first
      repetition.count = 100
      repetition.save!

      ct = ConcurrencyTester.new(resource: repetition)
      ct.process!

      puts repetition.reload.count
      => 103 WRONG
      

After reloading, we see that the count is 103, not 105. Take note that merely wrapping the operations above in a transaction does not fix the concurrency issue. The SQL statements below were executed by the concurrent operations above. You'll see that the last UPDATE wrote 103 because its preceding SELECT read 102. The exact result varies every time we run this code because of the randomized sleep calls.



        Repetition Load (0.2ms)  SELECT  "repetitions".* FROM "repetitions" 
            WHERE "repetitions"."id" = $1 LIMIT 1  [["id", 2]]
        Repetition Load (0.3ms)  SELECT  "repetitions".* FROM "repetitions" 
            WHERE "repetitions"."id" = $1 LIMIT 1  [["id", 2]]
        Repetition Load (0.4ms)  SELECT  "repetitions".* FROM "repetitions" 
            WHERE "repetitions"."id" = $1 LIMIT 1  [["id", 2]]
        Repetition Load (0.6ms)  SELECT  "repetitions".* FROM "repetitions" 
            WHERE "repetitions"."id" = $1 LIMIT 1  [["id", 2]]
            (0.3ms)  BEGIN
            (0.3ms)  COMMIT
        Repetition Load (0.3ms)  SELECT  "repetitions".* FROM "repetitions" 
            WHERE "repetitions"."id" = $1 LIMIT 1  [["id", 2]]
            (0.4ms)  BEGIN
        SQL (0.2ms)  UPDATE "repetitions" SET "count" = $1, "updated_at" = $2 
            WHERE "repetitions"."id" = $3  [["count", 102], 
                                            ["updated_at", "2015-05-10 10:46:12.500102"], 
                                            ["id", 2]]
            (0.5ms)  COMMIT
            (0.3ms)  BEGIN
            (0.2ms)  COMMIT
            (0.3ms)  BEGIN
        SQL (0.2ms)  UPDATE "repetitions" SET "count" = $1, "updated_at" = $2 
            WHERE "repetitions"."id" = $3  [["count", 103], 
                                            ["updated_at", "2015-05-10 10:46:12.856025"], 
                                            ["id", 2]]
            (0.5ms)  COMMIT
            (0.2ms)  BEGIN
            (0.2ms)  COMMIT
      

This is where locks come in. There are several options: we can lock the entire table (usually a bad idea, since it blocks every other operation on the table) or we can explicitly lock only the affected row.

Exclusive Row Locks
The following lines of code demonstrate exclusive row locks in ActiveRecord, using the ActiveRecord::Locking::Pessimistic module. There are several ways to acquire the lock (the module also provides with_lock, which combines the transaction and the locked reload); we'll use lock.find below.



      Repetition.transaction do
        repetition = Repetition.lock.find(2)
        repetition.count += 1
        repetition.save!
      end
      

The Ruby code above translates to the SQL statements below. Take note of the FOR UPDATE clause appended to the SELECT statement. SELECT ... FOR UPDATE locks the selected row until the transaction commits or rolls back; any other transaction trying to lock the same row must wait. The update can then run without risk of a lost update.



      BEGIN

      SELECT  "repetitions".* FROM "repetitions" 
          WHERE "repetitions"."id" = $1 LIMIT 1 
          FOR UPDATE  [["id", 2]]

      UPDATE "repetitions" SET "count" = $1, "updated_at" = $2 
          WHERE "repetitions"."id" = $3  
              [["count", 101], 
               ["updated_at", "2015-05-10 11:51:41.624400"], 
               ["id", 2]]
      
      COMMIT
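
To make the blocking behavior concrete, here is a sketch of what two concurrent psql sessions would see. This is a hypothetical session, with session 2's statement shown as a comment because it runs on a separate connection.

```sql
-- Session 1: acquire the row lock explicitly.
BEGIN;
SELECT * FROM repetitions WHERE id = 2 FOR UPDATE;

-- Session 2, concurrently, would now block on:
--   SELECT * FROM repetitions WHERE id = 2 FOR UPDATE;

UPDATE repetitions SET count = count + 1 WHERE id = 2;
COMMIT;
-- Only after this COMMIT does session 2's SELECT return, seeing count = 101.
```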
      

We'll update our ConcurrencyTester class in the next section with the support for exclusive row locks.

Testing concurrency with exclusive row locks
We'll start by writing an increment method that makes use of row-level locks. We've again placed intentional sleep calls to provoke a race condition in case the locks don't work.



      class ConcurrencyTester

        ...

        def process!
          threads = []

          5.times do
            threads << Thread.new do
              increment_with_locking
            end
          end

          threads.each do |thread|
            thread.join
          end
        end

        ...

        def increment_with_locking
          ActiveRecord::Base.connection_pool.with_connection do
            sleep(rand)

            Repetition.transaction do
              # lock.find issues SELECT ... FOR UPDATE, so each thread waits
              # for the previous transaction to commit before reading.
              resource = Repetition.lock.find(@resource.id)
              sleep(rand)
              resource.count += 1
              sleep(rand)
              resource.save!
            end
          end
        end

        ...
      end
      

Next, we reset the count to 100 and run the ConcurrencyTester again. Now, no matter how many times we run this, we get 105.



      repetition = Repetition.first
      repetition.count = 100
      repetition.save!
      
      ct = ConcurrencyTester.new(resource: repetition)
      ct.process!

      puts repetition.reload.count
      => 105 CORRECT
      

You can see below the SQL statements executed by the concurrent operations done above.



            (0.2ms)  BEGIN
        Repetition Load (0.2ms)  SELECT  "repetitions".* FROM "repetitions" 
            WHERE "repetitions"."id" = $1 LIMIT 1 FOR UPDATE  [["id", 2]]
            (0.4ms)  BEGIN
            (0.3ms)  BEGIN
            (0.5ms)  BEGIN
        SQL (0.3ms)  UPDATE "repetitions" SET "count" = $1, "updated_at" = $2 
            WHERE "repetitions"."id" = $3  [["count", 101], 
                                            ["updated_at", "2015-05-10 10:53:11.457792"], 
                                            ["id", 2]]
        Repetition Load (272.2ms)  SELECT  "repetitions".* FROM "repetitions" 
            WHERE "repetitions"."id" = $1 LIMIT 1 FOR UPDATE  [["id", 2]]
            (0.5ms)  COMMIT
            (0.4ms)  BEGIN
        SQL (0.2ms)  UPDATE "repetitions" SET "count" = $1, "updated_at" = $2 
            WHERE "repetitions"."id" = $3  [["count", 102], 
                                            ["updated_at", "2015-05-10 10:53:12.394803"], 
                                            ["id", 2]]
            (0.5ms)  COMMIT
        Repetition Load (1208.9ms)  SELECT  "repetitions".* FROM "repetitions" 
            WHERE "repetitions"."id" = $1 LIMIT 1 FOR UPDATE  [["id", 2]]
        SQL (0.3ms)  UPDATE "repetitions" SET "count" = $1, "updated_at" = $2 
            WHERE "repetitions"."id" = $3  [["count", 103], 
                                            ["updated_at", "2015-05-10 10:53:12.661593"], 
                                            ["id", 2]]
            (0.5ms)  COMMIT
        Repetition Load (1233.2ms)  SELECT  "repetitions".* FROM "repetitions" 
            WHERE "repetitions"."id" = $1 LIMIT 1 FOR UPDATE  [["id", 2]]
        SQL (0.3ms)  UPDATE "repetitions" SET "count" = $1, "updated_at" = $2 
            WHERE "repetitions"."id" = $3  [["count", 104], 
                                            ["updated_at", "2015-05-10 10:53:13.308481"], 
                                            ["id", 2]]
        Repetition Load (1720.1ms)  SELECT  "repetitions".* FROM "repetitions" 
            WHERE "repetitions"."id" = $1 LIMIT 1 FOR UPDATE  [["id", 2]]
            (0.5ms)  COMMIT
        SQL (0.2ms)  UPDATE "repetitions" SET "count" = $1, "updated_at" = $2 
            WHERE "repetitions"."id" = $3  [["count", 105], 
                                            ["updated_at", "2015-05-10 10:53:14.230289"], 
                                            ["id", 2]]
           (0.4ms)  COMMIT

      

You can see from the statements above that each SELECT FOR UPDATE waits for the previous transaction to commit. Compared to the unlocked version, this one stays consistent and prevents the race condition. For more information, see the PostgreSQL documentation on explicit locking.
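As an aside, for simple counters there's an even lighter option: push the increment into a single atomic UPDATE, so the read-modify-write cycle never leaves the database. The row lock that UPDATE itself takes serializes concurrent increments:

```sql
-- No separate SELECT needed; PostgreSQL reads, increments, and writes
-- the row under the same implicit row lock.
UPDATE repetitions SET count = count + 1 WHERE id = 2;
```

ActiveRecord exposes this pattern as Repetition.update_counters(2, count: 1). Explicit FOR UPDATE locks remain the right tool when the new value depends on application logic rather than simple arithmetic.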

A working knowledge of locks and concurrency is critical when building applications that serve many concurrent users, especially because concurrency issues are so hard to debug.