
Distributed Lock with AppFabric Caching


When building distributed applications, you often need to synchronize access to shared resources. For example, you might have multiple instances of a web service running on a web farm and need exactly one of them to perform a given task. Steve Marx proposed a solution for Windows Azure that relies on leases, but unfortunately it does not work on-premises. Other caching products, such as Redis, offer this functionality out of the box, so I thought it would be useful to have it for AppFabric.

I’ve developed a DataCache extension method that works the same way as ServiceStack’s Redis Client. Here’s how to use it:

DataCache cache = factory.GetCache("MyCache");
using (cache.AcquireLock("MyLock"))
{
    // Lock acquired, perform synchronized work here
}

The AcquireLock method blocks until it acquires an exclusive lock or until the optional timeout is reached, in which case it throws a TimeoutException. Here is the implementation:

public static IDisposable AcquireLock(this DataCache cache, string key, TimeSpan? timeout = null)
{
    if (cache == null)
    {
        throw new ArgumentNullException("cache");
    }
    if (key == null)
    {
        throw new ArgumentNullException("key");
    }

    return new DataCacheLock(cache, key, timeout);
}
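
To illustrate the optional timeout, here is a minimal sketch of a caller that bounds its wait and treats a timeout as "someone else is doing the work". The TryDoWork helper, the "MyLock" key and the 30-second value are hypothetical and not part of the original extension; the sketch also assumes the AcquireLock extension method is in scope.

```csharp
using System;
using Microsoft.ApplicationServer.Caching;

public static class LockUsageExample
{
    // Hypothetical helper: runs 'work' under the distributed lock, or returns
    // false if the lock could not be acquired within 30 seconds.
    public static bool TryDoWork(DataCache cache, Action work)
    {
        IDisposable handle;
        try
        {
            // Blocks until the lock is acquired or the timeout elapses.
            handle = cache.AcquireLock("MyLock", TimeSpan.FromSeconds(30));
        }
        catch (TimeoutException)
        {
            // Another client held the lock for the whole waiting period.
            return false;
        }

        using (handle)
        {
            // Exceptions thrown by 'work' propagate; the lock is still released.
            work();
        }
        return true;
    }
}
```

Acquiring the lock outside the using block keeps a TimeoutException thrown by the work itself from being mistaken for a failure to acquire the lock. Note that the implementation passes the same timeout value to GetAndLock, so with this sketch the work should also be expected to finish within that period.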

Very simple so far; the core of the logic is in the private DataCacheLock class:

private class DataCacheLock : IDisposable
{
    private DataCache _cache;            
    private DataCacheLockHandle _lockHandle;
    private string _key;

    public DataCacheLock(DataCache cache, string key, TimeSpan? timeout)
    {
        _cache = cache;
        _key = key;
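        RetryUntilTrue(() =>
        {
            try
            {
                // Try to lock the item; without a timeout, TimeSpan.MaxValue is used as the lock duration.
                cache.GetAndLock(key, timeout ?? TimeSpan.MaxValue, out _lockHandle);
                return true;
            }
            catch (DataCacheException ex)
            {
                if (ex.ErrorCode == DataCacheErrorCode.KeyDoesNotExist)
                {
                    try
                    {
                        // Create a placeholder item so the next attempt has something to lock.
                        cache.Add(key, string.Empty);
                    }
                    catch (DataCacheException)
                    {
                        // Most likely another client added the key first; retry either way.
                    }
                    return false;
                }
                else if (ex.ErrorCode == DataCacheErrorCode.ObjectLocked)
                {
                    // Another client holds the lock; retry after a pause.
                    return false;
                }
                // Any other cache error is unexpected, so let it propagate.
                throw;
            }
        }, timeout);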
    }

    public void Dispose()
    {
        if (_lockHandle != null)
        {
            _cache.Unlock(_key, _lockHandle);
            _lockHandle = null;
        }
    }           
}

Essentially, a client tries to GetAndLock the specified key. If the key doesn’t exist, the client creates it and tries again. If the key exists but is already locked, the client retries indefinitely or until the timeout is reached.

I am not a big fan of implementing logic around exceptions, but the AppFabric Caching API leaves us no other choice here. You’ve probably noticed the use of the helper method RetryUntilTrue. Here’s how it is implemented:

private static readonly Random _random = new Random();

private static void RetryUntilTrue(Func<bool> action, TimeSpan? timeout)
{
    int i = 0;
    DateTime utcNow = DateTime.UtcNow;
    while (!timeout.HasValue || DateTime.UtcNow - utcNow < timeout.Value)
    {
        i++;
        if (action())
        {
            return;
        }
        Thread.Sleep(_random.Next((int)Math.Pow(i, 2), (int)Math.Pow(i + 1, 2) + 1));
    }
    throw new TimeoutException(string.Format("Exceeded timeout of {0}", timeout.Value));
}

This makes the client retry at randomized intervals that grow with each attempt: after the nth failed attempt it sleeps between n² and (n + 1)² milliseconds.
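
To make the growth of the retry interval concrete, the following sketch computes the sleep bounds for a given attempt number. BackoffBounds is a hypothetical helper written for illustration only; it mirrors the arguments passed to Random.Next in RetryUntilTrue, using integer arithmetic instead of Math.Pow.

```csharp
using System;

// Hypothetical helper, not part of the original code.
public static class BackoffBounds
{
    // Returns the inclusive range, in milliseconds, from which the sleep
    // after the given attempt is drawn. Random.Next excludes its upper
    // argument, and RetryUntilTrue passes (i + 1)^2 + 1, so (i + 1)^2 is
    // itself a possible value.
    public static void For(int attempt, out int minMs, out int maxMs)
    {
        minMs = attempt * attempt;
        maxMs = (attempt + 1) * (attempt + 1);
    }
}
```

For attempt 1 the range is 1 to 4 ms, for attempt 10 it is 100 to 121 ms, and for attempt 100 it is 10000 to 10201 ms. The wait therefore grows quadratically and has no upper cap, while the random component stays comparatively small.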

One thing to keep in mind: if you don’t specify a timeout and the client holding the lock terminates abruptly (Dispose doesn’t get called), all your other clients will block indefinitely. A possible solution would be to limit the timeout to 1 minute internally and implement a timer within the DataCacheLock class that calls ResetObjectTimeout every 30 seconds to extend the object’s timeout for another minute.
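
The timer part of that idea could look roughly like the sketch below. LeaseRenewer is a hypothetical class, not part of the code above. It only knows how to call a renewal delegate periodically, so the cache-specific call stays with the caller. Whether ResetObjectTimeout actually prolongs the lock, rather than only the cached object's lifetime, is an assumption that should be verified against a real cluster.

```csharp
using System;
using System.Threading;

// Hypothetical helper: invokes a renewal callback at a fixed interval
// until it is disposed.
public sealed class LeaseRenewer : IDisposable
{
    private readonly Timer _timer;

    public LeaseRenewer(Action renew, TimeSpan interval)
    {
        if (renew == null)
        {
            throw new ArgumentNullException("renew");
        }

        _timer = new Timer(_ =>
        {
            try
            {
                renew();
            }
            catch (Exception)
            {
                // A failed renewal is swallowed so that the timer thread does
                // not crash the process; the lease is then left to expire.
            }
        }, null, interval, interval);
    }

    public void Dispose()
    {
        // Stops future callbacks; one that is already running may still finish.
        _timer.Dispose();
    }
}
```

Inside DataCacheLock, one could create the renewer after the lock is acquired, for example with `new LeaseRenewer(() => _cache.ResetObjectTimeout(_key, TimeSpan.FromMinutes(1)), TimeSpan.FromSeconds(30))`, and dispose it in Dispose before calling Unlock. If the holding process dies, the renewals stop and the one-minute timeout can run out.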

  1. August 20, 2013 at 5:26 am

    This is interesting, what happens to the lock should the acquiring process crash?

    • August 20, 2013 at 11:53 am

      The last paragraph explains the behavior for an aborted process holding the lock. If a process is terminated gracefully following a crash then it releases the lock. In the case of a process that is acquiring but hasn’t acquired the lock yet, a crash, graceful or not, has no influence on the state of the lock.
