Abstracting away locks from silicondb::map

Abstracting away locks from the std::shared_mutex locked map implementation to an internal fine-grained locked singly linked list.

One of our driving design decisions for the concurrent map was to make the bucket the unit of locking. On closer inspection, we observe that we can enable further concurrency by using a custom thread-safe, fine-grained locked data structure for our buckets.

Let us consider our options:

  • a vector
  • a singly linked list

As mentioned previously, since a vector is a random-access data structure, we would need to lock the entire vector to maintain data consistency.

That leaves us with our only option: the singly linked list. Before we discuss how to make it thread-safe, let us decide which operations the list needs to support:

  • Addition of elements. O(1) time addition at the head with push_front() will do.
  • A means to traverse along the list. A for_each() function that accepts a lambda to operate on the traversed elements is perfect.
  • Element lookup. find_first_if() takes a predicate lambda for selecting elements and returns a shared_ptr to the element.
  • Element deletion. remove_if() takes a predicate lambda for selecting elements and deletes them from the list.
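
Put together, these operations suggest a small public surface along the following lines. This is only an illustrative sketch; the actual definitions are developed step by step below.

#include <memory>

template <typename T>
class list
{
public:
    void push_front(T const &value);               // O(1) insertion at the head

    template <typename Function>
    void for_each(Function f);                     // apply f to every element

    template <typename Predicate>
    std::shared_ptr<T> find_first_if(Predicate p); // first element matching p, or empty

    template <typename Predicate>
    void remove_if(Predicate p);                   // erase every element matching p
};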

Now let's design the data structure to enable fine-grained locking.

^[ |( )| -]---[ 1 |( )| -]--->[ 2 |( )| -]--->[ 3 |( )| -]--->[ 4 |( )| -]

// fig: a list with 4 nodes

( ) : shared lock
1, 2.. etc. : data elements
--> : next pointer

^ indicates that the node is in stack memory. Its absence indicates that
the node is in heap memory.

In our design, every node has the following members:

  • A shared_ptr to the data element
  • A unique_ptr to the next node in the list
  • A shared mutex to lock on to the sublist starting from that node.

The head of the list is a sentinel node that contains only a lock and no data.

Why is the head a sentinel node?

To eliminate the need for a special head case in different list operations.
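
For intuition, here is a rough sketch (not silicondb code) of what the list would look like without a sentinel. The first real node has no predecessor, so the list itself must carry an extra mutex and every operation needs a separate code path for the first position.

#include <memory>
#include <shared_mutex>

// Hypothetical non-sentinel layout, shown only for contrast.
template <typename T>
class list_without_sentinel
{
    struct node
    {
        std::shared_ptr<T> data;
        std::unique_ptr<node> next;
        std::shared_mutex subl_mtx;
    };

    std::unique_ptr<node> head; // first real node; may be null
    std::shared_mutex head_mtx; // extra mutex guarding the head pointer itself

    // Inserting or removing at the first position must relink `head` under
    // head_mtx, while every other position relinks some node's `next` under
    // that node's mutex -- two code paths for the same operation. A sentinel
    // head removes this asymmetry: the first real node always has a
    // predecessor whose mutex can be locked.
};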

Why do we need to associate every node in the list with a shared mutex?

The shared_mutex acts as a gate, allowing threads to pass through only when it is unlocked. The linked list, in effect, behaves like a corridor with multiple gates along it. If a gate closes behind you, you can still proceed; you only stop when a gate closes in front of you. Walking down the corridor corresponds to list traversal by a thread, and the gates are the shared_mutexes. A shared_mutex supports multiple concurrent readers but only a single writer.
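
To make the gate analogy concrete, this is how the two lock modes of std::shared_mutex are typically taken (a generic C++17 illustration, not code from silicondb):

#include <mutex>
#include <shared_mutex>

std::shared_mutex gate;

void reader()
{
    std::shared_lock<std::shared_mutex> lk(gate); // many readers may hold this at once
    // ... read what the gate protects ...
}

void writer()
{
    std::unique_lock<std::shared_mutex> lk(gate); // excludes readers and other writers
    // ... modify what the gate protects ...
}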

Sequence of locks for every node operation

Every node operation requires two locks: one on the node preceding it, and one on the node being operated on.

Let's assume we are at the current node and want to operate on the next node.

// figure depicting the sequence of locking on the nodes

^[ |( )| -]---[ 1 |(x)| -]--->[ 2 |( )| -]--->[ 3 |( )| -]--->[ 4 |( )| -]

current node = 1, we are in the middle of traversal and already have a lock on 1

step #1, check if next node exists
^[ |( )| -]---[ 1 |(x)| -]--->[ 2 |( )| -]--->[ 3 |( )| -]--->[ 4 |( )| -]

step #2, obtain lock on next node
^[ |( )| -]---[ 1 |(x)| -]--->[ 2 |(x)| -]--->[ 3 |( )| -]--->[ 4 |( )| -]

step #3, read operation, unlock 1
^[ |( )| -]---[ 1 |( )| -]--->[ 2 |(x)| -]--->[ 3 |( )| -]--->[ 4 |( )| -]

step #4
compute fn(2)

step #5
copy next locks and pointer to current and move on.
^[ |( )| -]---[ 1 |( )| -]--->[ 2 |(x)| -]--->[ 3 |(x)| -]--->[ 4 |( )| -]

This is the sequence of lock operations during list traversal:

  • First, we lock the current node.
  • Since we hold a lock on the current node, it is guaranteed that this node will not be deleted from under us.
  • We check whether a next node exists. If not, we exit. If it exists, we try to obtain a lock on it.
  • Once we hold the lock on the next node, it is guaranteed that it won't be modified.
  • If it is a deletion operation (on the next node),
    • we keep holding the lock on the current node
    • we delete the next node by splicing it out of the list
    • we obtain the new next node, lock it, and repeat the steps above
    • only when the next node is not to be deleted do we unlock the current node
    • we then move the next node's lock and pointer into current and move ahead.
  • If it is a read operation (on the next node),
    • we release the current node's lock
    • we operate on the next node's data as required
    • we move the next node's lock and pointer into current and move on.

Now that we have a model of the design in our head, let's implement it.

The list definition begins as:

template <typename T>
class list
{
...

Our node definition:

    private:
        // list::node: node in a linked list. Every node contains a unique pointer
        // to the next node, thereby having sole ownership of the next node.
        // Furthermore, every node contains a mutex to enable unique (read/write)
        // lock on the sub-list starting from that node.
        struct node
        {
            std::unique_ptr<node> next; // unique ownership of the next pointer
            std::shared_mutex subl_mtx; // mutex to lock on the sublist starting
                                        // from this node

            std::shared_ptr<T> data; // shared ownership of data with pointer

            node() : next{} {}
            node(T const &data_) : data{std::make_shared<T>(data_)} {}
        };

We declare our head sentinel node:

    private:
        node head; // sentinel head node

Now we implement our list operations:

list::push_front()

    public:
        // list::push_front: inserts value at the head of the list in O(1),
        // taking a unique (writer) lock only on the head sentinel node.
        void push_front(T const &value)
        {
            std::unique_ptr<node> node_created{new node(value)};
            std::unique_lock<std::shared_mutex> lk(head.subl_mtx);

            node_created->next = std::move(head.next);
            head.next = std::move(node_created);
        }

Since the next pointer is stored in a unique_ptr, we require move operations. We obtain a unique (writer) lock on the head sentinel node so that we can move head.next out of it. Once the new node is initialized, we move it into head.next. Following RAII semantics, lk releases the lock when the function exits.
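
As a quick illustration (hypothetical usage, assuming the class lives in the silicondb namespace as the closing comment on the class suggests), two threads can call push_front() concurrently; each call contends only on the head sentinel's mutex:

#include <thread>

int main()
{
    silicondb::list<int> l;

    std::thread t1([&] { for (int i = 0;   i < 100; ++i) l.push_front(i); });
    std::thread t2([&] { for (int i = 100; i < 200; ++i) l.push_front(i); });

    t1.join();
    t2.join();
}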

After this, we implement the reader operations.

list::for_each()

    public:
        // list::for_each: applies f to every element in the list. Traversal
        // uses hand-over-hand shared locks: lock the next node, then release
        // the current one, and move forward.
        template <typename Function>
        void for_each(Function f)
        {
            node *current = &head;
            std::shared_lock<std::shared_mutex> lk(head.subl_mtx);

            while (node *const next = current->next.get())
            {
                std::shared_lock<std::shared_mutex> next_lk(next->subl_mtx);
                lk.unlock();

                f(*next->data);

                current = next;
                lk = std::move(next_lk);
            }
        }

list::find_first_if()

    public:
        // list::find_first_if: returns a shared_ptr to the first element
        // satisfying predicate p, or an empty shared_ptr if no element
        // matches. Uses the same hand-over-hand shared locking as for_each.
        template <typename Predicate>
        std::shared_ptr<T> find_first_if(Predicate p)
        {
            node *current = &head;
            std::shared_lock<std::shared_mutex> lk(head.subl_mtx);

            while (node *const next = current->next.get())
            {
                std::shared_lock<std::shared_mutex> next_lk(next->subl_mtx);
                lk.unlock();

                if (p(*next->data))
                    return next->data;

                current = next;
                lk = std::move(next_lk);
            }

            return std::shared_ptr<T>{};
        }

The reader functions handle their locks in the same way; the only difference between them is how they exit the function. Both follow the lock-acquisition sequence described with the diagram in the previous section.
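
A hypothetical usage sketch of the two reader operations:

#include <cassert>
#include <memory>

void reader_example(silicondb::list<int> &l)
{
    // Sum every element; at most two shared locks are held at any moment.
    int sum = 0;
    l.for_each([&](int const &x) { sum += x; });
    (void)sum;

    // find_first_if hands back a shared_ptr: non-empty on a hit, empty on a miss.
    std::shared_ptr<int> hit = l.find_first_if([](int const &x) { return x == 2; });
    assert(!hit || *hit == 2);
}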

list::remove_if()

    public:
        // list::remove_if: removes every element satisfying predicate p.
        // Holds a unique (writer) lock on the predecessor while splicing a
        // node out, so no reader can traverse past a half-unlinked node.
        template <typename Predicate>
        void remove_if(Predicate p)
        {
            node *current = &head;
            std::unique_lock<std::shared_mutex> lk(head.subl_mtx);

            while (node *const next = current->next.get())
            {
                std::unique_lock<std::shared_mutex> next_lk(next->subl_mtx);
                if (p(*next->data))
                {
                    std::unique_ptr<node> old_next = std::move(current->next);
                    current->next = std::move(next->next);
                    next_lk.unlock();
                }
                else
                {
                    lk.unlock();
                    current = next;
                    lk = std::move(next_lk);
                }
            }
        }

    }; // end of class silicondb::list

As described above, we first acquire a lock on the current node and check whether a next node exists. If it does, we acquire a lock on it and test its data member against the given predicate. If the predicate returns true, we splice the node out of the list, unlock its mutex (it must be unlocked before old_next destroys the node at the end of the block), and keep holding the lock on the current node. We keep looping this way as long as the next nodes' data members meet the deletion criteria. Only when we find a "next" node whose data member is rejected by the predicate do we release the lock on the current node and move on.
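
A hypothetical usage sketch of remove_if(), dropping every even element in a single pass:

void remove_example()
{
    silicondb::list<int> l;
    for (int i = 0; i < 10; ++i)
        l.push_front(i); // list is now 9, 8, ..., 0 from head to tail

    l.remove_if([](int const &x) { return x % 2 == 0; });
    // remaining elements, head to tail: 9, 7, 5, 3, 1
}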

This concludes our list implementation. (Irrelevant portions were omitted for brevity.)

However, our primary objective was to abstract away locks from silicondb::map. We discuss that in the next section.

Abstracting away locks

We remove all mutexes from our bucket definition within silicondb::map, which simplifies it greatly:

    // map: thread-safe map implemented as a hash table.
    template <typename Key, typename Value, typename Hash = std::hash<Key>>
    class map
    {
    private:
        // map::bucket: hash bucket within the hash table, for storing
        // entries with clashing hash values.
        class bucket
        {
        private:
            typedef std::pair<Key, Value> bucket_value;

            list<bucket_value> data;

            // map::bucket::find_entry_for: returns a shared_ptr to the first
            // (key, value) pair within this bucket's data list whose key
            // matches the given key.
            std::shared_ptr<bucket_value> find_entry_for(Key const &key)
            {
                return data.find_first_if([&](bucket_value const &item)
                                          { return item.first == key; });
            }
...

Instead of returning an iterator, we return a shared_ptr to the (key, value) entry stored in the bucket, which allows modification to the value stored. Since this is safe only for internal usage, we keep it private to the bucket class.

Next we refactor our bucket public operations to use this method:

        // inside map::bucket
        public:
            // map::bucket::get: returns the value mapped to the given key
            // if present, the default_value passed otherwise
            Value get(Key const &key, Value const &default_value)
            {
                auto const entry = find_entry_for(key);
                return entry ? entry->second : default_value;
            }

            // map::bucket::contains: returns true if this key has a value
            // associated with it, and that entry is stored in this bucket
            bool contains(Key const &key)
            {
                auto const entry = find_entry_for(key);
                return bool(entry);
            }

            // map::bucket::put: establishes a key-value mapping for the
            // key-value pair. Creates a new key-value pair in the bucket
            // data if the mapping didn't exist before. Otherwise, it
            // overwrites the old mapping with the new value.
            void put(Key const &key, Value const &value)
            {
                auto const entry = find_entry_for(key);
                if (entry)
                    entry->second = value;
                else
                    data.push_front(bucket_value(key, value));
            }
...

The bucket::unmap() method is different from the others in that it directly invokes list::remove_if() on the bucket data, without going through find_entry_for().

        public:
            // map::bucket::unmap: erases the value associated with this
            // key, if present, from the bucket data.
            void unmap(Key const &key)
            {
                data.remove_if([&](bucket_value const &item)
                               { return item.first == key; });
            }
...

These are the only changes required. The rest of the public API implementation of silicondb::map remains the same.
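
To close, a hypothetical usage sketch of the resulting map. The map-level get/put/contains/unmap wrappers that forward to the owning bucket are assumed to be unchanged from the earlier std::shared_mutex based implementation:

#include <string>

void map_example()
{
    silicondb::map<std::string, int> word_count;

    word_count.put("apple", 1);
    word_count.put("apple", 2);               // overwrites the earlier mapping

    int n = word_count.get("apple", 0);       // n == 2
    bool has = word_count.contains("banana"); // false

    word_count.unmap("apple");
    (void)n;
    (void)has;
}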