XRCU is a library that provides efficient lock-less synchronization for read-mostly tasks and structures.
This file is meant to document the XRCU library, including internal details and the rationale behind the decisions taken.
This document assumes that the reader is at least somewhat familiar with such concepts as multi-threading, lock-free algorithms and data structures, and read-copy-update (RCU).
The term lock-free is used to refer to several different things depending on the author, but the basic idea is that an algorithm or structure is lock-free if multiple threads can operate on it without the end result depending on those threads' scheduling. That is, it doesn't matter if the involved threads are suspended or preempted; progress is guaranteed.
The most commonly cited advantage of lock freedom is performance. When you use lock-free algorithms, you naturally avoid constructs like mutexes, which can have a non-negligible performance impact. On the other hand, critics point out that lock-free algorithms are typically harder to understand, that debugging them is notoriously difficult, and that the benefits generally do not compensate for these complexities.
This library was developed to give programmers another option when designing high-performance, low-latency systems. We do not claim that lock freedom is a silver bullet for making programs faster. However, it's also hard to deny that lock contention tends to be a real bottleneck, and several projects have moved towards lock-free structures for the performance benefits they bring (see, for example, many kernels).
There is a big problem with using lock-free algorithms in programming languages that feature manual memory management, and it has to do with memory reclamation. Since lock freedom implies that multiple threads may operate on a given structure concurrently, we can find ourselves in a situation in which one thread is reading some data while another is deleting that same piece of data, potentially freeing the memory associated with it.
As such, lock freedom needs an additional twist in languages like C and C++. We need an additional subsystem that allows us to synchronize reclamation without using heavyweight methods like mutexes (we are developing a lock-free library, after all).
This is where RCU comes in. RCU, short for read-copy-update, is a mechanism that allows multiple threads to read from memory while postponing reclamations until it is safe to do so (i.e., until no readers are operating on said memory). RCU typically assumes that updates are performed using atomic instructions, and isn't concerned with them (at least, that's the case in this library).
There are several different ways to implement RCU, and the one chosen in this library will be detailed later. For now, it's important to point out that RCU typically makes reading from shared memory a very cheap operation, while imposing more overhead on updates. Thus, it's more suited for tasks in which reads are more frequent than modifications.
It should be noted that RCU is not the only way to manage memory reclamations in a multi-threaded environment, but as its name implies, XRCU chose it because it was deemed the best option. Since our lock-free structures rely on RCU to work properly, we'll describe the RCU interface that this library exposes first.
All public interfaces (functions, types, methods, etc.) reside in the namespace xrcu. It's the explicit intent of the authors to maintain compatibility with every interface that was ever exposed. Any breaking changes should be done in different interfaces (either by adding new functions, or by overloading).
The implementation of some template types requires additional, internal details, and those usually reside in a namespace appropriately named detail (said namespace is inside xrcu). It goes without saying that users should not rely on those internal details, and that the authors may freely break compatibility with them.
XRCU uses some thread-specific data in order to function properly. Some systems have issues when dynamically loading libraries that use thread-specific data (Windows Vista was reportedly among them). Consult your environment's documentation if you notice any problems of this kind.
Finally, it should be mentioned that XRCU requires no initialization or cleanup at a library level to function properly. Once the library is installed, you can freely use its API without any further action needed.
Note that XRCU may expose more functions or types than are specified in this document. This is usually done to avoid code bloat. Anything that isn't included in this file should be treated as private and subject to change.
#include <xrcu/xrcu.hpp>
The following section documents the RCU API that is exposed in XRCU. It's used internally quite a bit by other parts of the library, yet it's very useful by itself when implementing other concurrent algorithms or data structures.
RCU is based on the concept of critical sections, code fragments that are executed under the guarantee that no reclamation can take place. This allows the user to safely read from shared memory, knowing that it will be valid during the critical section. Writes have to be synchronized independently of RCU, however.
Naturally, RCU critical sections don't prevent all memory reclamation; they only affect the destruction of objects that are derived from a type defined in XRCU, called finalizable. If you wish to use the RCU API, all you need to do is make your types derive from finalizable and use its API, and you'll be set. Any other memory management functions, like malloc and friends, are unaffected by RCU as implemented in this library.
Critical sections are managed through the following API:
Makes the calling thread enter a critical section. Critical sections can be safely nested. As long as we are in one, finalizable objects cannot be reclaimed.
Exits a critical section. This is generally called after the calling thread is done reading from shared memory and wants to signal that reclamations should be re-allowed. The effects of calling exit_cs when the calling thread is not in a critical section are undefined.
Returns true if the calling thread is in a critical section; false otherwise.
Waits until all threads are outside pre-existing critical sections, and returns true afterwards. If a deadlock is detected (because the calling thread is in a critical section, for example), this function returns false immediately without waiting.
This type is defined such that its constructor calls enter_cs, and its destructor calls exit_cs; it has no internal state. As its name implies, it's useful as a guard to manage entering and exiting a critical section in an exception-safe way. A few types in this library are derived from cs_guard, such as iterators, since they need to examine potentially many elements from a container without their memory being reclaimed.
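The RAII pattern behind cs_guard can be sketched in a few lines. The following is a self-contained illustration, not XRCU's actual code: the enter_cs/exit_cs stand-ins here use a plain thread-local nesting counter instead of the library's real registry.

```cpp
#include <cassert>

// Hypothetical stand-ins for xrcu::enter_cs / xrcu::exit_cs: a
// thread-local nesting depth models "this thread is in a critical section".
thread_local int cs_depth = 0;

void enter_cs () { ++cs_depth; }
void exit_cs ()  { --cs_depth; }
bool in_cs ()    { return cs_depth > 0; }

// RAII guard in the style of xrcu::cs_guard: the constructor enters a
// critical section, the destructor exits it, so scopes (and exceptions)
// manage the section automatically.
struct cs_guard
{
  cs_guard () { enter_cs (); }
  ~cs_guard () { exit_cs (); }
};

bool demo ()
{
  assert (!in_cs ());
  {
    cs_guard g1;        // enter
    cs_guard g2;        // nested enter - nesting is safe
    assert (in_cs ());
  }                     // both destructors ran here
  return !in_cs ();
}
```

Because the guard's destructor runs even when an exception propagates out of the scope, a critical section can never be "leaked" this way.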
As mentioned before, XRCU defines a type called finalizable that is specifically designed to make its destruction safe (i.e., it only happens once all threads are outside a critical section). This type defines the following interface:
struct finalizable
{
virtual void safe_destroy ();
virtual ~finalizable ();
};
Under most circumstances, it's enough for a user-defined type to derive from finalizable and leave it at that. However, the above two methods are provided as virtual for customization's sake. When a finalizable object is reclaimed by the RCU subsystem, its safe_destroy method is called. The default implementation simply calls the object's destructor and frees the memory associated with it. If, for whatever reason, a user wants to override this behaviour, they may do so by extending either of those methods.
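To make the customization point concrete, here is a self-contained sketch of a user type extending safe_destroy. The base type and the flag are illustrative stand-ins, not XRCU's actual definitions:

```cpp
#include <cassert>

// Minimal stand-in for xrcu::finalizable: the default safe_destroy
// destroys the object and releases its memory.
struct finalizable_base
{
  virtual void safe_destroy () { delete this; }
  virtual ~finalizable_base () {}
};

static bool custom_ran = false;

// A user type that extends safe_destroy to run extra cleanup before
// the default destruction/deallocation kicks in.
struct my_node : finalizable_base
{
  int value;
  explicit my_node (int v) : value (v) {}

  void safe_destroy () override
    {
      custom_ran = true;                   // e.g. return buffers to a pool
      finalizable_base::safe_destroy ();   // then destroy as usual
    }
};

bool demo ()
{
  finalizable_base *p = new my_node (42);
  p->safe_destroy ();   // in XRCU, the RCU subsystem makes this call
  return custom_ran;
}
```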
In addition, the following API is available when dealing with finalizables:
Adds the object F to the calling thread's list of pending finalizable objects. Each thread has a limit on the number of pending finalizables. Once that limit is reached, they are scheduled for reclamation, and will be collected once it's safe to do so.
Only a single call to finalize is allowed on a particular object. If this function is called more than once on the same object (either by the same thread, or another), the behaviour is undefined.
Schedules all the calling thread's accumulated finalizable objects for reclamation immediately. Returns true if successful, false if a deadlock was detected. Note that this call may block until all threads are outside a critical section (much as a call to sync would).
Once a thread exits, an implicit call to this function is made.
These functions don't really belong anywhere else, but they are included in this file for convenience's sake:
struct atfork
{
void (*prepare) (void);
void (*parent) (void);
void (*child) (void);
};
Returns a structure that implements the callbacks needed for XRCU to maintain a consistent state across calls to fork. The returned callbacks should be passed to pthread_atfork (or similar). XRCU itself does not install these callbacks, because most calls to fork are followed immediately by exec, and because multi-threaded programs that call fork are rare.
Returns the library version as a pair of major, minor. Useful to assert at runtime that a program is using the correct version.
There are many ways to implement RCU. In this library, it was fundamental that RCU be implemented in a portable, simple way, even if it meant slightly more overhead. As such, things like signals and OS-specific syscalls were out.
In the end, for XRCU, we decided to use a global registry with a stamp to monitor which threads are in critical sections. By using C++ thread_local objects, we avoid the need for explicit registration: when a thread first uses the RCU API, it is automatically added to the global registry.
In order to enter a critical section, all a thread has to do is read a global counter, usually bump it by some small value, and then store that value with release semantics in its thread-specific data. Exiting a critical section is almost entirely symmetric (We decrement the thread-specific value), but with a small caveat that will be explained below.
When an object is finalized, it's prepended to a singly-linked list that is also kept in thread-specific storage. Once a certain number of them have accumulated (specified by the constant XRCU_MAX_FINS), they are scheduled to be reclaimed. However, if the calling thread is inside a critical section at that point, a special flag is set instead, telling the thread to flush its finalizable objects as soon as it leaves the critical section.
In this implementation, the most expensive operation is undoubtedly sync. It works by locking the global registry, checking whether any thread is in a critical section, and sleeping for short periods of time while there are. The overhead associated with sync is the main reason why critical sections should be short, and also why finalizable objects are accumulated instead of being reclaimed right away.
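The stamp scheme described above can be sketched with standard atomics. This is an illustrative single-thread model, not XRCU's actual code: the real registry scans a slot per thread, whereas here the "registry" is just the calling thread, which is still enough to show the deadlock check.

```cpp
#include <atomic>
#include <cassert>

// Global stamp, bumped on each entry so critical-section
// generations are distinguishable.
std::atomic<unsigned long> global_stamp (1);

// Per-thread slot: zero means "not in a critical section";
// otherwise it holds the stamp observed on entry.
thread_local std::atomic<unsigned long> my_stamp (0);

void enter_cs ()
{
  // Read (and bump) the global counter, then publish the observed
  // value with release semantics in thread-specific data.
  unsigned long s = global_stamp.fetch_add (2, std::memory_order_relaxed);
  my_stamp.store (s, std::memory_order_release);
}

void exit_cs ()
{
  my_stamp.store (0, std::memory_order_release);
}

// sync: wait until no thread is inside a pre-existing critical
// section. If the calling thread itself holds a stamp, waiting could
// never terminate, so report the deadlock immediately.
bool sync ()
{
  if (my_stamp.load (std::memory_order_acquire) != 0)
    return false;   // deadlock detected
  // Real code would now poll every registered thread's slot,
  // sleeping briefly between scans, until all of them are clear.
  return true;
}
```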
#include <xrcu/optional.hpp>
Before moving on to the topic of containers, we need to take a look at an auxiliary structure implemented to make things easier and more convenient for users. This is the optional type.
An optional is a template type that can either hold an object of type T, or be in an uninitialized state. In other words, the value may or may not be present. Unlike a pointer, which may be null, an optional never performs dynamic memory allocation: the space required to hold the value is always there, even when it's not being used.
Optional types were standardized in C++17, and are present in many other libraries as well. However, since XRCU aims to work with a baseline of C++11-compliant compilers, and because it's designed not to depend on other libraries or frameworks, it contains its own (lightweight) implementation.
An optional is defined as such:
template <class T>
struct optional
{
};
And its public interface is the following:
Default constructor. Initializes the optional without a value.
Initializes the optional to contain value.
Copy constructor. If other has a value, initializes the optional to contain that same value. Otherwise, the optional will contain no value.
Move constructor. Initializes the optional by taking ownership of value.
Move constructor. If other has a value, the optional is initialized by taking ownership of it. Otherwise, the optional will have no value.
Returns a (possibly const) reference to the optional's value. The results are undefined if the optional does not contain one.
Returns a (possibly const) pointer to the optional's value. The results are undefined if the optional does not contain one.
Returns true if the optional has a value.
If the optional contains a value, calls its destructor and leaves the optional with no value afterwards. Otherwise, there are no effects.
If other has a value, assigns it to the optional. Otherwise, calls reset on the optional and leaves it without a value. Returns *this.
Assigns value to the optional. Returns *this.
If other has a value, assigns it to the optional by moving it. Otherwise, calls reset on the optional and leaves it without a value. Returns *this.
Assigns value to the optional by moving it. Returns *this.
Destroys the value associated to the optional, if it had any.
Optionals are rather simple: they are implemented by using a flat buffer on which placement new is called. Once an object has been constructed, a pointer to the buffer is cached. This pointer is used when fetching the value or destroying the object (it's null for empty optionals).
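The flat-buffer technique can be sketched as follows. This is an illustrative, single-value subset of the real interface, not XRCU's actual implementation:

```cpp
#include <cassert>
#include <new>
#include <type_traits>

// Sketch of the flat-buffer technique: storage for a T lives inside
// the optional itself, and placement new constructs into it. The
// cached pointer is null while the optional is empty.
template <class T>
struct optional_sketch
{
  typename std::aligned_storage<sizeof (T), alignof (T)>::type buf;
  T *ptr = nullptr;   // cached pointer; null means "no value"

  optional_sketch () {}
  explicit optional_sketch (const T& v)
    { ptr = new (&buf) T (v); }   // construct in place; no allocation

  bool has_value () const { return ptr != nullptr; }
  T& operator* () { return *ptr; }

  void reset ()
    {
      if (ptr)
        {
          ptr->~T ();     // destroy the value, keep the storage
          ptr = nullptr;
        }
    }

  ~optional_sketch () { reset (); }
};
```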
Optionals are very useful when performing lookups in mapped containers, since they let us determine atomically whether a search was successful, without needing additional output parameters.
Another reason why optional values are used so extensively in XRCU is that we consider them the best way to signal both a valid object and an error. When using lock-less data structures, it's impossible to determine beforehand whether an error will occur when calling a method (because multithreading makes things non-deterministic). Throwing an exception was considered a bit too "punishing" - it's not exactly a programming error, after all.
#include <xrcu/stack.hpp>
Stacks are the simplest of the lock-free data structures: they represent a basic LIFO container on which you can push and pop items at the top. Although their functionality is more limited than that of other structures, they are still very useful for implementing things like atomic free lists.
In XRCU, stacks meet the requirements for the C++ concept of Container.
Stacks are templated types that can be instantiated with a type T:
template <class T>
struct stack
{
typedef T value_type;
typedef T& reference;
typedef const T& const_reference;
typedef T* pointer;
typedef const T* const_pointer;
typedef ptrdiff_t difference_type;
typedef size_t size_type;
struct iterator;
struct const_iterator;
};
In C++'s parlance, a stack iterator is a forward iterator, and it may be used in any function that accepts them.
A stack iterator is a cs_guard, which means that an implicit critical section is entered when one is created, and extends until its destruction.
The following describes the public interface for stack<T>
Default constructor. Initializes a stack to be empty.
Initializes a stack to contain n copies of value.
Initializes a stack with the values in the range [first, last).
Initializes a stack with the values in lst.
Copy constructor. Initializes a stack with the values in other.
Move constructor. Takes ownership of the values in other.
Pushes value onto the top of the stack.
Pushes the values in the range [first, last) onto the stack.
Pushes n copies of value onto the stack.
Same as push, only it constructs the new item in place by forwarding args... to its constructor.
Removes the item at the top of the stack, and returns an optional with that value. If the stack is empty, the method returns an optional with no value.
Fetches the current item at the top of the stack, and returns an optional with it as its value. If the stack is empty, returns an optional with no value.
Returns the current size of the stack.
Returns the maximum allowed size for the stack.
Returns true if the stack is empty (equivalent to size () == 0).
Swaps the contents of the stack with other, but only if other is not the same object.
Assigns the contents of other to the stack and returns *this.
Move assignment. Takes ownership of the elements in other. Returns *this.
Compares the elements of the stack with the ones in other. Returns true if they are all equal.
Compares the elements of the stack with the ones in other. Returns true if any two of them are not equal.
Lexicographically compares the elements of the stack with the ones in other, in a way that is equivalent to calling std::lexicographical_compare.
Removes every element from the stack.
Assigns to the stack the elements described by (x, y). They can be an iterator range, or a pair of (integer, value), as is the case with the stack's constructor.
Assigns to the stack the items in lst.
Returns an iterator to the first element of the stack.
Returns an iterator one past the last element of the stack.
Default constructor for stack iterators. Leaves the object in an invalid state. Dereferencing or incrementing such an iterator has undefined behavior.
Copy constructor. Initializes the iterator to be equal to other.
Returns a (possibly const) reference to the iterator's underlying object.
Returns a (possibly const) pointer to the iterator's underlying object.
Pre-increment operator. Moves the iterator forward, and returns it. If the iterator was at the end of the stack, the results are undefined.
Post-increment operator. Moves the iterator forward, and returns an iterator equal to what it was before incrementing. If the iterator was at the end of the stack, the results are undefined.
Tests for iterator (in)equality.
There really isn't much to say about the internal details of the stack. It's probably the easiest lock free structure to implement. As with most other implementations, the one in XRCU simply consists of an atomic pointer to the top node. The use of RCU prevents the biggest issue with this design, that is, the ABA problem.
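A minimal sketch of this design, a Treiber-style stack built around an atomic top pointer, might look like the following. The names are illustrative, and reclamation is deliberately naive here; in XRCU, RCU is exactly what makes freeing nodes safe and what defuses the ABA problem.

```cpp
#include <atomic>
#include <cassert>

// A node of the lock-free stack.
struct node
{
  int value;
  node *next;
};

// The whole stack is a single atomic pointer to the top node.
std::atomic<node *> top (nullptr);

void push (int v)
{
  node *n = new node;
  n->value = v;
  n->next = top.load (std::memory_order_relaxed);
  // Retry until we atomically install n as the new top. On failure,
  // compare_exchange refreshes n->next with the current top.
  while (!top.compare_exchange_weak (n->next, n,
                                     std::memory_order_release,
                                     std::memory_order_relaxed))
    ;
}

bool pop (int *out)
{
  node *n = top.load (std::memory_order_acquire);
  // Unlink the top node with a CAS; retry while other threads win.
  while (n != nullptr &&
         !top.compare_exchange_weak (n, n->next,
                                     std::memory_order_acquire,
                                     std::memory_order_relaxed))
    ;
  if (n == nullptr)
    return false;       // empty stack
  *out = n->value;
  delete n;             // naive: only safe here because no concurrent
                        // readers exist; RCU defers this in XRCU
  return true;
}
```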
The only noteworthy thing to point out is that the swap method is safe to call from multiple threads as well. In order to achieve this, the library uses a special sentinel bit that is temporarily set on the head node while a swap is underway. During a swap, push and pop cannot proceed, since they check that this special bit is not set before modifying the stack.
Note that because stack iterators are implicit critical sections, and because of how stacks are implemented, iterating over a stack is always safe, even in the presence of operations like push and pop. The only way for an iterator to be invalidated is if it's at the beginning of the stack and a call to pop is made. Even then, dereferencing the iterator is valid, but advancing it will end the iteration prematurely, since the object was unlinked from the stack.
#include <xrcu/queue.hpp>
This is a multi-producer, multi-consumer, FIFO queue. Much like the stack, elements can be pushed or popped from it, but the order is different: They will be retrieved in the same order they were inserted.
In XRCU, queues meet the requirements for the C++ concept of Container.
Queues are templated types, and they may be instantiated with a type T:
template <class T>
struct queue
{
typedef T value_type;
typedef T& reference;
typedef const T& const_reference;
typedef T* pointer;
typedef const T* const_pointer;
typedef ptrdiff_t difference_type;
typedef size_t size_type;
struct iterator;
struct const_iterator;
};
As with the stack, a queue's iterator can be considered a forward iterator, and it's also a cs_guard, so a queue may be examined through an iterator at any time.
The following describes the public interface for queue<T>
Default constructor. Initializes a queue to be empty.
Initializes a queue to contain n copies of value.
Initializes a queue with the values in the range [first, last).
Initializes a queue with the values in lst.
Copy constructor. Initializes a queue with the values in other.
Move constructor. Takes ownership of the values in other.
Pushes value to the back of the queue.
Same as push, only the new item is constructed in place by forwarding args... to its constructor.
Removes the first item from the queue, and returns an optional with that value. If the queue is empty, the method returns an optional with no value.
Returns an optional with the first value from the queue. If the queue is empty, an optional with no value is returned instead.
Returns an optional with the last value from the queue. If the queue is empty, an optional with no value is returned instead.
Returns the size of the queue.
Returns the maximum allowed size for the queue.
Swaps the contents of the queue with other, but only if other is not the same object.
Assigns the contents of other to the queue and returns *this.
Move assignment. Takes ownership of the elements in other. Returns *this.
Compares the elements of the queue with the ones in other. Returns true if they are all equal.
Compares the elements of the queue with the ones in other. Returns true if any two of them are not equal.
Lexicographically compares the elements of the queue with the ones in other, in a way that is equivalent to calling std::lexicographical_compare.
Removes every element from the queue.
Assigns to the queue the elements described by (x, y). They can be an iterator range or a pair of (integer, value), as is the case with the queue's constructor.
Assigns to the queue the elements in lst.
Returns an iterator to the first element of the queue.
Returns an iterator one past the last element of the queue.
Default constructor for queue iterators. Leaves the object in an invalid state. Dereferencing or incrementing such an iterator has undefined behavior.
Copy constructor. Initializes the iterator to be equal to other.
Returns a const reference to the iterator's underlying object.
Pre-increment operator. Moves the iterator forward, and returns it. If the iterator was at the end of the queue, the results are undefined.
Post-increment operator. Moves the iterator forward, and returns an iterator equal to what it was before incrementing. If the iterator was at the end of the queue, the results are undefined.
Tests for iterator (in)equality.
The design of the multi-producer, multi-consumer queue in XRCU is rather simple and different from other implementations. It consists of a single array that holds either the elements themselves, or pointers to them; the choice is made at compile time via type traits.
In addition to the array, the queue is composed of three additional counters: the capacity, a read cursor and a write cursor. The capacity is static, and simply stores the raw number of elements that the queue can hold before it's forced to reallocate. The read and write cursors are indices that indicate at which positions the pop and push (or emplace) operations may take place.
When an element is pushed into a queue, the write cursor is checked against the capacity. If the queue is full, then it must be reallocated. This is done by marking each element with a special bit, allocating a new queue and then copying over the elements. If there's room in the queue, the element is atomically set via compare-and-swap, and if successful, the write cursor is also updated atomically.
Popping elements proceeds in a similar way. The read cursor is compared against the write cursor, and the operation returns prematurely with failure if there are no more elements to pop. Otherwise, the element is atomically replaced with a special marker, after which the read cursor is updated.
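The cursor scheme can be sketched as follows. This is an illustrative, fixed-capacity model with plain int slots; the marker values, names, and the absence of reallocation are all simplifications, not XRCU's actual code.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>

// Slot markers (illustrative): EMPTY is a free slot, POPPED is the
// special marker left behind by pop.
const int EMPTY = -1;
const int POPPED = -2;
const size_t CAP = 8;

std::atomic<int> slots[CAP];
std::atomic<size_t> wr_idx (0), rd_idx (0);

// Initialize every slot to EMPTY before the demo runs.
struct queue_init
{
  queue_init ()
    { for (size_t i = 0; i < CAP; ++i) slots[i].store (EMPTY); }
} init_once;

bool push (int v)
{
  while (true)
    {
      size_t w = wr_idx.load ();
      if (w >= CAP)
        return false;   // full: the real queue reallocates here
      int expected = EMPTY;
      if (slots[w].compare_exchange_strong (expected, v))
        {
          // Publish the element by advancing the write cursor.
          wr_idx.compare_exchange_strong (w, w + 1);
          return true;
        }
      // Lost the race for this slot; help advance the cursor, retry.
      wr_idx.compare_exchange_strong (w, w + 1);
    }
}

bool pop (int *out)
{
  while (true)
    {
      size_t r = rd_idx.load ();
      if (r >= wr_idx.load ())
        return false;   // no elements left to pop
      int v = slots[r].load ();
      if (v != POPPED && slots[r].compare_exchange_strong (v, POPPED))
        {
          rd_idx.compare_exchange_strong (r, r + 1);
          *out = v;
          return true;
        }
      // Slot already consumed; help advance the read cursor, retry.
      rd_idx.compare_exchange_strong (r, r + 1);
    }
}
```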
#include <xrcu/skip_list.hpp>
A skip list is an associative container that holds a sorted set of unique objects of a particular type. In addition to the Key type, skip lists are instantiated with a comparator type that allows them to determine ordering.
In XRCU, skip lists meet the requirements for the C++ concept of Container, and AssociativeContainer.
Skip lists are templated types, defined in the following way:
template <class T, class Cmp = std::less<T> >
struct skip_list
{
typedef T value_type;
typedef T key_type;
typedef Cmp key_compare;
typedef Cmp value_compare;
typedef T& reference;
typedef const T& const_reference;
typedef T* pointer;
typedef const T* const_pointer;
typedef ptrdiff_t difference_type;
typedef size_t size_type;
struct iterator;
struct const_iterator;
};
The template parameter T refers to the key type, whereas Cmp is the comparator type, which defaults to std::less. Under most circumstances that is enough, but users may instantiate the skip list with any other type that defines an operator() returning a boolean.
As shown above, skip_list has iterators, but they are always constant, for practicality reasons. Much like with other containers, skip list iterators are forward iterators, and an implicit cs_guard.
The following describes the public interface for skip_list<T, Cmp>
Initializes the skip list with comparator c, which defaults to a default-constructed value. Also takes a parameter indicating the maximum depth a skip list node may have, which defaults to an implementation-specified value. Users shouldn't need to change the latter, but it can be useful when tuning an application for performance. As a general rule, a higher value implies more memory usage, but better performance.
Initializes the skip list to the values in [first, last). The comparator and depth parameters are the same as explained above.
Initializes the skip list to the values in lst. The comparator and depth parameters are the same as explained above.
Copy constructor. Initializes the skip list to hold the values in other.
Move constructor. Takes ownership of the values in other.
Searches for an element equivalent to key in the skip list, and returns an optional with it as its value. If the key couldn't be found, the returned optional has no value.
Returns true if key is present in the skip list.
Inserts key into the skip list. Returns true if the key wasn't present previous to this call.
Erases key from the skip list. Returns true if the key was present previous to this call.
Erases key from the skip list and returns an optional with that element as its value, if the key was present. Otherwise, the returned optional has no value.
Returns an iterator to the first element of the skip list.
Returns an iterator one past the last element of the skip list.
Returns the current size of the skip list.
Returns the maximum allowed size for a skip list.
Returns true if the skip list is empty (i.e: its size is 0).
Assigns to the skip list the elements in [first, last).
Assigns to the skip list the elements in lst.
Assigns the elements in other to the skip list. Returns *this.
Move assignment. Takes ownership of the elements in other. Returns *this.
Swaps the contents of the skip list with other, but only if other is a different object.
Removes every element from the skip list.
In XRCU, skip lists are implemented much as described in the standard literature. Basically, every skip list of depth D has a head node that can be linked with up to D other nodes. When performing a lookup for an element, we start at the head node and move horizontally until the current key is equal to or greater than the one we're looking for. If it's equal, the lookup succeeded. Otherwise, we descend vertically to the next level, until we either find the element or exhaust every level.
Insertions first perform a lookup, and if the search failed, we create a new node with the specified value as its key. The number of linked nodes it will have is determined in a semi-random way, but it must never exceed the maximum depth of the skip list. After the node is created, we try to atomically link it with its computed predecessors and successors (returned by the same lookup call we made before). If we succeed, the node is considered part of the skip list; otherwise, we free it and retry.
Erasing an element proceeds in a similar fashion, only we check that the lookup did not fail, and then atomically relink the node's predecessors with the node's own successors, thereby removing the key from the skip list.
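The lookup walk described above can be sketched in a single-threaded fashion. The node layout and names here are illustrative; the real structure uses atomic pointers and RCU-managed nodes.

```cpp
#include <cassert>
#include <vector>

// Node of a toy skip list: next[l] is the successor at level l.
struct sl_node
{
  int key;
  std::vector<sl_node *> next;
  sl_node (int k, int levels) : key (k), next (levels, nullptr) {}
};

bool lookup (sl_node *head, int key)
{
  sl_node *cur = head;
  // Start at the topmost level and work downwards.
  for (int l = (int) head->next.size () - 1; l >= 0; --l)
    {
      // Move horizontally while the next key is still smaller.
      while (cur->next[l] != nullptr && cur->next[l]->key < key)
        cur = cur->next[l];
      if (cur->next[l] != nullptr && cur->next[l]->key == key)
        return true;   // found at this level
    }
  return false;        // every level exhausted
}

// Build a tiny 2-level list holding {3, 7, 9}; only 7 is "tall".
sl_node *make_demo_list ()
{
  sl_node *head = new sl_node (0, 2);
  sl_node *a = new sl_node (3, 1);
  sl_node *b = new sl_node (7, 2);
  sl_node *c = new sl_node (9, 1);
  head->next[0] = a; a->next[0] = b; b->next[0] = c;  // level 0: all keys
  head->next[1] = b;                                  // level 1: shortcut
  return head;
}
```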
Skip lists maintain an atomic word to store the length. However, this is used slightly differently than one would think, because the lowest bit doubles as an ad-hoc lock bit in order to serialize concurrent calls to swap.
#include <xrcu/hash_table.hpp>
Hash tables are associative containers that map unique keys to values. Ordering is unspecified for both keys and values. In addition to the key and value types, hash tables are instantiated with a hashing type and an equality type: callables that compute a hash value for a given key, and that test two keys for equality, respectively.
Additionally, hash tables maintain a load factor that determines when a full rehash is performed. A rehash implies moving all the key-value pairs into a new location to reduce the overhead of most operations.
In XRCU, hash tables meet the requirements for the C++ concept of Container and UnorderedAssociativeContainer.
As mentioned above, hash tables are template types, defined like this:
template <class Key, class Val,
          class Equal = std::equal_to<Key>,
class Hash = std::hash<Key> >
struct hash_table
{
typedef Val mapped_type;
typedef Key key_type;
typedef std::pair<Key, Val> value_type;
typedef Equal key_equal;
typedef Hash hasher;
typedef value_type& reference;
typedef const value_type& const_reference;
typedef value_type* pointer;
typedef const value_type* const_pointer;
typedef ptrdiff_t difference_type;
typedef size_t size_type;
struct iterator;
struct const_iterator;
};
The template parameters should be pretty self-explanatory: they refer to the key, value, equality and hashing types, in that order. The Equal type has to operate on two keys and return a boolean value that determines their equality, whereas the Hash type has to operate on keys and return unsigned integers. Both are allowed to throw exceptions, although it is not really wise to do so.
Much like with skip lists, hash table iterators are always constant, and also an implicit cs_guard.
The following describes the public interface for hash tables:
Initializes the hash table to hold size elements, with a load factor of lf, using the equality predicate e and the hasher h. The load factor must be in the range [0.4, 0.9]; if it's not, it will silently be set to the default of 0.85.
Initializes the hash table with the values in the range [first, last). The elements in that range must have two public members defined: first and second, referring to the key and value of each element, respectively, in a similar fashion to what std::pair does. The other parameters work as with the previous constructor.
Initializes the hash table with the values in lst. The rest of the parameters work as with the previous constructors.
Copy constructor. Initializes the hash table with the values in other.
Move constructor. Takes ownership of the values in other.
Returns the hash table size.
Returns the maximum allowed size for a hash table.
Returns true if the hash table is empty.
Looks for key in the hash table. If it's found, returns the value associated with it. If it's not found, returns an empty optional in the first version, or defl in the second one.
Returns true if the key is present in the hash table.
Associates key with val in the hash table. Returns true if the key was not present. Otherwise, the former value is replaced.
Updates the value associated with key by calling f with it and the rest of the arguments. In other words, if val is the value associated with key, this calls f (val, args...) and updates the value with the result. If no value was present for key, the function is called with a default-constructed value instead (as if by calling Val ()).
Note that the function takes a reference to the value, so it's possible to modify it in place. Additionally, if the function returns the very same object that was passed (i.e., the same reference that it received), no modifications are made to the value in the table. If, however, the function returns a different object, this call will need to atomically update the value. The passed function may be called more than once if the atomic updates fail.
Returns true if `key` was not present in the hash table before the call.
Erases `key` from the hash table. Returns true if `key` was present before the call.
Removes `key` from the hash table and returns the value that was associated with it, if it was present.
Removes every element from the hash table.
Assigns the elements in [`first`, `last`) to the hash table. The same restrictions as for the range constructor apply here.
Assigns the elements in `lst` to the hash table.
Assigns the elements in `other` to the hash table. Returns `*this`.
Move assignment. Takes ownership of the elements in `other`. Returns `*this`.
Swaps the contents of the hash table with `other`, but only if `other` is a different object.
Returns an iterator to the first element of the hash table.
Returns an iterator one past the last element of the hash table.
Default constructor for hash table iterators. Leaves the object in an invalid state. Dereferencing or incrementing such an iterator has undefined behavior.
Copy constructor. Initializes the iterator to be equal to `other`.
Returns the key for the iterator.
Returns the value for the iterator.
Returns an `std::pair` constructed with the key and value of the iterator.
Pre-increment operator. Moves the iterator forward and returns it. If the iterator was at the end, the results are undefined.
Post-increment operator. Moves the iterator forward, and returns an iterator equal to what it was before incrementing. If the iterator was at the end, the results are undefined.
Test for iterator (in)equality.
Hash tables are somewhat complex, because the atomicity requirements force us to do some rather convoluted things. To start off, a hash table is essentially a vector of consecutive `key` and `value` pairs, with some special values indicating `free` and `deleted` entries. However, since we can only operate atomically on integers, we wrap any other type that is not integral in a dynamically allocated pointer. This is decided based on the template instantiation and is figured out at compile time.
When a lookup is performed, the passed `key` is hashed with the table's own internal hasher and a hash code is computed. Using this code, we produce an index into the vector and then iterate over it, looking for a key that matches, or until we find a free entry, in which case the lookup fails.
Insertions are a bit more complicated. We start by performing a lookup, as described above, and then determine what to do based on the result: If the lookup came up empty, then we need to insert both the key and the value; otherwise, we only need to update the value.
If only the value needs to be updated (because the key is already present), then the insertion simply consists of atomically swapping the old value with the new one and, if successful, finalizing the old value as well.
If we need to insert the key as well, we first decrement an internal counter that keeps track of how many additional elements we can insert before a rehash is triggered. If the counter has already reached zero, then we have to rehash before inserting the key and value. If no rehash is needed, or after it's done, both the key and value are updated atomically (some platforms allow us to do this in a single step; otherwise, the operations are done sequentially).
At this point, we have to pause to make a sort of confession: hash tables as implemented in XRCU are not entirely lock free, because rehashing actually takes an internal lock (I know, you're crushed). This is not because of any intrinsic limitation, but rather to make things easier. Since rehashes are probably the least common of the operations, we figured it wasn't that big of a deal anyway.
Going back to rehashes: these are the most expensive operations. As mentioned, they start off by acquiring an internal lock, and then they iterate over the full vector, marking the `value` entries with a special bit that signals that the table is being rehashed. When this bit is set, insertions and erasures are forbidden (both check for this bit before proceeding).
Erasures are pretty simple in comparison. They perform a lookup on the key, and if it doesn't come up empty, they atomically swap out the value for the special `empty` constant. Afterwards, they can mutate the key entry without atomicity (because erased entries cannot be reused).
All implemented containers use the standard operators `new` and `delete` to perform memory (de)allocations. There's no way to specify custom allocators yet, although it is planned for the future.
Agustina Arzille (avarzille@riseup.net) and Luciano Lo Giudice (lmlogiudice@gmail.com)