API Reference · Benchmarks · Stdlib Benchmarks

Saturn — Parallelism-Safe Data Structures for Multicore OCaml

This repository is a collection of concurrent-safe data structures for OCaml 5. It aims to provide an industrial-strength, well-tested (and, in the future, possibly model-checked and verified), well-documented, and maintained library of concurrent-safe data structures. We want to make it easier for Multicore OCaml users to find the right data structures for their needs.

You can learn more about the motivation behind Saturn through the implementation of a lock-free stack here.

Saturn is published on opam and is distributed under the ISC license.

Installation

Getting OCaml 5.2.0

To use Saturn, you need OCaml 5.2.0 or later. Saturn also builds with OCaml 4.14, but only for compatibility purposes, since parallelism-safe data structures are not needed without OCaml 5. Note that versions of OCaml 5 prior to 5.2 are not supported because of bugs in the Atomic module that affect some of the data structures.

To install OCaml 5.2.0 yourself, first make sure you have opam 2.1 or later. You can run this command to check:

opam --version

Then use opam to install OCaml 5.2.0:

opam switch create 5.2.0

If you want a later version, you can run the following line to get a list of all available compiler versions:

opam switch list-available

Getting Saturn

Saturn can be installed from opam:

opam install saturn

Provided data structures

Treiber Lock-free Stack

  • Module: Stack
  • Description: A classic multi-producer, multi-consumer, lock-free stack, known for robustness and flexibility.
  • Recommendation: It's a recommended starting point when a LIFO structure is needed.
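
To make the idea behind a Treiber stack concrete, here is a minimal sketch of the algorithm written with only Stdlib.Atomic and Domain. It is an illustration under simplifying assumptions, not Saturn's implementation, and the names create, push and pop_opt are chosen for this example only. Each operation reads the current head, builds the new state, and retries if a compare-and-set shows that another domain got there first.

```ocaml
(* Illustrative Treiber-style stack: the whole stack is an immutable list
   held in a single atomic reference. Not Saturn's code. *)
type 'a t = 'a list Atomic.t

let create () : 'a t = Atomic.make []

(* Retry until the head we read is still the head when we swap it. *)
let rec push (s : 'a t) x =
  let old = Atomic.get s in
  if not (Atomic.compare_and_set s old (x :: old)) then push s x

let rec pop_opt (s : 'a t) =
  match Atomic.get s with
  | [] -> None
  | x :: rest as old ->
    if Atomic.compare_and_set s old rest then Some x else pop_opt s

let () =
  let s = create () in
  (* Two producer domains push concurrently. *)
  let producer base =
    Domain.spawn (fun () ->
      for i = 1 to 1_000 do push s (base + i) done)
  in
  let d1 = producer 0 and d2 = producer 1_000 in
  Domain.join d1;
  Domain.join d2;
  (* Drain from the main domain and count what was pushed. *)
  let rec drain n =
    match pop_opt s with None -> n | Some _ -> drain (n + 1)
  in
  assert (drain 0 = 2_000)
```

In application code, prefer the Stack module, which provides this behaviour along with a larger, tested API.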

Lock-free Bounded Stack

  • Module: Bounded_stack
  • Description: A stack based on the Treiber stack algorithm, with a limited capacity and a length function. This ensures that the stack is memory-bounded.
  • Recommendation: Adding a capacity introduces a general overhead to the operations. It is recommended to use the unbounded stack if neither the capacity nor the length function is needed.
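
As a rough usage sketch, the bounded stack can be used to apply back-pressure: a push simply fails once the capacity is reached. The function names below (create with an optional capacity argument, try_push, length, pop_opt) reflect our reading of the Saturn 1.0 API reference and should be treated as assumptions to check against the version you have installed.

```ocaml
(* Assumes the saturn library is installed and linked. *)
module S = Saturn.Bounded_stack

let () =
  let s : int S.t = S.create ~capacity:2 () in
  assert (S.try_push s 1);
  assert (S.try_push s 2);
  (* The stack is full: the push is refused instead of growing memory. *)
  assert (not (S.try_push s 3));
  assert (S.length s = 2);
  (* Popping frees a slot, so the next push succeeds. *)
  assert (S.pop_opt s = Some 2);
  assert (S.try_push s 3)
```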

Michael-Scott Lock-free Queue

Lock-free Bounded Queue

  • Module: Bounded_queue
  • Description: A queue based on the Michael-Scott queue algorithm, with a limited capacity and a length function. This ensures that the queue is memory-bounded.
  • Recommendation: Adding a capacity introduces a general overhead to the operations. It is recommended to use the unbounded queue if neither the capacity nor the length function is needed.

Lock-free Chase-Lev Work-Stealing Deque

Lock-free Single Producer Single Consumer Queue

  • Module: Single_prod_single_cons_queue
  • Description: A single-producer, single-consumer fixed-size queue. This specific configuration enables strong optimizations but also makes the data structure unsafe if used improperly, i.e., with more than one producer or one consumer at any time.
  • Recommendation: It's concurrent-safe as long as only one thread acts as producer and one as consumer at any time.

Lock-free Multiple Producers Single Consumer Queue

  • Module: Single_consumer_queue
  • Description: A multi-producer, single-consumer concurrent-safe queue with a closing mechanism to prevent further pushes.
  • Recommendation: Designed for scheduler run queues. It is not concurrent-safe if used by multiple consumers simultaneously.

Lock-free Skip List

  • Module: Skiplist
  • Description: A skiplist is a probabilistic data structure that has an average logarithmic complexity for search and insertion operations. Like Stdlib.Map, it is an ordered collection.
  • Recommendation: The skiplist is not resizable. It will continue to work once its capacity limit is reached, but performance will degrade because the structure will no longer be deep enough to maintain logarithmic complexity.
  • Sources: See Chapter 14 in The Art of Multiprocessor Programming

Lock-free Hash Table

  • Module: Htbl
  • Description: A resizable lock-free hash table with a snapshot mechanism.
  • Recommendation: Contains useful high-level operations designed to work as building blocks of non-blocking algorithms.
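
The following sketch shows one way such operations can serve as a building block: two domains race to claim the same key, and exactly one of them wins. It assumes the Saturn 1.0 names create, try_add, find_opt and try_remove, with try_add returning false when the key is already bound. The key "leader" and the values are made up for the example.

```ocaml
(* Assumes the saturn library is installed and linked. *)
module H = Saturn.Htbl

let () =
  let t : (string, int) H.t = H.create () in
  (* Both domains try to bind the same key; try_add is atomic. *)
  let d1 = Domain.spawn (fun () -> H.try_add t "leader" 1) in
  let d2 = Domain.spawn (fun () -> H.try_add t "leader" 2) in
  let won1 = Domain.join d1 in
  let won2 = Domain.join d2 in
  (* Exactly one of the two additions succeeded. *)
  assert (won1 <> won2);
  let expected = if won1 then 1 else 2 in
  assert (H.find_opt t "leader" = Some expected);
  assert (H.try_remove t "leader");
  assert (H.find_opt t "leader" = None)
```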

Lock-free Bag

  • Module: Bag
  • Description: A resizable lock-free bag based on the hash table. The pop functions return a random value contained in the bag.

About the Unsafe Data Structures

Some data structures are available in two versions: a normal version and a more optimized but unsafe version. The unsafe version utilizes Obj.magic in a way that may be unsafe with flambda2 optimizations.

The reason for providing the unsafe version is that certain optimizations require features that are currently not available in OCaml, such as arrays of atomics or atomic fields in records. We recommend using the normal version of a data structure unless its performance is not sufficient for your use case. In that case, you can try the unsafe version.

Currently, the following data structures have an unsafe version:

  • Single_prod_single_cons_queue_unsafe: a single-producer, single-consumer lock-free queue
  • Queue_unsafe: a Michael-Scott lock-free queue
  • Bounded_queue_unsafe: a lock-free bounded queue based on Michael-Scott queue algorithm
  • Htbl_unsafe : a lock-free hashtable
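
One way to keep the choice between the two versions cheap to revisit is to go through a local module alias, so that switching is a one-line change. This sketch assumes that Queue and Queue_unsafe expose the same create, push and pop_opt functions; please check the API reference for your version before relying on that.

```ocaml
(* Assumes the saturn library is installed and linked.
   Start with the normal version; change this single line to
   Saturn.Queue_unsafe only if measurements show it is needed. *)
module Q = Saturn.Queue

let () =
  let q = Q.create () in
  Q.push q "a";
  Q.push q "b";
  (* FIFO order is preserved. *)
  assert (Q.pop_opt q = Some "a");
  assert (Q.pop_opt q = Some "b");
  assert (Q.pop_opt q = None)
```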

Usage

This section describes how to use the provided data structures and, more precisely, what not to do with them. Two main points are discussed:

  • some data structures have restrictions on what operations can be performed in a single domain or a set of domains
  • the currently provided data structures are non-composable

Data Structures with Domain Roles

Some provided data structures are designed to work with specific domain configurations. These restrictions optimize their implementation, but failing to respect them may compromise safety properties. These limitations are clearly indicated in the documentation and often reflected in the name of the data structure itself. For instance, a single-consumer queue must have only one domain performing pop operations at any given time.

To learn more about it, see this document.
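
As an illustration of respecting domain roles, the sketch below uses the single-producer, single-consumer queue with exactly one domain pushing and exactly one domain popping. The names create with a size_exponent argument, try_push and pop_opt are assumptions based on our reading of the Saturn 1.0 API reference. The busy-wait loops are only there to keep the example short.

```ocaml
(* Assumes the saturn library is installed and linked. *)
module Q = Saturn.Single_prod_single_cons_queue

let () =
  (* Capacity is 2^4 = 16 slots. *)
  let q = Q.create ~size_exponent:4 in
  let n = 100 in
  (* The spawned domain is the only producer. *)
  let producer =
    Domain.spawn (fun () ->
      for i = 1 to n do
        while not (Q.try_push q i) do Domain.cpu_relax () done
      done)
  in
  (* The main domain is the only consumer. *)
  let received = ref [] in
  let count = ref 0 in
  while !count < n do
    match Q.pop_opt q with
    | Some x -> received := x :: !received; incr count
    | None -> Domain.cpu_relax ()
  done;
  Domain.join producer;
  (* With one producer and one consumer, items arrive in push order. *)
  assert (List.rev !received = List.init n (fun i -> i + 1))
```

Adding a second domain that calls pop_opt or try_push on the same queue would break the contract, and the guarantees above would no longer hold.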

Composability

Composability refers to the ability to combine functions while preserving their properties. For Saturn data structures, the expected properties include atomic consistency (or linearizability) and progress guarantees, such as lock-freedom. Unfortunately, Saturn's data structures are not composable.

To learn more about it, see this document.
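
A small sketch of what non-composable means in practice: the hypothetical helper move below combines two operations that are each atomic, but the combination is not. It assumes the Stack functions create, push and pop_opt.

```ocaml
(* Assumes the saturn library is installed and linked.
   move is a made-up helper for this example. *)
let move src dst =
  match Saturn.Stack.pop_opt src with
  | None -> false
  | Some x ->
    (* Between the pop above and the push below, x is in neither
       stack: another domain looking at both can miss it entirely. *)
    Saturn.Stack.push dst x;
    true

let () =
  let a = Saturn.Stack.create () in
  let b = Saturn.Stack.create () in
  Saturn.Stack.push a 1;
  (* Run from a single domain, the result is what one would expect. *)
  assert (move a b);
  assert (Saturn.Stack.pop_opt a = None);
  assert (Saturn.Stack.pop_opt b = Some 1)
```

Each call is linearizable on its own, yet move as a whole is not: under concurrency, an observer can see the intermediate state in which the element has left src and not yet reached dst.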

Testing

One of the many difficulties of implementing parallelism-safe data structures is that, in addition to providing the same safety properties as sequential ones, they may also have to satisfy some liveness properties as well as additional safety properties specific to concurrent programming, like deadlock-freedom.

In addition to the expected safety properties, the main properties we want to test for are:

  • linearizability,
  • lock-freedom for all the lock-free data structures,
  • no potentially harmful data races.

Here is a list of the tools we use to ensure them:

  • safety: unit tests and qcheck tests check semantics and expected behaviors with one or more domains.
  • safety and liveness: STM tests check linearizability for two domains (see multicoretests library).
  • liveness: dscheck checks the non-blocking property for any number of domains (two, most of the time). See dscheck.
  • safety: tsan checks for the absence of data races.

See test/README.md for more details.

Benchmarks

There are a number of benchmarks in the bench directory. You can run them with make bench. See bench/README.md for more details.

Contributing

Contributions are appreciated! If you intend to add a new data structure, please read this first.