Termination detection for fine-grained message-passing architectures

Matthew Naylor¹, Simon W. Moore¹, Andrey Mokhov², David Thomas³, Jonathan R. Beaumont³, Shane Fleming⁴, A. Theodore Markettos¹, Thomas Bytheway¹ and Andrew Brown⁵

¹University of Cambridge, UK ²Newcastle University and Jane Street, UK ³Imperial College London, UK ⁴Microsoft Research, UK ⁵University of Southampton, UK

Abstract

Barrier primitives provided by standard parallel programming APIs are the primary means by which applications implement global synchronisation. Typically these primitives are fully committed to synchronisation in the sense that, once a barrier is entered, synchronisation is the only way out. For message-passing applications, this raises the question of what happens when a message arrives at a thread that already resides in a barrier. Without a satisfactory answer, barriers do not interact with message-passing in any useful way. In this paper, we propose a new refutable barrier primitive that combines with message-passing to form a simple, expressive, efficient, well-defined API. It has a clear semantics based on termination detection, and supports the development of both globally synchronous and asynchronous parallel applications. To evaluate the new primitive, we implement it in a prototype large-scale message-passing machine with 49,152 RISC-V threads distributed over 48 FPGAs. We show that hardware support for the primitive leads to a highly efficient implementation, capable of synchronisation rates an order of magnitude higher than those achievable in software. Using the primitive, we implement synchronous and asynchronous versions of a range of applications, observing that each version can have significant advantages over the other, depending on the application. Therefore, a barrier primitive supporting both styles can greatly assist the development of parallel programs.
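To make the idea of a refutable barrier concrete, the following is a minimal single-process sketch in Python. It is a toy model under stated assumptions, not the API or the hardware mechanism described in the paper: the class `ToyMachine` and its methods `send`, `enter_barrier`, `deliver` and `terminated` are hypothetical names introduced only for illustration. The model assumes that a message arriving at a thread already in the barrier pulls that thread back out, and that termination is detected only when every thread is in the barrier and no messages remain undelivered.

```python
from collections import deque


class ToyMachine:
    """Toy model of threads that exchange messages and enter a refutable barrier.

    All names here are hypothetical; this is not the paper's API.
    """

    def __init__(self, num_threads):
        # One message queue and one "in barrier" flag per thread.
        self.inbox = [deque() for _ in range(num_threads)]
        self.in_barrier = [False] * num_threads

    def send(self, dest, msg):
        # Queue a message for the destination thread.
        self.inbox[dest].append(msg)

    def enter_barrier(self, tid):
        # The thread declares that it has nothing more to do for now.
        self.in_barrier[tid] = True

    def deliver(self, tid):
        # Deliver one pending message, if any. Arrival of a message refutes
        # the barrier: the receiving thread becomes active again.
        if not self.inbox[tid]:
            return None
        self.in_barrier[tid] = False
        return self.inbox[tid].popleft()

    def terminated(self):
        # Termination holds only when every thread is in the barrier
        # and no message is still waiting to be delivered.
        return all(self.in_barrier) and not any(self.inbox)


if __name__ == "__main__":
    m = ToyMachine(2)
    m.enter_barrier(0)
    m.send(0, "ping")       # a message is sent to a thread already in the barrier
    m.enter_barrier(1)
    print(m.terminated())   # a message is still pending for thread 0
    m.deliver(0)            # thread 0 leaves the barrier to handle the message
    print(m.terminated())   # thread 0 is active again
    m.enter_barrier(0)
    print(m.terminated())   # all threads idle, no messages pending
```

In this model a synchronous application would have every thread re-enter the barrier after each step, while an asynchronous one would enter it only when it has no more work. Both styles rely on the same check.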

Video

[The authors opted not to share a presentation video publicly.]
