Open Universiteit

Title: Improving scalability of concurrency in Reo
Authors: van Veen, Bernie
Issue Date: 29-Aug-2017
Publisher: Open Universiteit Nederland
Abstract: With the introduction of multi-CPU systems and multi-core processors, the so-called free performance lunch that programmers had enjoyed came to an end. Until that time, improvements in CPU clock speed, cache management, and execution management in new hardware generations would automatically make existing sequential programs run faster. After the introduction of multi-core computers, however, parallel computing entered the world of the programmer. Without parallelism, it is not possible to fully use the processing power of multi-core computers, so programmers need to introduce parallelism in their programs. This is notoriously hard. Parallelism is accomplished by having separate workers run multiple parts of the program's computations concurrently. Implementing the interaction between the workers in a program is one of the most challenging aspects of parallel programming [2]. This interaction is performed according to specific rules, collectively called a protocol. One promising approach to make parallel programming easier is based on syntactically separating the interaction protocol from the actual computations of the program. The protocol is then specified in a protocol language and thus separated from the computational code. This results in a better separation of concerns. Examples of such protocol languages are BIP [4], Scribble [8] and Reo [1, 2]. Our work focuses on using Reo as a protocol language to generate parallel programs. Reo is a graphical language in which protocols are expressed as data-flow graphs, called circuits; Reo's semantics is formalized in terms of automata. One of the desirable attributes of any parallel program is scalability [6]. This attribute requires the program to be able to cope with increasing parallelism of the hardware it is running on.
Foster expressed the importance of scalability in the following statement: a program able to use only a fixed number of processors is a bad program [6]. Scalability is also an important aspect of software quality [9]. In a Reo circuit, however, the number of workers is fixed at compile-time. This means parallel programs implemented using Reo will not be able to cope with increasing hardware parallelism. Thus, Reo does not support the scalability attribute, which is a serious limitation. A textual syntax named Pr exists, which can be used to express a Reo circuit textually. Pr supports natural number parametrisation, allowing specification of a protocol with an adjustable number of constituents, referred to as a scalable protocol. When a scalable protocol is compiled to an executable program, the number of constituents must be known. Once the program is compiled, this number cannot be changed anymore, so scalability is limited to compile-time only. We presented an approach to add scalability to a parallel program generated from a protocol language at load-time, while maintaining performance comparable to existing state-of-the-art compilation approaches for Reo. We broke this objective down into three research sub-questions: RQ-1: What approach would enable a Reo compiler to compile a program that allows changing the number of workers connected to a protocol at load-time? RQ-2: How can this approach be implemented in an actual Pr-to-Java compiler? RQ-3: Is such an implementation feasible in practice (i.e., does it provide acceptable performance)? We will now consider the answers to these research questions. An approach that allows changing the number of workers in the interaction protocol at load-time (RQ-1) consists of these steps (Section 4.1): 1. At compile-time, the automata for the primitives of the circuit are distributed into subsets by the compiler, using variables in the scalable protocol that do not get a value until load-time (Section 4.2).
The compiler multiplies the primitive automata of each subset into a single automaton, referred to as a constituent automaton. We illustrated this approach using the Pr protocol language to express scalability. However, our approach is not limited to Pr and can also be applied to other implementations based on automata. 2. At load-time, the required number of instances of each constituent automaton is created. Any input and output ports of the automaton, to which the workers connect, will then be replicated as well. The automaton instances are composed into a single automaton, referred to as the compound automaton (Section 4.3). We implemented a Pr-to-Java compiler generating scalable Java parallel programs from a Pr specification (Section 5.1). In addition to adapting and implementing interpretation and compilation to generate the constituent automata, the run-time library was also extended with functionality to compose the compound automaton. Thus, this is a successful implementation of improved scalability in a Pr-to-Java compiler generating scalable programs (RQ-2). 3. At execution-time, the compound automaton runs and enacts interaction among workers according to the protocol (Section 4.4). The scalable compiler implements several approaches to compose and run the compound automaton, namely: (i) eager composition, which composes the compound automaton at load-time, (ii) lazy composition, which extends the automaton state-by-state, composing a state only when a transition to that state fires, and (iii) on-the-fly composition, which is equal to lazy composition but only retains a limited set of states in memory. To validate feasibility of our scalable approach, we performed tests to benchmark the scalable programs created by the scalable compiler against unscalable programs created by the existing state-of-the-art Reo compiler [10] (Section 5.2). For this, a large set of circuits was selected from literature, which covers all typical circuits.
These tests show that the scalable programs have comparable performance in most situations, in the same order of magnitude as the fixed programs. Moreover, the scalable compiler creates a more efficient implementation for circuits that contain Fifo channels. Finally, the scalable compiler is able to scale to larger numbers of workers for circuits with a large state space of which only a small number of states is actually reached. To investigate performance of the scalable circuits in real-life applications, experiments were also performed using the NAS Parallel Benchmarks (NPB) [3]. The results showed that the relative overhead decreases as the amount of computation performed by the workers grows. This is similar for unscalable programs generated using the centralised approach. In cases where the amount of computation is large, overhead becomes negligible and the scalable program's performance is similar to the performance of the native Java implementation (Section 5.3). This shows that our approach is feasible in practice (RQ-3) (Section 6). In this thesis, we presented an approach to generate a scalable parallel program from a protocol language, implemented using a formal theory of communicating automata (state machines). This approach provides scalability at load-time and was demonstrated to be able to generate parallel programs covering all typical Reo circuits. The experiments we performed show that our approach is feasible in practice. The performance of the scalable parallel programs is comparable to programs compiled from the same protocol specification using a state-of-the-art compiler for Reo without scalability features. Thus, we contribute a validated approach to add scalability to a parallel program generated from a protocol language at load-time, while maintaining the same level of performance as existing state-of-the-art compilation approaches.
Our approach enables implementations of protocol languages to support the desired scalability attribute, which is an essential feature in modern-day computing. Given the importance of scalability in parallel programs and the feasibility of our approach, protocol languages such as Reo are one step closer to being a viable alternative for the large-scale implementation of parallel programs.
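
The lazy composition mentioned in the abstract can be illustrated with a minimal Java sketch. It is not the thesis's implementation: the class and method names (LazyCompositionSketch, Constituent, LazyCompound, expandedAfterRun) are hypothetical, the constituent is assumed to be a trivial two-state buffer, and the instances are assumed to be independent (plain interleaving, without the port synchronisation that real Reo composition performs). It only shows that the number of instances can be chosen at load-time and that a global state is composed the first time a run reaches it.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class LazyCompositionSketch {

    /** Hypothetical constituent automaton: a one-place buffer, state 0 (empty) or 1 (full). */
    static final class Constituent {
        // The only local transition toggles the state: put (0 -> 1) or take (1 -> 0).
        int next(int localState) {
            return 1 - localState;
        }
    }

    /** Compound automaton over N instances; global states are composed on first use. */
    static final class LazyCompound {
        private final List<Constituent> parts = new ArrayList<>();
        // Cache: global state -> successor states, one per instance that may fire.
        private final Map<List<Integer>, List<List<Integer>>> expanded = new HashMap<>();

        LazyCompound(int instances) {
            for (int i = 0; i < instances; i++) {
                parts.add(new Constituent());
            }
        }

        List<Integer> initialState() {
            return Collections.unmodifiableList(
                    new ArrayList<>(Collections.nCopies(parts.size(), 0)));
        }

        /** Composes the outgoing transitions of a state only when they are first requested. */
        List<List<Integer>> successors(List<Integer> state) {
            return expanded.computeIfAbsent(state, s -> {
                List<List<Integer>> result = new ArrayList<>();
                for (int i = 0; i < parts.size(); i++) {
                    List<Integer> target = new ArrayList<>(s);
                    target.set(i, parts.get(i).next(s.get(i)));
                    result.add(Collections.unmodifiableList(target));
                }
                return result;
            });
        }

        int expandedStateCount() {
            return expanded.size();
        }
    }

    /** Fires instances round-robin for the given number of steps and reports how many states were composed. */
    public static int expandedAfterRun(int instances, int steps) {
        LazyCompound compound = new LazyCompound(instances);
        List<Integer> state = compound.initialState();
        for (int k = 0; k < steps; k++) {
            state = compound.successors(state).get(k % instances);
        }
        return compound.expandedStateCount();
    }

    public static void main(String[] args) {
        // The number of instances is supplied when the program starts, not when it is compiled.
        int instances = args.length > 0 ? Integer.parseInt(args[0]) : 10;
        int steps = 5;
        System.out.println("instances=" + instances
                + " steps=" + steps
                + " composed states=" + expandedAfterRun(instances, steps));
    }
}
```

With 10 instances the full state space of this toy compound has 2^10 = 1024 states, yet a run of 5 steps composes only the 5 states it actually visits. An eager variant would build all 1024 states before the first transition fires, and an on-the-fly variant would additionally evict entries from the cache to bound memory use.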
Appears in Collections: MSc Software Engineering

Files in This Item:
File: Thesis H.B. van Veen.pdf
Size: 4.18 MB
Format: Adobe PDF

This item is licensed under a Creative Commons License.