thesis/README.md

# About

Thesis title: using template metaprogramming to design active libraries for assisted parallelisation.

(French: *application de la métaprogrammation template à la conception de bibliothèques actives
de parallélisation assistée*)

## Download

- [Computer version](https://phd.pereda.fr/assets/thesis/alexis_pereda_thesis.pdf);
- [Print version](https://phd.pereda.fr/assets/thesis/alexis_pereda_thesis_print.pdf).

## Abstract

<div align="justify">
Since the early 2000s, hardware performance has increased through the addition of computing cores
rather than through higher clock frequencies.
Parallel programming is therefore no longer optional if you need to make the best use of
the hardware at your disposal.
Yet many programs are still written sequentially: parallel programming introduces numerous
difficulties.
Amongst these, it is notably hard to determine whether a section of a program can be executed in
parallel, i.e. while preserving its behaviour as well as its overall result.
Even knowing that it is possible to parallelise a piece of code, doing it correctly is another
problem.
In this thesis, we present two approaches to make writing parallel software easier.

We present an active library (using C++ template metaprogramming to operate during the compilation
process) whose purpose is to analyse and parallelise loops.
To do so, it builds a representation of each processed loop using expression templates through an
embedded language.
This makes it possible to know which variables are used and how they are used.
For arrays, which are common within loops, it also acquires the index functions.
The analysis of this information enables the library to identify which instructions in the loop can
be run in parallel.
Interdependent instructions are detected by knowing the variables and their access mode for each
instruction.
Given a group of interdependent instructions and the known index functions, the library decides if
the instructions can be run in parallel or not.
We want this library to help developers write loops that are automatically parallelised
whenever possible and that otherwise run sequentially, as they would without the library.
Another goal is for the library to serve as a framework for integrating new parallelisation
methods and extended analysis rules.
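
As a rough illustration of the decision described above, the following C++ sketch applies a conservative GCD test to two affine index functions at compile time. The names `Affine`, `may_conflict` and `parallelisable` are hypothetical and are not taken from the library. The sketch handles one written and one read access to the same array and ignores loop bounds, so it may report a conflict where none actually occurs.

```cpp
#include <numeric> // std::gcd (C++17)

// Hypothetical compile-time description of an affine index function i -> A*i + B.
template<long A, long B>
struct Affine {
    static constexpr long a = A;
    static constexpr long b = B;
};

// Conservative test: may accesses made through F and G in two *different*
// iterations touch the same array element? Loop bounds are ignored.
template<typename F, typename G>
constexpr bool may_conflict() {
    // Same non-constant index function: f(i) == f(j) implies i == j.
    if (F::a == G::a && F::b == G::b && F::a != 0) return false;
    const long g = std::gcd(F::a, G::a);
    if (g == 0) return F::b == G::b;  // both indices are constants
    return (G::b - F::b) % g == 0;    // GCD test: an integer solution exists
}

// A loop body writing A[W(i)] and reading A[R(i)] is treated as parallelisable
// only if neither write/write nor write/read conflicts are possible.
template<typename W, typename R>
constexpr bool parallelisable = !may_conflict<W, W>() && !may_conflict<W, R>();

static_assert( parallelisable<Affine<2, 0>, Affine<2, 1>>);  // A[2*i] = A[2*i+1]
static_assert(!parallelisable<Affine<1, 0>, Affine<1, -1>>); // A[i]   = A[i-1]
static_assert( parallelisable<Affine<1, 0>, Affine<1, 0>>);  // A[i]   = A[i] + 1
static_assert(!parallelisable<Affine<0, 0>, Affine<1, 0>>);  // A[0]   = A[i]
```

Because the test is evaluated in `static_assert`, the decision costs nothing at run time, which is the property an active library relies on.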

We introduce another active library that aims to help developers by assisting them in writing
parallel software instead of fully automating it.
This library uses algorithmic skeletons to let developers describe their algorithms, including both
sequential and parallel parts, by assembling atomic execution patterns such as a series of tasks or a
parallel execution of a repeated task.
This description includes the data flow, that is, how parameters and function returns are
transmitted.
Usually, this is set automatically by the algorithmic skeleton library; describing it explicitly
gives the developer greater flexibility and makes it possible, amongst other things, for our library
to automatically transmit special parameters that must not be shared between parallel tasks.
One feature that this enables is repeatability from one execution to another, even for
stochastic algorithms.
By taking into account the distribution of tasks across the cores, we even reduce the number of these
non-shared parameters.
Once again, this library provides a framework at several levels.
Low-level extensions consist of the implementation of new execution patterns to be used to build
skeletons.
Another low-level axis is the integration of new execution policies that decide how tasks are
distributed on the available computing cores.
High-level additions will be libraries using ours to offer ready-to-use algorithmic skeletons for
various fields.
</div>

Keywords: template metaprogramming; assisted parallelisation; automatic parallelisation; active libraries; algorithmic skeletons; repeatability.
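
The repeatability property mentioned in the abstract can be illustrated with a minimal C++ sketch, assuming one pseudo-random generator per task, seeded from a global seed and the task index. `stochastic_task` and `run_tasks` are hypothetical names, not the AlSk API, and the sketch does not attempt the reduction in non-shared parameters that the abstract describes.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <thread>
#include <vector>

// Hypothetical stochastic task: draws from the generator it is handed.
inline unsigned stochastic_task(std::mt19937& rng) {
    return static_cast<unsigned>(rng() % 1000u);
}

// Runs `tasks` tasks on `workers` threads. Each task owns a generator seeded
// from (seed, task index), so no generator is shared between parallel tasks
// and the result does not depend on how tasks are mapped to threads.
inline std::vector<unsigned> run_tasks(std::size_t tasks, std::size_t workers, unsigned seed) {
    assert(workers > 0);
    std::vector<unsigned> results(tasks);
    std::vector<std::thread> pool;
    for (std::size_t w = 0; w < workers; ++w) {
        pool.emplace_back([&results, tasks, workers, seed, w] {
            // Static round-robin distribution: worker w runs tasks w, w + workers, ...
            for (std::size_t t = w; t < tasks; t += workers) {
                std::seed_seq seq{seed, static_cast<unsigned>(t)};
                std::mt19937 rng(seq);
                results[t] = stochastic_task(rng); // each task writes its own slot
            }
        });
    }
    for (auto& th : pool) th.join();
    return results;
}
```

Under these assumptions, `run_tasks(16, 1, 42u)` and `run_tasks(16, 4, 42u)` return the same vector, because each result depends only on the seed and the task index, not on the thread that ran the task.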

## Related publications

- "Repeatability with Random Numbers Using Algorithmic Skeletons", ESM 2020 (https://hal.archives-ouvertes.fr/hal-02980472);
- "Modeling Algorithmic Skeletons for Automatic Parallelization Using Template Metaprogramming", HPCS 2019 (IEEE) [10.1109/HPCS48598.2019.9188128](https://doi.org/10.1109/HPCS48598.2019.9188128);
- "Processing Algorithmic Skeletons at Compile-Time", ROADEF 2020 (https://hal.archives-ouvertes.fr/hal-02573660);
- "Algorithmic Skeletons Using Template Metaprogramming", ICAST 2019;
- "Parallel Algorithmic Skeletons for Metaheuristics", ROADEF 2019 (https://hal.archives-ouvertes.fr/hal-02059533);
- "Static Loop Parallelization Decision Using Template Metaprogramming", HPCS 2018 (IEEE) [10.1109/HPCS.2018.00159](https://doi.org/10.1109/HPCS.2018.00159).

## Related projects

- [AlSk](https://phd.pereda.fr/dev/alsk), an algorithmic skeletons active library;
- [pfor](https://phd.pereda.fr/dev/pfor), an automatic parallelisation active library;
- [ROSA](https://phd.pereda.fr/dev/rosa), an algorithmic skeletons collection for [OR](https://en.wikipedia.org/wiki/Operations_research) algorithms;
- [TMP](https://phd.pereda.fr/dev/tmp), a template metaprogramming library used to implement these libraries.

## Usage

To produce the `Makefile`:
```bash
mkdir build
cd build
cmake ..
```

Compilation has been tested with `texlive-full` version 2020.20210202-3.

To build the project:
```bash
make
```

Be patient, it takes *some* time.

`make` accepts the following targets:
- `pdf_thesis_oneside`: to build the "computer" version, including dynamic figures (default);
- `pdf_thesis_twoside`: same as `pdf_thesis_oneside`, but laid out for double-page display;
- `pdf_thesis_print`: printing version (no dynamic figures, double-page and blank pages where required).

PDF files are generated in `build/pdf/`.

thesis version 2021-10-06 21:08:28 +02:00