alsk/README.md

146 lines
5.1 KiB
Markdown
Raw Permalink Normal View History

2021-05-10 16:14:13 +00:00
# About
This is an active library, using C++ template metaprogramming, to do assisted parallelisation using algorithmic skeletons.
This work has been done for my Ph.D. thesis.
## Brief
It implements [algorithmic skeletons](https://en.wikipedia.org/wiki/Algorithmic_skeleton) to provide an abstraction
for developers to write parallel software.
It exposes:
- bones: atomic algorithmic structure to use to build skeletons;
- links: function signatures with placeholders to define data transfers between tasks;
- execution policies: a solution to select how tasks will be distributed.
After describing an algorithm, one can generate a functionoid that implements it and run it.
Using the links system in combination with the execution policy and overall structure knowledge,
the library provide a way to guarantee repeatability from one execution to another and even
for different number of allotted cores.
This is particularly useful for stochastic programs because of the use of pseudo-random numbers.
Main features:
- parallel implementation is hidden and separated from domain code;
- execution is controlled by the chosen execution policy;
- repeatability is automatically enabled.
## Related projects
- [ROSA](https://phd.pereda.fr/dev/rosa), an algorithmic skeletons collection for [OR](https://en.wikipedia.org/wiki/Operations_research) algorithms;
- [TMP](https://phd.pereda.fr/dev/tmp), template metaprogramming library used to implement this library.
See [ROSA](https://phd.pereda.fr/dev/rosa) for more complete and meaningful examples of algorithmic skeletons and
for the performances presented in the thesis.
## Example
The code below defines `Gen`, a integral sequence generator, starting from the value given
when constructed.
```cpp
struct Gen {
int value;
int operator()() { return value++; }
};
```
The next code defines `transform`, a function that modifies its input depending on some
pseudo-random number generated using the given generator.
```cpp
int transform(int v, std::mt19937& rng) {
std::uniform_int_distribution<int> d(-3, 3);
return v + d(rng);
}
```
The library exposes a raw interface to build skeletons from its structure and links definitions.
An example using the type `Gen`, the function `transform` and the standard function `std::min<int>` is shown below.
```cpp
using Structure =
S<FarmSel,
S<Serial, Gen, FN(transform)>,
Fn<int const&(&)(int const&, int const&), std::min<int>>
>;
using Links =
L<FarmSel, int(),
L<Serial, R<1>(),
int(),
int(R<0>, RNG)
>,
int(int, int)
>;
using Skeleton = BuildSkeletonT<Structure, Links>;
int main() {
auto algo = implement<StaticPool, Skeleton>();
algo.skeleton.n = 10;
algo.skeleton.task.task<0>() = Gen{5};
algo.executor.repeatability.upTo(8);
algo.executor.cores = 8;
auto r = algo();
std::printf("%d\n", r);
}
```
The same can be achieved using the EDSL provided by the library as below:
```cpp
int main() {
auto gen = alsk::edsl::makeOperand<int(), Gen>();
auto transform = alsk::edsl::makeOperand<int(alsk::arg::R<0>, alsk::arg::RNG), FN(::transform)>();
auto selectMin = alsk::edsl::makeOperand<int(int, int), Fn<int const&(&)(int const&, int const&), std::min<int>>>();
constexpr auto body = (10*alsk::edsl::link<alsk::arg::R<1>()>(gen, transform)) ->* selectMin;
auto algo = alsk::edsl::implement<alsk::exec::StaticPool>(body);
algo.skeleton.task.task<0>() = Gen{5};
algo.executor.repeatability.upTo(8);
algo.executor.cores = 8;
auto r = algo();
std::printf("%d\n", r);
}
```
Usually, the first interface is used to produce templates instead of types directly.
As the examples above show, the library can handle contextual arguments like random number generators to ensure
the repeatability of the execution (in this case, up to 8 cores as specified).
Additionally, the library optimises the number of PRNG to guarantee repeatability.
Without optimisation, the number of PRNG is linear because it must be equal to the number of tasks using a PRNG to run (red).
When the possible numbers of cores are from 1 to 64, it can be reduced (blue).
When the possible numbers of cores are powers of 2 up to 64, it becomes cyclic with a low maximum (orange).
<div align="center"><img src="https://phd.pereda.fr/assets/alsk/optimised_repeatability.png" width="500"></div>
## Related publications
- "Repeatability with Random Numbers Using Algorithmic Skeletons", ESM 2020 (https://hal.archives-ouvertes.fr/hal-02980472);
- "Modeling Algorithmic Skeletons for Automatic Parallelization Using Template Metaprogramming", HPCS 2019 (IEEE) [10.1109/HPCS48598.2019.9188128](https://doi.org/10.1109/HPCS48598.2019.9188128);
- "Processing Algorithmic Skeletons at Compile-Time", ROADEF 2020 (https://hal.archives-ouvertes.fr/hal-02573660);
- "Algorithmic Skeletons Using Template Metaprogramming", ICAST 2019;
- "Parallel Algorithmic Skeletons for Metaheuristics", ROADEF 2019 (https://hal.archives-ouvertes.fr/hal-02059533).
## Organisation
Main directories:
- `src/alsk`: the library sources;
- `examples`: some examples using the library.
## Usage
To produce the `Makefile` and build the project:
```bash
mkdir build
cd build
cmake -DCMAKE_BUILD_TYPE=Release ..
make
```
To run examples:
```bash
./build/examples/${example_name}
```