You can not select more than 25 topics Topics must start with a letter or number, can include dashes ('-') and can be up to 35 characters long.
 
 
 
 
Alexis Pereda b688da651b
thesis version
2 years ago
celero thesis version 2 years ago
examples thesis version 2 years ago
inc thesis version 2 years ago
lib/celero thesis version 2 years ago
plot thesis version 2 years ago
scripts thesis version 2 years ago
src/alsk thesis version 2 years ago
tests thesis version 2 years ago
.gitignore thesis version 2 years ago
CMakeLists.txt thesis version 2 years ago
LICENSE thesis version 2 years ago
README.md thesis version 2 years ago

README.md

About

This is an active library, using C++ template metaprogramming, to do assisted parallelisation using algorithmic skeletons. This work has been done for my Ph.D. thesis.

Brief

It implements algorithmic skeletons to provide an abstraction for developers to write parallel software. It exposes:

  • bones: atomic algorithmic structure to use to build skeletons;
  • links: function signatures with placeholders to define data transfers between tasks;
  • execution policies: a solution to select how tasks will be distributed.

After describing an algorithm, one can generate a functionoid that implements it and run it. Using the links system in combination with the execution policy and overall structure knowledge, the library provide a way to guarantee repeatability from one execution to another and even for different number of allotted cores. This is particularly useful for stochastic programs because of the use of pseudo-random numbers.

Main features:

  • parallel implementation is hidden and separated from domain code;
  • execution is controlled by the chosen execution policy;
  • repeatability is automatically enabled.
  • ROSA, an algorithmic skeletons collection for OR algorithms;
  • TMP, template metaprogramming library used to implement this library.

See ROSA for more complete and meaningful examples of algorithmic skeletons and for the performances presented in the thesis.

Example

The code below defines Gen, a integral sequence generator, starting from the value given when constructed.

struct Gen {
	int value;
	int operator()() { return value++; }
};

The next code defines transform, a function that modifies its input depending on some pseudo-random number generated using the given generator.

int transform(int v, std::mt19937& rng) {
	std::uniform_int_distribution<int> d(-3, 3);
	return v + d(rng);
}

The library exposes a raw interface to build skeletons from its structure and links definitions. An example using the type Gen, the function transform and the standard function std::min<int> is shown below.

using Structure =
S<FarmSel,
	S<Serial, Gen, FN(transform)>,
	Fn<int const&(&)(int const&, int const&), std::min<int>>
>;

using Links =
L<FarmSel, int(),
	L<Serial, R<1>(),
		int(),
		int(R<0>, RNG)
	>,
	int(int, int)
>;

using Skeleton = BuildSkeletonT<Structure, Links>;

int main() {
	auto algo = implement<StaticPool, Skeleton>();
	algo.skeleton.n = 10;
	algo.skeleton.task.task<0>() = Gen{5};

	algo.executor.repeatability.upTo(8);
	algo.executor.cores = 8;

	auto r = algo();
	std::printf("%d\n", r);
}

The same can be achieved using the EDSL provided by the library as below:

int main() {
	auto gen = alsk::edsl::makeOperand<int(), Gen>();
	auto transform = alsk::edsl::makeOperand<int(alsk::arg::R<0>, alsk::arg::RNG), FN(::transform)>();
	auto selectMin = alsk::edsl::makeOperand<int(int, int), Fn<int const&(&)(int const&, int const&), std::min<int>>>();

	constexpr auto body = (10*alsk::edsl::link<alsk::arg::R<1>()>(gen, transform)) ->* selectMin;
	auto algo = alsk::edsl::implement<alsk::exec::StaticPool>(body);
	algo.skeleton.task.task<0>() = Gen{5};

	algo.executor.repeatability.upTo(8);
	algo.executor.cores = 8;

	auto r = algo();
	std::printf("%d\n", r);
}

Usually, the first interface is used to produce templates instead of types directly.

As the examples above show, the library can handle contextual arguments like random number generators to ensure the repeatability of the execution (in this case, up to 8 cores as specified).

Additionally, the library optimises the number of PRNG to guarantee repeatability. Without optimisation, the number of PRNG is linear because it must be equal to the number of tasks using a PRNG to run (red). When the possible numbers of cores are from 1 to 64, it can be reduced (blue). When the possible numbers of cores are powers of 2 up to 64, it becomes cyclic with a low maximum (orange).

Organisation

Main directories:

  • src/alsk: the library sources;
  • examples: some examples using the library.

Usage

To produce the Makefile and build the project:

mkdir build
cd build
cmake -DCMAKE_BUILD_TYPE=Release ..
make

To run examples:

./build/examples/${example_name}