# Message Passing Interface - Prime Number Calculation

In this tutorial we implement the calculation of prime numbers by means of the Message Passing Interface (MPI). We assume, there are N processors. The program shall find all prime numbers in `[2, range − 1]`.

## Implementation

We start the implementation with the imports.

``````#include <iostream>
#include <cmath>
#include <vector>
#include <stdint.h>
#include "mpi.h" // Message Passing Interface header
using std::cout;
using std::endl;
``````

Predicate, checking whether a number is prime.

``````bool isprime(int number) {
for (size_t i = 2; i <= sqrt(number); i++) {
if (number % i == 0) {
return false;
}
}
return true;
}
``````

Vector holding all the found primes.

``````std::vector<int32_t> primes;
``````

### Main method implementing the algorithm.

Method head and initialization of the prime calculation.

``````int main(int argc, char* argv[]) {
// specify the number up to which primes should be checked
size_t range = 1000;
if (argc == 2) {
range = atoi(argv[1]);
}
``````

MPI initialization.

``````	int rank;
int size;
MPI_Init(0, 0);
MPI_Comm_rank(MPI_COMM_WORLD, &rank);
MPI_Comm_size(MPI_COMM_WORLD, &size);
if (rank == 0) {
cout << "range: " << range << endl;
}
``````

Set the start and end number for each processor to check. Uniformly assign a range of numbers to each processor based on the current rank: rank 0 shall get responsible for `[2, (range/N) - 1]`, rank 1 for `[range/N, (2 * range/N) - 1]` and so on.

``````	int my_start = rank * (range / size);
int my_end = (rank + 1) * (range / size) - 1;
``````

Set the start of process zero to one since 0 and 1 are no interesting numbers.

``````	if (rank == 0) {
my_start = 2;
}
``````

Just to have clean output.

``````	MPI_Barrier(MPI_COMM_WORLD);
``````

check each number if its a prime

``````	for (size_t i = my_start; i <= my_end; i++) {
if (isprime(i)) {
primes.push_back(i);
}
}
``````

Now every processor has a local array of primes, lets output the count.

``````	cout << "Processor " << rank << "/" << size << " found " << primes.size() << " prime numbers." << endl;
int local_count, global_count;
local_count = primes.size();
``````

Reduce the number of primes and allocate

``````	MPI_Reduce(&local_count, &global_count, 1, MPI_INT, MPI_SUM, 0, MPI_COMM_WORLD);
std::vector<int32_t> result;
if (rank == 0) {
cout << "all processors found " << global_count << " primes." << endl;
result.resize(global_count);
// just for convenience, that we see what has been written
for (auto& r : result) {
r = -1;
}
}
``````

Gather the individual numbers.

``````	std::vector<int> nums(size);
MPI_Gather(&local_count, 1, MPI_INT, &nums[0], 1, MPI_INT, 0, MPI_COMM_WORLD);
if (rank == 0) {
for (size_t i = 0; i < size; i++) {
cout << i << ":" << nums[i] << "\n";
}
}
``````

Calculate the storage locations for the resulting data.

``````	std::vector<int> disps(size);
for (int i = 0; i < size; i++) {
disps[i] = (i > 0) ? (disps[i - 1] + nums[i - 1]) : 0;
}
``````

Gather all prime numbers (different count from each processor!).

``````	MPI_Gatherv(&primes[0], local_count, MPI_INT, &result[0], &nums[0], &disps[0], MPI_INT, 0, MPI_COMM_WORLD);
``````

Output the found prime numbers.

``````	if (rank == 0) {
cout << "received: " << result.size() << endl;
if (range < 10000) {
for (auto& r : result) {
cout << r << " ";
}
cout << endl;
}
else {
cout << "Omit output as range is large." << endl;
}
}
``````

Finalize the MPI environment.

``````	MPI_Finalize();

return 0;
}
``````

### Compile and test

Assuming previous code is contained in file with name `mpi_prime.cpp`, you can compile it as follows.

``````mpicxx -Ofast -std=c++11 -o mpi_prime mpi_prime.cpp
``````

Please note: `mpicxx` is a specific compiler which arrives with MPICH an implementation of MPI paradigm.

# Online Resources

A large MPI tutorial.