C++ Fork-Join Pattern: Real-World Engineering Guide
C++ Fork-Join Pattern: Real-World Engineering Guide
Problem Solved
Divide a task into smaller tasks recursively, process in parallel, then join results.
How It Works
- Fork: Split task into sub-tasks
- Join: Merge results at the end
- Recursive divide-and-conquer approach
STL Usage
#include <future>
#include <vector>
#include <algorithm>
#include <thread>
#include <numeric>
using namespace std;
template<typename InputIt, typename Func>
auto forkJoin(InputIt first, InputIt last, Func func, size_t threshold)
-> decltype(func(first, last)) {
size_t size = distance(first, last);
// Base case: process directly
if (size <= threshold) {
return func(first, last);
}
// Fork: split into two halves
auto mid = first + size / 2;
auto left_future = async(launch::async, [=]() {
return forkJoin(first, mid, func, threshold);
});
auto right_result = forkJoin(mid, last, func, threshold);
auto left_result = left_future.get();
// Join: combine results (simplified - actual join depends on operation)
return left_result + right_result; // Example: sum
}
// Parallel for_each using fork-join
template<typename InputIt, typename Func>
void parallelForEach(InputIt first, InputIt last, Func func, size_t threshold = 1000) {
size_t size = distance(first, last);
if (size <= threshold) {
for_each(first, last, func);
return;
}
auto mid = first + size / 2;
auto left_future = async(launch::async, [=]() {
parallelForEach(first, mid, func, threshold);
});
parallelForEach(mid, last, func, threshold);
left_future.wait();
}
Example
#include <iostream>
using namespace std;
int parallelSum(const vector<int>& data, size_t threshold = 1000) {
return forkJoin(
data.begin(), data.end(),
[](auto first, auto last) {
return accumulate(first, last, 0);
},
threshold
);
}
void forkJoinExample() {
vector<int> data(10000);
iota(data.begin(), data.end(), 1);
int sum = parallelSum(data);
cout << "Sum: " << sum << endl;
}
Use Cases
- Parallel algorithms: Divide and conquer
- Sorting: Parallel merge sort, quick sort
- Tree processing: Parallel tree traversal
- Scientific computing: Parallel numerical methods
Key Takeaways
- Natural for divide-and-conquer algorithms
- Recursive parallelization
- Built into many frameworks (Java ForkJoinPool, C++ parallel algorithms)
- Efficient for CPU-bound tasks
Things to Be Careful About
- Threshold: Too small causes overhead, too large limits parallelism
- Load balancing: Ensure even work distribution
- Memory: Recursive calls use stack space
- Overhead: Task creation has cost
Summary
Fork-Join enables efficient recursive parallelization, ideal for divide-and-conquer algorithms and parallel processing.