Line Filters in Cilk

Here’s the translation of the Go line filter program to Cilk:

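#include <cctype>    // std::toupper
#include <cerrno>    // errno
#include <cstddef>   // std::size_t
#include <cstring>   // std::strerror
#include <iostream>
#include <string>
#include <cilk/cilk.h>

int main() {
    std::string line;

    // Read lines from stdin
    while (std::getline(std::cin, line)) {
        // Use Cilk to parallelize the uppercase conversion.
        // Each iteration writes only line[i], so iterations are independent.
        const std::size_t n = line.length();
        cilk_for (std::size_t i = 0; i < n; ++i) {
            // Cast to unsigned char: std::toupper is undefined for negative values
            line[i] = static_cast<char>(
                std::toupper(static_cast<unsigned char>(line[i])));
        }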

        // Write out the uppercased line
        std::cout << line << std::endl;
    }

    // Check for errors during input (errno is only meaningful if the failed read set it)
    if (std::cin.bad()) {
        std::cerr << "error: " << std::strerror(errno) << std::endl;
        return 1;
    }

    return 0;
}

A line filter is a common type of program that reads input on stdin, processes it, and then prints some derived result to stdout. grep and sed are common line filters.

The program above is an example line filter in Cilk that writes an uppercased version of all input text. You can use this pattern to write your own Cilk line filters.
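To make the reusable part of the pattern concrete, here is a minimal sketch in standard C++ (no Cilk required) that separates the read/write loop from the per-line work. The names run_line_filter and filter_text are hypothetical helpers introduced for illustration; they are not part of the original program.

```cpp
#include <functional>
#include <istream>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical alias for a per-line transformation.
using LineTransform = std::function<std::string(const std::string&)>;

// Hypothetical helper: applies `transform` to every line read from `in`
// and writes each result, followed by a newline, to `out`.
// Returns false if the input stream reported an unrecoverable error.
bool run_line_filter(std::istream& in, std::ostream& out,
                     const LineTransform& transform) {
    std::string line;
    while (std::getline(in, line)) {
        out << transform(line) << '\n';
    }
    return !in.bad();
}

// Hypothetical convenience wrapper: runs the filter over an in-memory
// string, which makes the pattern easy to try without real stdin/stdout.
std::string filter_text(const std::string& text, const LineTransform& transform) {
    std::istringstream in(text);
    std::ostringstream out;
    run_line_filter(in, out, transform);
    return out.str();
}
```

In a real filter, main would presumably call run_line_filter(std::cin, std::cout, ...) and pass in whatever per-line transformation is needed, such as the uppercase conversion.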

The program uses std::getline to read input line by line from std::cin. For each line, it uses a cilk_for loop to convert the characters to uppercase in parallel. This is safe because each iteration reads and writes only its own character, so no two iterations touch the same data. Loops with independent iterations like this one are the case cilk_for is designed for.
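The per-line loop can be isolated into a small function, sketched below. The helper name uppercase_in_place is hypothetical. The sketch also assumes that Cilk-enabled compilers predefine the __cilk macro; when it is absent, cilk_for is defined as a plain for, which gives the serial version of the same loop, so the block compiles with an ordinary C++ compiler as well.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Assumption: compilers with Cilk enabled predefine __cilk. Without it,
// fall back to the serial form, where cilk_for is an ordinary for loop.
#if defined(__cilk)
#include <cilk/cilk.h>
#else
#define cilk_for for
#endif

// Hypothetical helper: uppercases `line` in place.
void uppercase_in_place(std::string& line) {
    const std::size_t n = line.size();
    cilk_for (std::size_t i = 0; i < n; ++i) {
        // Each iteration touches only line[i], so iterations may run
        // in any order, or concurrently, without a data race.
        line[i] = static_cast<char>(
            std::toupper(static_cast<unsigned char>(line[i])));
    }
}
```

Because the serial and parallel forms share the same loop body, the function gives the same result either way; only the scheduling differs.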

The std::toupper function converts each character to uppercase according to the current C locale; characters that have no uppercase form are returned unchanged. After processing, the uppercased line is printed to std::cout.
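For comparison, a serial version of the same conversion can be written with std::transform. This is a sketch, and to_upper_copy is a hypothetical name. The lambda takes its argument as unsigned char because passing a negative char value to std::toupper is undefined behavior.

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Hypothetical helper: returns an uppercased copy of `line`.
std::string to_upper_copy(std::string line) {
    std::transform(line.begin(), line.end(), line.begin(),
                   // Taking the parameter as unsigned char keeps the value
                   // passed to std::toupper non-negative.
                   [](unsigned char c) {
                       return static_cast<char>(std::toupper(c));
                   });
    return line;
}
```

Digits, punctuation, and characters that are already uppercase pass through unchanged.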

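After the main loop, the program checks std::cin.bad(), which reports an unrecoverable I/O error on the stream. Reaching the end of input normally sets only eofbit and failbit, so it is not treated as an error. If an error is detected, the program prints a message to std::cerr and exits with a non-zero status.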
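The difference between a clean end of input and a stream error can be seen with in-memory streams. The following is a sketch with hypothetical helper names; it forces badbit with setstate to simulate an I/O failure, since a real read error is hard to trigger on demand.

```cpp
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper: reads every line from `in`, then reports how the
// loop ended. Returns 0 after a clean end of input, 1 if badbit is set.
int drain_and_check(std::istream& in) {
    std::string line;
    while (std::getline(in, line)) {
        // A real filter would process `line` here.
    }
    return in.bad() ? 1 : 0;
}

// Hypothetical demo: a healthy stream ends with eofbit, not badbit.
int check_clean_input() {
    std::istringstream in("hello\nfilter\n");
    return drain_and_check(in);
}

// Hypothetical demo: badbit is set by hand to simulate an I/O error.
int check_failed_input() {
    std::istringstream in("hello\n");
    in.setstate(std::ios_base::badbit);
    return drain_and_check(in);
}
```

The return values mirror the exit statuses used by the program: 0 for success and 1 for an input error.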

To try out our line filter, first make a file with a few lowercase lines.

$ echo 'hello'   > /tmp/lines
$ echo 'filter' >> /tmp/lines

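Then compile and use the line filter to get uppercase lines. Cilk Plus support was removed in GCC 8, so the -fcilkplus flag shown here requires an older GCC; with OpenCilk, the equivalent is clang++ -fopencilk.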

$ g++ -fcilkplus line_filter.cpp -o line_filter
$ cat /tmp/lines | ./line_filter
HELLO
FILTER

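This example demonstrates how Cilk can be used to create a parallel line filter. Because uppercasing a single character is very cheap, any speedup is likely to be limited to very long lines; for short lines, the scheduling overhead of cilk_for may outweigh the parallel work.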