Regular Expressions in Cilk
#include <stdio.h>
#include <cilk/cilk.h>
#include <pcre.h>
#include <string.h>

// Prints the first match of re in subject, followed by each captured group.
void print_matches(pcre *re, const char *subject) {
    int ovector[30];
    int rc = pcre_exec(re, NULL, subject, (int)strlen(subject), 0, 0, ovector, 30);
    if (rc < 0) {
        switch (rc) {
            case PCRE_ERROR_NOMATCH: printf("No match\n"); break;
            default: printf("Matching error %d\n", rc); break;
        }
        return;
    }
    // ovector[2*i] and ovector[2*i+1] hold the start and end offsets of
    // group i; group 0 is the whole match.
    for (int i = 0; i < rc; i++) {
        const char *substring_start = subject + ovector[2*i];
        int substring_length = ovector[2*i+1] - ovector[2*i];
        printf("%.*s\n", substring_length, substring_start);
    }
}

int main() {
    pcre *re;
    const char *error;
    int erroffset;

    // Compile the pattern once; a NULL result means the pattern is invalid.
    re = pcre_compile("p([a-z]+)ch", 0, &error, &erroffset, NULL);
    if (re == NULL) {
        printf("PCRE compilation failed at offset %d: %s\n", erroffset, error);
        return 1;
    }

    // This tests whether a pattern matches a string. Passing a NULL
    // ovector skips offset capture; a non-negative result means a match.
    int result = pcre_exec(re, NULL, "peach", 5, 0, 0, NULL, 0);
    printf("%s\n", result >= 0 ? "true" : "false");

    // The compiled pattern can be reused for further tests
    // without recompiling it.
    result = pcre_exec(re, NULL, "peach", 5, 0, 0, NULL, 0);
    printf("%s\n", result >= 0 ? "true" : "false");

    // This finds the first match for the regexp.
    print_matches(re, "peach punch");

    // The output includes both the whole-pattern match (group 0)
    // and the submatch captured by ([a-z]+).
    print_matches(re, "peach punch");

    // Each pcre_exec call reports a single match, so print_matches
    // shows only the first one even when the input contains several.
    // Finding the rest means calling pcre_exec again with the start
    // offset set to the end of the previous match (ovector[1]).
    print_matches(re, "peach punch pinch");

    // PCRE has no argument that limits the number of matches;
    // a caller-side loop would count matches and stop early.
    // This call again prints only the first match.
    print_matches(re, "peach punch pinch");

    // Replace the matched substring with another value. PCRE 1.x
    // (pcre.h) has no substitute function, so the result is built
    // from the offsets of the match.
    const char *subject = "a peach";
    int ovector[30];
    char result_str[100];
    if (pcre_exec(re, NULL, subject, (int)strlen(subject), 0, 0, ovector, 30) >= 0) {
        snprintf(result_str, sizeof(result_str), "%.*s<fruit>%s",
                 ovector[0], subject, subject + ovector[1]);
        printf("%s\n", result_str);
    }

    pcre_free(re);
    return 0;
}
This Cilk program demonstrates the use of regular expressions using the PCRE (Perl Compatible Regular Expressions) library. Here’s a breakdown of the example:
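We include the necessary headers and define a helper function, print_matches, which prints the first match of a regular expression followed by its captured groups.
In the main function, we compile the regular expression pattern “p([a-z]+)ch” using pcre_compile.
We test whether the pattern matches a string using pcre_exec.
We print the first match in a string together with its submatch.
We call print_matches on a string that contains several matches. Because each pcre_exec call reports one match, only the first is printed; finding the rest means calling pcre_exec again from the end offset of the previous match.
We replace the substring that matches the pattern with another string. PCRE 1.x (pcre.h) does not provide a pcre_substitute function, so the substitution has to be built from the offsets that pcre_exec returns; PCRE2 offers pcre2_substitute for this purpose.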
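The replacement step can be sketched without any library support, because pcre_exec reports a match as a pair of byte offsets into the subject. The helper below, replace_span, is a hypothetical name and not part of PCRE; it only assumes that the caller already has the start and end offsets of the match.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not part of PCRE): copies subject into out with the
   bytes in [start, end) replaced by replacement.
   Returns the number of characters written (excluding the terminator),
   or -1 if the offsets are invalid or out is too small. */
int replace_span(const char *subject, int start, int end,
                 const char *replacement, char *out, size_t out_size) {
    size_t subject_len = strlen(subject);
    if (start < 0 || end < start || (size_t)end > subject_len) {
        return -1;
    }
    /* Prefix before the match, then the replacement, then the rest. */
    int written = snprintf(out, out_size, "%.*s%s%s",
                           start, subject, replacement, subject + end);
    if (written < 0 || (size_t)written >= out_size) {
        return -1; /* output was truncated */
    }
    return written;
}
```

With the offsets a match of “peach” in “a peach” would produce (2 and 7), replace_span("a peach", 2, 7, "<fruit>", buf, sizeof buf) writes a <fruit> into buf.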
Note that Cilk doesn’t have a built-in regular expression library like Go does. Instead, we’re using the PCRE library, which is a powerful and widely-used regular expression library in C and C-like languages.
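Finding every match, optionally up to a limit, comes down to a loop that restarts the search where the previous match ended. As a rough sketch of that loop, the code below uses POSIX <regex.h> as a stand-in for PCRE, only because it ships with the C library; print_all_matches is a hypothetical helper name. With PCRE the same shape should apply: call pcre_exec repeatedly, passing the previous match's end offset (ovector[1]) as the start offset.

```c
#include <regex.h>
#include <stdio.h>

/* Hypothetical helper: prints up to `limit` non-overlapping matches of
   `pattern` in `subject`, one per line. A negative limit means no limit.
   Returns the number of matches printed, or -1 if the pattern is invalid. */
int print_all_matches(const char *pattern, const char *subject, int limit) {
    regex_t re;
    if (regcomp(&re, pattern, REG_EXTENDED) != 0) {
        return -1;
    }

    int count = 0;
    const char *cursor = subject;
    regmatch_t m[1];
    while (limit < 0 || count < limit) {
        /* After the first search, the cursor is no longer at line start. */
        int flags = (cursor == subject) ? 0 : REG_NOTBOL;
        if (regexec(&re, cursor, 1, m, flags) != 0) {
            break; /* no further match */
        }
        printf("%.*s\n", (int)(m[0].rm_eo - m[0].rm_so), cursor + m[0].rm_so);
        count++;

        if (m[0].rm_eo > m[0].rm_so) {
            cursor += m[0].rm_eo;      /* resume after the match */
        } else if (cursor[m[0].rm_eo] != '\0') {
            cursor += m[0].rm_eo + 1;  /* empty match: step forward */
        } else {
            break;                     /* empty match at end of input */
        }
    }

    regfree(&re);
    return count;
}
```

Calling print_all_matches("p([a-z]+)ch", "peach punch pinch", -1) visits all three words, while a limit of 2 stops after the second.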
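To compile and run this program, you need to link against the PCRE library. The compiler driver depends on your Cilk toolchain: the command below assumes a cilk++ driver, while OpenCilk, for example, is invoked as clang -fopencilk. Here’s an example compilation command: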
$ cilk++ -o regex_example regex_example.cilk -lpcre
$ ./regex_example
This will compile the Cilk program and link it with the PCRE library, then run the resulting executable.
Remember that Cilk is an extension of C/C++ for parallel computing, so most C/C++ libraries and techniques can be used in Cilk programs. The parallel features of Cilk are not used in this example, because matching one pattern against one string is a sequential operation.
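Cilk becomes useful when many independent strings have to be checked against the same pattern. The following is a minimal sketch under several assumptions: that the compiler defines the __cilk macro when Cilk support is enabled (otherwise the loop falls back to a plain for), that POSIX <regex.h> stands in for PCRE, and that concurrent regexec calls on one compiled pattern are safe in your C library, which is worth verifying before relying on it. The name match_each is hypothetical.

```c
#include <regex.h>
#include <stddef.h>

/* Assumption: Cilk-enabled compilers define __cilk. Without it, cilk_for
   is mapped to an ordinary serial for loop so the code still builds. */
#if defined(__cilk)
#include <cilk/cilk.h>
#else
#define cilk_for for
#endif

/* Hypothetical helper: sets hits[i] to 1 if subjects[i] matches pattern,
   otherwise 0. Returns 0 on success, -1 if the pattern is invalid. */
int match_each(const char *pattern, const char *subjects[], int n, int hits[]) {
    regex_t re;
    /* REG_NOSUB: only a yes/no answer is needed, not match offsets. */
    if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0) {
        return -1;
    }

    /* Each iteration reads the shared pattern and writes only its own
       slot in hits, so iterations do not depend on one another. */
    cilk_for (int i = 0; i < n; ++i) {
        hits[i] = (regexec(&re, subjects[i], 0, NULL, 0) == 0);
    }

    regfree(&re);
    return 0;
}
```

The loop body is deliberately free of shared writes; any per-match output would be collected into per-index slots like hits and printed after the loop.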