Rate Limiting in Cilk

#include <cilk/cilk.h>
#include <iostream>
#include <chrono>
#include <thread>
#include <vector>
#include <queue>

using namespace std;
using namespace std::chrono;

// Simulate handling a request: wait for the given delay, then
// report the request id and a millisecond timestamp.
void simulateRequest(int req, milliseconds delay) {
    this_thread::sleep_for(delay);
    auto ms = duration_cast<milliseconds>(system_clock::now().time_since_epoch()).count();
    cout << "request " << req << " " << ms << endl;
}

int main() {
    // First we'll look at basic rate limiting. Suppose
    // we want to limit our handling of incoming requests.
    // We'll serve these requests off a queue.
    queue<int> requests;
    for (int i = 1; i <= 5; i++) {
        requests.push(i);
    }

    // This limiter interval is the regulator in our rate limiting
    // scheme: we dispatch at most one request per 200 milliseconds.
    milliseconds limiter(200);

    // By sleeping for the limiter interval before dispatching each
    // request, we limit ourselves to 1 request every 200 milliseconds.
    while (!requests.empty()) {
        int req = requests.front();
        requests.pop();
        this_thread::sleep_for(limiter);                  // pace the dispatch loop
        cilk_spawn simulateRequest(req, milliseconds(0)); // then handle in parallel
    }
    cilk_sync;

    cout << endl;

    // We may want to allow short bursts of requests in
    // our rate limiting scheme while preserving the
    // overall rate limit. We can accomplish this by
    // using a token bucket algorithm.
    vector<int> burstyRequests;
    for (int i = 1; i <= 5; i++) {
        burstyRequests.push_back(i);
    }

    int tokens = 3;   // Allow bursts of up to 3 events
    int waiting = 0;  // Requests queued behind an empty bucket
    auto lastTime = system_clock::now();

    for (int req : burstyRequests) {
        auto now = system_clock::now();
        auto timePassed = duration_cast<milliseconds>(now - lastTime);
        tokens += timePassed.count() / 200; // Refill one token per 200 ms elapsed
        if (tokens > 3) tokens = 3;         // Cap the bucket at 3 tokens

        if (tokens > 0) {
            // A token is available: handle the request immediately.
            tokens--;
            cilk_spawn simulateRequest(req, milliseconds(0));
        } else {
            // Bucket is empty: schedule this request one 200 ms slot
            // after the previous delayed request.
            waiting++;
            cilk_spawn simulateRequest(req, milliseconds(200) * waiting);
        }

        lastTime = now;
    }
    cilk_sync;
}

This Cilk code demonstrates rate limiting, an important mechanism for controlling resource utilization and maintaining quality of service. Cilk has no built-in rate limiter; the example builds one from Cilk's spawn/sync constructs together with the standard C++ chrono and thread facilities.

In this example, we showcase two approaches to rate limiting:

  1. Basic rate limiting: We process requests at a fixed rate of one every 200 milliseconds.

  2. Bursty rate limiting: We allow short bursts of requests while maintaining the overall rate limit using a simple token bucket algorithm.

For the basic rate limiting, we store the requests in a queue and sleep for the limiter interval before dispatching each one, so requests are served at most once every 200 milliseconds.
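If dispatching itself takes noticeable time, sleeping for a fixed interval can let the schedule drift. Below is a minimal sketch of an alternative, assuming a hypothetical helper named pacedServe (not part of the program above), that pins each dispatch to a steady_clock deadline instead:

#include <cilk/cilk.h>
#include <chrono>
#include <thread>
#include <vector>

// Hypothetical helper: dispatch each request on a fixed schedule,
// one every `interval`, using sleep_until on a steady_clock deadline
// so the pacing does not drift when spawning takes time.
void pacedServe(const std::vector<int>& reqs,
                std::chrono::milliseconds interval,
                void (*handler)(int)) {
    auto next = std::chrono::steady_clock::now() + interval;
    for (int req : reqs) {
        std::this_thread::sleep_until(next); // wait until this slot opens
        cilk_spawn handler(req);             // handle the request in parallel
        next += interval;                    // schedule the next slot
    }
    cilk_sync;                               // wait for all handlers
}

The handler parameter here just stands in for whatever per-request work you want to spawn.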

For the bursty rate limiting, we store the requests in a vector and use a simple token bucket algorithm to control the rate. Up to 3 requests can be handled immediately (the burst); once the bucket is empty, each additional request is scheduled one 200-millisecond slot after the previous one.
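The token-bucket bookkeeping can also be factored into its own type. The sketch below is illustrative (the TokenBucket name and its members are assumptions, not part of the program above): it refills one token per interval, caps the bucket at its capacity, and reports whether a token was available.

#include <algorithm>
#include <chrono>

// Illustrative token bucket: refill one token per `refillEvery`,
// never exceed `capacity`, consume one token per successful acquire.
struct TokenBucket {
    int tokens;
    int capacity;
    std::chrono::milliseconds refillEvery;
    std::chrono::steady_clock::time_point last;

    TokenBucket(int cap, std::chrono::milliseconds interval)
        : tokens(cap), capacity(cap), refillEvery(interval),
          last(std::chrono::steady_clock::now()) {}

    // Returns true if a token was available; refills based on elapsed time.
    bool tryAcquire() {
        auto now = std::chrono::steady_clock::now();
        auto elapsed = std::chrono::duration_cast<std::chrono::milliseconds>(now - last);
        int refill = static_cast<int>(elapsed / refillEvery);
        if (refill > 0) {
            tokens = std::min(capacity, tokens + refill);
            last += refill * refillEvery; // keep leftover partial progress
        }
        if (tokens > 0) { tokens--; return true; }
        return false;
    }
};

A caller would check tryAcquire() before each request and delay or queue the request whenever it returns false.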

The cilk_spawn keyword creates a parallel task for each request simulation, allowing requests to be handled concurrently. Each cilk_sync waits for every task spawned so far in the enclosing function, so all requests finish before execution continues.
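For readers new to Cilk, here is a stripped-down, self-contained illustration of that spawn/sync pattern on its own, independent of rate limiting:

#include <cilk/cilk.h>
#include <cstdio>

void work(int id) {
    std::printf("task %d done\n", id);
}

int main() {
    for (int i = 0; i < 3; i++) {
        cilk_spawn work(i); // each call may run in parallel with the loop
    }
    cilk_sync;              // wait here for every call spawned above
    std::printf("all tasks finished\n");
}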

When you run this program, you'll see the first batch of requests handled roughly once every 200 milliseconds, as desired. For the second batch, the first three requests are handled immediately (the burst), and the remaining two follow at roughly 200-millisecond intervals.

This example demonstrates how Cilk can be used to implement rate limiting in a concurrent environment, providing control over resource usage while allowing for bursts of activity when needed.