Rate Limiting in Elixir

Rate limiting is an important mechanism for controlling resource utilization and maintaining quality of service. Elixir elegantly supports rate limiting with processes and messages.

defmodule RateLimiter do
  use GenServer

  def start_link(opts \\ []) do
    GenServer.start_link(__MODULE__, :ok, opts)
  end

  def init(:ok) do
    # Kick off a periodic tick; one queued request is processed
    # per tick, i.e. one request every 200 milliseconds.
    schedule_tick()
    {:ok, %{requests: []}}
  end

  def handle_call(:request, _from, state) do
    # Record the request's arrival time and queue it for processing.
    {:reply, :ok, %{state | requests: state.requests ++ [Time.utc_now()]}}
  end

  def handle_info(:tick, state) do
    schedule_tick()

    case state.requests do
      [request | rest] ->
        IO.puts("request #{request} served at #{Time.utc_now()}")
        {:noreply, %{state | requests: rest}}

      [] ->
        {:noreply, state}
    end
  end

  defp schedule_tick do
    Process.send_after(self(), :tick, 200)
  end
end

# First we'll look at basic rate limiting. Suppose
# we want to limit our handling of incoming requests.
{:ok, limiter} = RateLimiter.start_link()

# Simulate 5 incoming requests
for _ <- 1..5 do
  GenServer.call(limiter, :request)
end
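
# We can peek at the limiter's queue with OTP's :sys debugging
# helper, :sys.get_state/1 (handy in experiments, not in
# production code).
IO.inspect(:sys.get_state(limiter).requests, label: "queued")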

# This ensures that we process one request every 200 milliseconds.
# In a standalone script, sleep here so the queue can drain before
# the script exits.
Process.sleep(1100)

# Now let's implement a bursty rate limiter
defmodule BurstyRateLimiter do
  use GenServer

  def start_link(opts \\ []) do
    GenServer.start_link(__MODULE__, :ok, opts)
  end

  def init(:ok) do
    # Start with a full bucket of 3 tokens to allow bursting, and
    # schedule the first replacement token.
    schedule_token()
    {:ok, %{pending: [], tokens: 3}}
  end

  def handle_call(:request, _from, %{tokens: tokens} = state) when tokens > 0 do
    # A token is available: spend it and serve the request immediately.
    IO.puts("request served immediately at #{Time.utc_now()}")
    {:reply, :ok, %{state | tokens: tokens - 1}}
  end

  def handle_call(:request, _from, state) do
    # The bucket is empty: queue the request until a token arrives.
    {:reply, :queued, %{state | pending: state.pending ++ [Time.utc_now()]}}
  end

  def handle_info(:add_token, state) do
    schedule_token()

    case state.pending do
      [request | rest] ->
        # Spend the fresh token on the oldest queued request.
        IO.puts("request from #{request} served at #{Time.utc_now()}")
        {:noreply, %{state | pending: rest}}

      [] ->
        # Nothing is waiting: bank the token, capped at the burst size.
        {:noreply, %{state | tokens: min(state.tokens + 1, 3)}}
    end
  end

  defp schedule_token do
    Process.send_after(self(), :add_token, 200)
  end
end

# Now simulate 5 more incoming requests. The first
# 3 of these will benefit from the burst capability
# of BurstyRateLimiter.
{:ok, bursty_limiter} = BurstyRateLimiter.start_link()

for i <- 1..5 do
  reply = GenServer.call(bursty_limiter, :request)
  IO.puts("call #{i} -> #{inspect(reply)}")
end

# Give the two queued requests time to be served (one per 200ms).
Process.sleep(600)
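
# Once the queue is empty, tokens replenish at one per 200ms, capped
# at the burst size of 3, so a later request is served immediately
# again. This extra call is a small experiment to show the refill.
Process.sleep(600)
:ok = GenServer.call(bursty_limiter, :request)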

Running our program, we see the first batch of requests handled once every ~200 milliseconds, as desired.

For the second batch of requests, we serve the first 3 immediately because of the burstable rate limiting, then serve the remaining 2 with ~200ms delays each.

Both implementations use a GenServer to hold the limiter's state and Process.send_after/3 to drive its timing. RateLimiter is a simple queue drained at a fixed pace of one request per 200ms. BurstyRateLimiter is a token bucket: up to 3 requests are served immediately, and once the bucket is empty, further requests queue and are served as tokens are replenished, one every 200ms. Both therefore sustain the same long-run rate of 5 requests per second; the bursty variant simply absorbs short spikes of up to 3 requests without delay.

In both cases, we’re using Elixir’s powerful concurrency primitives to manage the timing and state of our rate limiters. This approach allows us to handle rate limiting in a clean, functional way that’s idiomatic to Elixir.
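
In a real application you would typically run a limiter under a supervision tree and register it under a name, so callers don't have to pass a pid around. Below is a minimal sketch of that wiring; the name MyApp.Limiter is purely illustrative, and the opts our start_link/1 accepts are forwarded as-is to GenServer.start_link/3.

children = [
  {BurstyRateLimiter, [name: MyApp.Limiter]}
]

{:ok, _sup} = Supervisor.start_link(children, strategy: :one_for_one)

# Callers anywhere in the application can now reach the limiter by name.
GenServer.call(MyApp.Limiter, :request)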