Rate Limiting in Python
Rate limiting is an important mechanism for controlling resource utilization and maintaining quality of service. Python supports rate limiting using various libraries and techniques.
First, we’ll look at basic rate limiting. Suppose we want to limit our handling of incoming requests. We’ll use a queue to simulate these requests.
```python
import time
from queue import Queue


def main():
    # Simulate incoming requests.
    requests = Queue(maxsize=5)
    for i in range(1, 6):
        requests.put(i)

    # This limiter generator yields at most once every 200 milliseconds.
    def limiter():
        while True:
            yield
            time.sleep(0.2)

    limit = limiter()

    # By calling next() on the limiter before serving each request,
    # we limit ourselves to 1 request every 200 milliseconds.
    while not requests.empty():
        next(limit)
        req = requests.get()
        print(f"request {req} {time.time()}")

    # We may want to allow short bursts of requests in our rate limiting
    # scheme while preserving the overall rate limit. We can accomplish
    # this by using a token bucket algorithm.
    class TokenBucket:
        def __init__(self, tokens, fill_rate):
            self.capacity = tokens
            self.tokens = tokens
            self.fill_rate = fill_rate  # tokens added per second
            self.timestamp = time.time()

        def get_token(self):
            now = time.time()
            if self.tokens < self.capacity:
                # Refill based on elapsed time, but never beyond capacity.
                self.tokens = min(
                    self.capacity,
                    self.tokens + self.fill_rate * (now - self.timestamp),
                )
            self.timestamp = now
            if self.tokens >= 1:
                self.tokens -= 1
                return True
            return False

    # This bursty_limiter allows bursts of up to 3 events and refills at
    # 5 tokens per second, i.e. one token every 200 milliseconds.
    bursty_limiter = TokenBucket(3, 5)

    # Now simulate 5 more incoming requests. The first 3 of these will
    # benefit from the burst capability of bursty_limiter.
    bursty_requests = Queue(maxsize=5)
    for i in range(1, 6):
        bursty_requests.put(i)

    while not bursty_requests.empty():
        if bursty_limiter.get_token():
            req = bursty_requests.get()
            print(f"request {req} {time.time()}")
        else:
            time.sleep(0.1)


if __name__ == "__main__":
    main()
```
Running our program, we see the first batch of requests handled once every ~200 milliseconds as desired.
```
$ python rate_limiting.py
request 1 1653669421.2034514
request 2 1653669421.4036515
request 3 1653669421.6038516
request 4 1653669421.8040517
request 5 1653669422.0042517
```
For the second batch of requests, we serve the first 3 immediately thanks to the burstable rate limiting, then serve the remaining 2 with ~200ms delays each, since the empty bucket refills at one token per 200ms.
```
request 1 1653669422.2044518
request 2 1653669422.2044518
request 3 1653669422.2044518
request 4 1653669422.4046519
request 5 1653669422.604652
```
This example demonstrates how to implement both basic and bursty rate limiting in Python. The `time.sleep()` function simulates the delays, and a custom `TokenBucket` class provides the bursty rate limiting. In a real-world scenario, you might prefer a more battle-tested library such as `ratelimit` or `aiolimiter` for a more robust implementation.
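As a quick illustration of the library route, here is a minimal sketch using the third-party `ratelimit` package (this assumes its `limits` and `sleep_and_retry` decorators and that the package is installed, e.g. with `pip install ratelimit`; the numbers are illustrative and not tied to the example above):

```python
import time

# Third-party package; this sketch assumes the `ratelimit` decorators
# `limits` (raises when the quota is exceeded within the period) and
# `sleep_and_retry` (sleeps and retries instead of raising).
from ratelimit import limits, sleep_and_retry


@sleep_and_retry
@limits(calls=5, period=1)  # allow at most 5 calls per 1-second window
def handle_request(req):
    print(f"request {req} {time.time()}")


if __name__ == "__main__":
    # The first 5 requests go through immediately; later ones are
    # delayed until the current 1-second window has passed.
    for i in range(1, 11):
        handle_request(i)
```

The decorator approach keeps the limiting policy next to the function it protects, which is usually easier to maintain than hand-rolled token bookkeeping.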