Worker Pools in Python

Our example demonstrates how to implement a worker pool using threads and queues in Python.

import threading
import queue
import time

# Here's the worker function, of which we'll run several
# concurrent instances. These workers will receive
# work on the `jobs` queue and send the corresponding
# results on `results`. We'll sleep a second per job to
# simulate an expensive task.
def worker(worker_id, jobs, results):
    # `worker_id` avoids shadowing the built-in `id` function.
    while True:
        j = jobs.get()
        if j is None:
            break
        print(f"worker {worker_id} started  job {j}")
        time.sleep(1)
        print(f"worker {worker_id} finished job {j}")
        results.put(j * 2)
    print(f"worker {worker_id} exiting")

def main():
    # In order to use our pool of workers we need to send
    # them work and collect their results. We make 2
    # queues for this.
    num_jobs = 5
    jobs = queue.Queue()
    results = queue.Queue()

    # This starts up 3 workers, initially blocked
    # because there are no jobs yet.
    workers = []
    for w in range(1, 4):
        t = threading.Thread(target=worker, args=(w, jobs, results))
        t.start()
        workers.append(t)

    # Here we send 5 `jobs` and then send a `None` value
    # for each worker to indicate that's all the work we have.
    for j in range(1, num_jobs + 1):
        jobs.put(j)
    for _ in workers:
        jobs.put(None)

    # Finally we collect all the results of the work.
    # `results.get()` blocks until a result is available,
    # so this loop returns once every job has been processed.
    for _ in range(num_jobs):
        results.get()

    # Wait for each worker to read its `None` value and exit.
    for w in workers:
        w.join()

if __name__ == "__main__":
    main()

Our running program shows the 5 jobs being executed by various workers. The program takes only about 2 seconds despite doing about 5 seconds of total work, because 3 workers operate concurrently. The exact interleaving of output lines varies from run to run.

$ time python worker_pools.py
worker 1 started  job 1
worker 2 started  job 2
worker 3 started  job 3
worker 1 finished job 1
worker 1 started  job 4
worker 2 finished job 2
worker 2 started  job 5
worker 3 finished job 3
worker 3 exiting
worker 1 finished job 4
worker 1 exiting
worker 2 finished job 5
worker 2 exiting

real    0m2.058s
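
The roughly 2-second wall time can be checked with a little arithmetic: 5 equal-length jobs shared among 3 workers need 2 "rounds" of 1 second each. The helper below is a minimal sketch of that estimate; `estimated_wall_time` is a hypothetical name, and it assumes every job takes the same time and ignores scheduling overhead.

```python
import math

def estimated_wall_time(num_jobs, num_workers, seconds_per_job):
    """Estimate wall time when every job takes the same time."""
    # At most `num_workers` jobs run at once, so the jobs are
    # processed in ceil(num_jobs / num_workers) rounds.
    rounds = math.ceil(num_jobs / num_workers)
    return rounds * seconds_per_job

print(estimated_wall_time(5, 3, 1))  # 2
print(estimated_wall_time(5, 1, 1))  # 5
```

With a single worker the same jobs would take about 5 seconds, which is the "total work" figure mentioned above.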

This example uses Python’s `threading` and `queue` modules to build a worker pool. Each thread runs `worker`, which repeatedly pulls a job from the `jobs` queue, processes it, and puts the result on the `results` queue until it receives a `None` value. `main` starts the worker threads, enqueues the jobs, and collects the results.
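
For comparison, the standard library's `concurrent.futures.ThreadPoolExecutor` manages the threads and the internal job queue for you. The sketch below is one possible equivalent, not a replacement for the hand-written pool; `process` and `run_pool` are hypothetical names, and the sleep is shortened to 0.1 seconds so it runs quickly.

```python
from concurrent.futures import ThreadPoolExecutor
import time

def process(job):
    """Hypothetical stand-in for an expensive task."""
    time.sleep(0.1)  # shortened from the 1 second used above
    return job * 2

def run_pool(num_jobs=5, max_workers=3):
    jobs = range(1, num_jobs + 1)
    # The executor owns the worker threads; leaving the `with`
    # block waits for them to finish, so no `None` values or
    # explicit join() calls are needed.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() yields results in the order the jobs were submitted.
        return list(pool.map(process, jobs))

if __name__ == "__main__":
    print(run_pool())  # [2, 4, 6, 8, 10]
```

One difference worth noting: with the explicit `results` queue, results arrive in completion order, whereas `map()` returns them in submission order.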

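Note that in standard CPython builds the Global Interpreter Lock (GIL) lets only one thread execute Python bytecode at a time, so threads give little speedup for CPU-bound tasks. This example still benefits from threading because `time.sleep`, like blocking I/O, releases the GIL while it waits. For CPU-intensive work, consider the `multiprocessing` module instead, which uses separate processes to achieve true parallelism.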
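
As a rough illustration of the process-based alternative, the sketch below uses `multiprocessing.Pool` with 3 worker processes. `cpu_heavy` and `run_in_processes` are hypothetical names, and the sum of squares is only a placeholder for real CPU-bound work.

```python
import multiprocessing

def cpu_heavy(n):
    """Hypothetical CPU-bound job: sum of squares below n."""
    return sum(i * i for i in range(n))

def run_in_processes(inputs, num_workers=3):
    # Each worker is a separate process with its own interpreter
    # and its own GIL, so the jobs can run in parallel.
    with multiprocessing.Pool(processes=num_workers) as pool:
        # map() blocks until all results are ready and keeps input order.
        return pool.map(cpu_heavy, inputs)

if __name__ == "__main__":
    # The guard matters here: on platforms that start fresh
    # interpreters for child processes, this module is re-imported.
    print(run_in_processes([10, 100, 1000]))  # [285, 328350, 332833500]
```

Arguments and return values are pickled to cross the process boundary, so this approach tends to pay off only when each job does enough computation to outweigh that cost.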