Distributed Job Queues With Redis and Bull


Note: I use web servers as an example here to avoid getting into too much abstract terminology, but be aware that these concepts can apply to any kind of service-oriented software.

What is a distributed job queue?

A distributed job queue is a queue of jobs or tasks that can be shared between multiple (horizontally scalable) server instances/hosts.

Why would you want one?

It can be used to solve certain kinds of scalability problems.

For example, if a server needs to process a large batch of data, doing everything at once could hog that server’s resources while other servers sit underutilized. And if the operation takes a long time, a crash partway through can leave data in an invalid state.

A distributed job queue lets this kind of operation be broken down into smaller jobs that are divided among many servers. A good implementation also supports rate limiting and makes it possible to recover from crashes.
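To make the "break it into smaller jobs" idea concrete, here is a plain-JavaScript sketch of splitting a large batch into job-sized chunks. `chunk` is a hypothetical helper written for illustration, not part of any library:

```javascript
// Split an array into chunks of at most `size` items.
const chunk = (items, size) => {
  const chunks = [];
  for (let i = 0; i < items.length; i += size) {
    chunks.push(items.slice(i, i + size));
  }
  return chunks;
};

// Instead of processing 10,000 records in one long-running operation,
// each chunk becomes one small, independently retryable job.
const records = Array.from({ length: 10000 }, (_, i) => i);
const jobs = chunk(records, 100);
console.log(jobs.length); // 100 jobs of 100 records each
```

Each chunk can then be enqueued as its own job, so a crash only affects the chunk in flight rather than the whole batch.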

Redis

Redis is an in-memory data store commonly used for caching purposes, though it is useful for much more. It is the backbone of the example demonstrated below. It maintains a record of jobs in the queue and notifies servers of jobs ready to be processed using its pub/sub mechanism.

Bull

Bull is a Node.js package that uses Redis to create distributed job queues with a surprisingly small amount of code. It has nice features like rate limiting and the ability to run job processors across multiple Node processes to fully utilize the resources of the underlying machine.
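Both of those features are configured with a couple of options. The sketch below shows the relevant shapes as plain objects; the option names follow Bull's API, but the numbers are arbitrary examples, and the commented-out lines assume a queue like the one in the example further down:

```javascript
// Queue options enabling Bull's built-in rate limiter.
// The values here are arbitrary examples, not recommendations.
const queueOptions = {
  limiter: {
    max: 100,        // process at most 100 jobs...
    duration: 60000, // ...per 60 seconds, enforced across all workers
  },
};

// These options would be passed when constructing the queue:
// const queue = new Queue('myQueue', process.env.REDIS_URL, queueOptions);

// Bull's process() also accepts a concurrency argument, so one
// Node process can work on several jobs in parallel:
// queue.process(4, jobProcessor);
```

Because the limiter state lives in Redis, the rate limit applies to the queue as a whole, not to each server individually.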

Example Code

This example sets up a basic job queue that is shared by every server instance that runs the code and points at the same Redis URL.

import Queue from 'bull';
// Some arbitrary async function, for demonstration purposes
import { someAsyncFunction } from './someModule';

// Create a queue, passing a name and the URL of your Redis server
const queue = new Queue('myQueue', process.env.REDIS_URL);

// Create a job processor
// This is a function that takes some job data and does whatever you need it to
// The value it returns (or the error it throws) is reported back to the queue
const jobProcessor = async (job) => {
  console.log(`About to do some work with this data: ${JSON.stringify(job.data)}`);
  const result = await someAsyncFunction(job.data);
  return result;
};

// Enable the queue to start processing jobs
queue.process(jobProcessor);

// Add a job to the queue
queue.add({ someJobData: 123 });
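Building on the example above, Bull also covers the crash-recovery requirement mentioned earlier: jobs can be retried with backoff, and the queue emits lifecycle events you can observe. A sketch follows; the option and event names come from Bull's API, but the values are arbitrary examples, and the commented-out lines assume the `queue` from the example above:

```javascript
// Per-job options for failure recovery.
const jobOptions = {
  attempts: 3, // retry a failing job up to 3 times
  backoff: { type: 'exponential', delay: 1000 }, // wait 1s, 2s, 4s between attempts
  removeOnComplete: true, // keep finished jobs from piling up in Redis
};

// These options are passed as the second argument to add():
// queue.add({ someJobData: 123 }, jobOptions);

// Queues emit lifecycle events, useful for logging and monitoring:
// queue.on('completed', (job, result) => console.log(`Job ${job.id} done`));
// queue.on('failed', (job, err) => console.error(`Job ${job.id} failed`, err));
```

Because job state lives in Redis rather than in server memory, a job whose worker crashes mid-run can be picked up again by another server.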