Distributed Job Queues With Redis and Bull
Note: I use web servers as an example here to avoid getting into too much abstract terminology, but be aware that these concepts can apply to any kind of service-oriented software.
What is a distributed job queue?
A distributed job queue is a queue of jobs or tasks shared among multiple server instances or hosts, allowing the workload to scale horizontally.
Why would you want one?
A distributed job queue helps solve certain kinds of scalability and reliability problems.
For example, if a server needs to process a large batch of data, doing all the work at once can monopolize that server's resources while other servers sit underutilized. And if the operation takes a long time, a crash partway through can leave data in an invalid state.
A distributed job queue lets this kind of operation be broken down into smaller jobs that can be divided among many servers; a good implementation also supports rate limiting and recovery from crashes.
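As a concrete illustration of breaking a large operation into smaller jobs, a batch might be split into fixed-size chunks, each of which becomes one job. This is only a sketch; the helper name and chunk size are illustrative, not part of any particular library:

```javascript
// Split an array of work items into fixed-size chunks; each chunk
// would then be enqueued as an individual job for workers to pick up.
function chunkBatch(items, chunkSize) {
  const chunks = [];
  for (let i = 0; i < items.length; i += chunkSize) {
    chunks.push(items.slice(i, i + chunkSize));
  }
  return chunks;
}

// e.g. 5 items in chunks of 2 -> [[1, 2], [3, 4], [5]]
console.log(chunkBatch([1, 2, 3, 4, 5], 2));
```

Because each chunk is an independent job, a crash mid-batch loses at most the chunks currently in flight rather than the whole operation.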
Redis is an in-memory data store commonly used for caching purposes, though it is useful for much more. It is the backbone of the example demonstrated below. It maintains a record of jobs in the queue and notifies servers of jobs ready to be processed using its pub/sub mechanism.
Bull is a package for Node.js that uses Redis to create distributed job queues with surprisingly little code. It has nice features like rate limiting and the ability to execute jobs across multiple Node processes to fully utilize the resources of the underlying machine.
This example demonstrates how to set up a basic job queue that will be distributed among all servers that run the code and point at the same Redis instance.