Request queuing measures how many HTTP requests are sitting in a Unix or TCP/IP socket waiting to be picked up by an application worker process, and how long, on average, those requests sit idle before being served. If requests pile up faster than they can be processed, queuing results: requests "queue up" inside the application server or socket while waiting for a worker to become available.
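One common way to observe queue time is to have a front-end proxy or load balancer stamp each request with an arrival timestamp, then compare it to the time the worker actually begins processing. The sketch below assumes a WSGI-style `environ` dict and an `X-Request-Start` header of the form `t=<unix-seconds>`, a convention some proxies use; the header format and the `queue_time_ms` name are assumptions, so adjust the parsing for your own front end.

```python
import time

def queue_time_ms(environ):
    """Estimate how long a request sat queued before a worker picked it up.

    Assumes the proxy stamps requests with "X-Request-Start: t=<unix-seconds>"
    (a common but not universal convention). Returns None if the header is
    missing or not in the expected format.
    """
    raw = environ.get("HTTP_X_REQUEST_START", "")
    if not raw.startswith("t="):
        return None
    start = float(raw[2:])                  # when the proxy accepted the request
    return (time.time() - start) * 1000.0   # time spent queued, in milliseconds
```

Averaging this value across requests gives the "how long, on average" figure described above; a rising average is the usual signal that workers are saturated.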
If multiple clients issue requests simultaneously, each request is queued and processed in the order received on a per-connection basis. A client does not have to wait for a response to one request before issuing another. Incoming requests are not prioritized: multiple requests from a single client are handled first-in, first-out, so requests are generally answered in the order received. Invalid requests, however, are responded to immediately, regardless of any valid requests ahead of them in the queue.
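The per-connection behavior above can be sketched with a simple queue: valid requests are enqueued in arrival order and served first-in, first-out, while invalid requests are answered immediately without being queued. The `is_valid` and `handle` callbacks here are hypothetical stand-ins for real request parsing and dispatch.

```python
from collections import deque

def process_connection(requests, is_valid, handle):
    """Serve one client's requests FIFO; reject invalid ones immediately.

    `requests` is the stream of requests as they arrive on the connection;
    `is_valid` and `handle` are illustrative placeholders for real logic.
    """
    pending = deque()
    responses = []
    for req in requests:
        if not is_valid(req):
            # Invalid requests are answered right away, skipping the queue.
            responses.append((req, "400 Bad Request"))
        else:
            pending.append(req)          # queued in arrival order
    while pending:
        req = pending.popleft()          # first in, first out
        responses.append((req, handle(req)))
    return responses
```

Note that the invalid request receives its error response before earlier valid requests are served, matching the "responded to immediately" behavior described above.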
Compute nodes pull requests from the queue and process them, so the queue acts as a buffer between incoming requests and the processing services. Bursts of heavy traffic can then be absorbed without failures, reducing the load on a heavily requested API endpoint.
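This buffering pattern is a minimal sketch away: a thread-safe queue sits between producers (incoming requests) and a fixed pool of workers, so a burst of ten requests is absorbed by the queue rather than overwhelming the two workers. The names (`request_q`, `worker`) and the simulated work are illustrative.

```python
import queue
import threading
import time

request_q = queue.Queue()        # the buffer between arrivals and workers
results = []
results_lock = threading.Lock()

def worker():
    while True:
        req = request_q.get()    # blocks until a request is queued
        if req is None:          # sentinel value: shut this worker down
            request_q.task_done()
            break
        time.sleep(0.01)         # simulate the actual processing work
        with results_lock:
            results.append("done:%d" % req)
        request_q.task_done()

# Two workers drain a burst of ten requests; the queue absorbs the excess.
workers = [threading.Thread(target=worker) for _ in range(2)]
for t in workers:
    t.start()
for i in range(10):
    request_q.put(i)             # burst arrives faster than it can be served
for _ in workers:
    request_q.put(None)          # one shutdown sentinel per worker
for t in workers:
    t.join()
```

The burst never fails outright; it just waits in `request_q` until a worker frees up, which is exactly the queuing behavior being measured.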
A request being queued indicates that:
- The request was postponed by the rendering engine because it’s considered lower priority than critical resources (such as scripts/styles). This often happens with images.
- The request was put on hold to wait for an unavailable TCP or Unix socket that’s about to free up.
- The request was put on hold because the program/API only allows a fixed number of connections per origin.
- The request was briefly delayed while disk cache entries were created (typically very quick).
This indicates that too many resources are being retrieved from a single domain. Different programs enforce different limits on TCP connections per host; Chrome, for example, allows a maximum of six TCP connections per host. If you request twelve items at once, the first six begin immediately and the remaining six are queued. As soon as one of the first six finishes, the first item in the queue begins its request.
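The per-host cap can be sketched with a counting semaphore: at most six requests to one host run concurrently, and the rest block until a slot frees up. The `HostConnectionPool` class and `do_request` callback are illustrative, not a real browser API; `max_conns=6` mirrors the Chrome limit mentioned above.

```python
import threading

class HostConnectionPool:
    """Caps concurrent connections to one host, queuing the excess.

    Illustrative sketch: max_conns=6 mirrors Chrome's per-host TCP limit.
    """
    def __init__(self, max_conns=6):
        self._slots = threading.Semaphore(max_conns)

    def fetch(self, url, do_request):
        self._slots.acquire()        # blocks here if six requests are in flight
        try:
            return do_request(url)   # hypothetical transport callback
        finally:
            self._slots.release()    # frees a slot; the next queued request starts
```

Issuing twelve fetches against this pool reproduces the behavior described: six start immediately, six wait in `acquire()`, and each release lets exactly one queued request proceed.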