Service Systems with On-Demand and Reserved Servers
In many real-world applications, there is a growing trend to supplement long-term reserved capacity with short-term on-demand capacity. We study a queueing system that employs both reserved and (relatively more expensive) on-demand servers – the number of reserved servers is decided at the beginning of the time horizon while the number of on-demand servers is decided dynamically as needed, in real time. The objective is to minimize the infinite-horizon discounted cost incurred in the hiring of servers and in the waiting of the jobs in the queue. We provide a comprehensive analysis by considering two distinct operational modes: one where service, once started, is non-preemptive, and another where preemption is allowed. For the non-preemptive setting, we refer to the jobs in the system excluding those being served by on-demand servers as “relevant” jobs. Using a sample-path analysis, we show that the optimal on-demand control is a hire-down-to policy: If the number of relevant jobs in the system is below a target level, then no on-demand servers are employed. Otherwise, the number of on-demand servers is chosen to reduce the number of relevant jobs in the system to the target level. For the preemptive setting, we establish that the optimal on-demand control is a threshold-based bang-bang policy: If the number of jobs in the system is below a threshold, then no on-demand servers are employed. Otherwise, the number of on-demand servers is chosen such that no jobs wait. For each setting, we present algorithms for obtaining ε-optimal solutions for the number of reserved servers and the respective policy parameters (target level or threshold).

