- Principle: for each request, the algorithm picks two servers at random from the fleet and sends the request to the one with fewer active connections.
- Purpose: spare the load balancer the cost of checking every server, while still making a better choice than a purely random one.
Randomly picking a small number of entries from the list and then selecting the least loaded of them lowers the probability of choosing an overloaded server. This is especially true as the number of servers in the fleet grows and the selected servers are spread across a wider part of it. The system balances itself: the wider the distribution, the fairer the outcome.
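
The selection step can be sketched in a few lines of Python. This is a minimal illustration rather than any particular load balancer's implementation. The names `pick_server` and `simulate`, the `srv-N` server names, and the dictionary of connection counts are all hypothetical. The simulation also assumes that connections never close, which a real fleet would not do.

```python
import random


def pick_server(active_connections, rng=random):
    """Return the less loaded of two randomly sampled servers.

    `active_connections` maps a server name to its current number of
    active connections (hypothetical structure, for illustration only).
    """
    # Sample two distinct servers uniformly at random; needs at least two.
    first, second = rng.sample(list(active_connections), 2)
    # Keep the one with fewer active connections; ties go to the first sample.
    if active_connections[second] < active_connections[first]:
        return second
    return first


def simulate(num_servers=100, num_requests=10_000, seed=0):
    """Compare two-choice selection with purely random selection.

    Assumption: connections are never released, so each count only grows.
    Returns the busiest server's load under each strategy.
    """
    rng = random.Random(seed)
    two_choice = {f"srv-{i}": 0 for i in range(num_servers)}
    purely_random = dict(two_choice)
    names = list(purely_random)
    for _ in range(num_requests):
        two_choice[pick_server(two_choice, rng)] += 1
        purely_random[rng.choice(names)] += 1
    return max(two_choice.values()), max(purely_random.values())


if __name__ == "__main__":
    two_max, random_max = simulate()
    print("busiest server, two random choices:", two_max)
    print("busiest server, purely random:", random_max)
```

Each decision reads only two counters, however large the fleet is, which is the saving described in the purpose above. Running the script prints the load on the busiest server under each strategy, so the two can be compared directly.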