Load Balancer using Montecarlo Algorithms
The objective
You have N servers behind a load balancer. When a request arrives it is automatically assigned to one of the available servers. Requests arrive at random intervals according with the exponential distribution with an average of 100 requests per second. When a server receives a request, it processes the request and this takes a random time described by the pareto
distribution. The minimum time is 2 seconds and the average time is 3 seconds. Each server can process only one request at the time. One server costs $2000/month. If a request arrives and all servers are busy the request is dropped. You earn $0.01 for every request you process and you pay $0.10 in penalties for every request you drop. What is the value of N that allows you to guarantee that 90% of the requests are processed? What is the value of N that maximizes profit? What is the value of N below which you would operate at a loss?
Problem definition
To address the problem I would use monte carlo simulations to estimate the number of servers needed, N, to attend the expected number of request, Q, that follows an exponential distribution. To that I need to simulate for many values of Q to determine the overall demand, D, and determine the number of servers that allows to process 90% of the distribution of D. In addition there has to be a pipeline to complete the process and identify when a server becomes available to take a new job. At each iteration I would compute the profit of accepting requests or loss for rejecting jobs , Pi and record N and the profit Pi.
The profit is determined by the income amount per request processed minus the cost of rejecting a request and server’s cost. When the number of servers is greater than the number of request received then we would never be penalized and would only have to pay for the servers. When the number of request is larger than the number of servers we would be penalized and also would have to pay for the servers. Let x be the rate of accepted request, and S the cost per second of each server,then the profit per second would be:
Request analysis
The distribution that describes the request that arrive to the server follow a exponential distribution with an average of 100 request per second. Apriori if we want to be able to process everything we would have to have 100 server available at every second. The time to process each order then is 3 seconds so we would have accumulated 200 request if we remain with only 100 servers. Considering that we would simulate a minute of operations the raw numbers are as follows.
I would accept requests in such a way that they are fully processed when the minute is over. That means that I would stop accepting request at 57 seconds. Initial approximation of the number of request. Assuming that we always get exactly 100 request per second we would have a total of 5,700 requests. At the end of the minute would be fully processed and therefore paid. Let’s see how the distribution of ten thousand numbers generated looks like. With the following code we may store each random number generated in the data list and then plot it to see them.
The resulting distribution for lambda equal to 100 and running the loop for T<1 is shown in Figure 1. The histogram shows the distribution with a mean value of 95. That is the number of request expected per second. When we execute the code presented above running for the 56 seconds, the distribution is close to the apriori expectation at 5,602 request. When simulating for an entire minute the expected range of request is between 5,322 and 5,887 with an average of 5,602 and sigma of 74 requests. When simulating for an entire minute the expected range of request is between 5,322 and 5,887 with an average of 5,602 and sigma of 74 requests.
Processing requests
When a server receives a request, it processes the request in a minimum time of 2 seconds and average of 3 seconds. The time to process follows a pareto distribution. The simulation for the process would go for the entire minute. The result would say how many process are being processed and mac actually be account for profit. The python code that would allow to compute the process takes what has been build in the request analysis. Now we need to know the average time to process each request. The value of alpha is: (ᵯ†1)
The value of alpha is: (alpha-1) μ = alpha*Xm → alpha= μ/(μ – Xm ) → alpha= 3/(32)
Simulation
We have seen what to expect in terms of the number of requests and time needed to process. The following step is to combine both measures to determine the profit of one minute. To that end we start by defining the time T just as before it would start at zero and stop receiving request at time 57 seconds. Lambda is 100 and the time interval between request is modeled as defined above. The following python code uses the distributions and simulated once equivalent for one second.
Simulation code
class MCEngine(object):
def __init__(self, N, lamb):
self.N = N
self.lamb = lamb
self.avServers = N
def simulate_once(self): # simulate one second
N = self.N
fix_expenses = 2000.0/(30.4*24*60*60)* N # assuming months of 30.4 days
number_request = int(max(0,self.requests(57)))
if number_request <= N:
income = number_request * 0.01
else:
income = N * 0.01
profit = income max(0,number_requestN)*0.1 fix_expenses
return profit, min(0,number_requestN)*1
def measure(self, results):
return float(sum(results))/len(results)
def simulate_many(self):
self.results = []
for k in range(1,10000):
y , pro = self.simulate_once()
self.results.append(y)
self.mu = self.measure(self.results)
def requests(self, sec = 57):
lamb = self.lamb
T = 0
k = 0
while T < 1.0:#*sec/60:
t = random.expovariate(lamb)
T = T + t
k = k + 1
return k
def process(self):
xm = 2
alpha = 3 /(3xm)
return xm*random.paretovariate(alpha)
lamb = 100
data = []
for N in range(0,300,20):
engine = MCEngine(N,lamb)
engine.simulate_many()
profit = engine.mu
data.append((N, profit))
For one second we would get an average of 100 request. When we have no servers available the only cost that we face is the rejection penalty, which approximates to $10. As we start adding new servers to take the requests during the first second, break even is reached at N equal to 92 approximately. If we continue adding new servers, fixed cost will increase we get back to operate out of the money. Figure 5 shows the profit for 1 second for values of N from 0 to 30.
Results
What is the value of N that allows you to guarantee that 90% of the requests are
processed?
To solve this problem we may use little’s law to estimate the average number of request being processed in a minute, L. For that we need to compute W, the average time needed to process requests and the average number of items arriving in a minute, ƛ. Running a simulation to estimate the average of L. Distribution of the request is shown in figure 6. The number of servers needed to
guarantee that 90% of the request are processed is 26,122.
What is the value of N that maximizes profit? What is the value of N below which
you would operate at a loss?
The profit function is
Profit = income − (rejected * 0.1 + N * serverCost)
Where:
income : minimum between requests and N
rejected: requests N when requests > N
N : number of server
S: cost per server
The simulation measures the profit at many levels of N against the number of requests. Figure 7 shows the profit in the y axis and the number of servers in the x axis. The N that maximizes the profit is when having around 27 thousand servers. Breakeven is reached when having around 15 thousand active servers.