Beginner's guide to retry pattern implementation with exponential backoff

The Retry pattern exists because of transient failures, which are especially common in cloud-based environments. A transient failure occurs when a component calls an external component or service that is momentarily unavailable.

The issue is temporary: when we call the service or component again a short time later, the call is likely to succeed. Such temporary faults are called transient failures, and a cloud environment makes them more likely.

A cloud system is distributed across many components and applications, so there are more places where hosting can go wrong. A momentary loss of network connectivity or a briefly unavailable service adds to the problem. Implementing the Retry pattern is a way to work around these issues.

Why Do We Use Exponential Backoff?

The idea behind using exponential backoff with retry is that instead of retrying after waiting for a fixed amount of time, we increase the waiting time between retries after each retry failure.
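To make the growing wait concrete, here is a minimal sketch that computes the delay before each retry as two to the power of the retry number, in seconds. The class name `BackoffSchedule` and the method `DelayFor` are hypothetical names chosen for illustration; they are not part of any library.

```csharp
using System;

// Hypothetical helper for illustration only.
public static class BackoffSchedule
{
    // Returns the wait before the given retry (1-based): 2^retryAttempt seconds.
    public static TimeSpan DelayFor(int retryAttempt)
    {
        if (retryAttempt < 1)
            throw new ArgumentOutOfRangeException(nameof(retryAttempt));

        return TimeSpan.FromSeconds(Math.Pow(2, retryAttempt));
    }
}
```

With this formula, the first three retries wait 2, 4, and 8 seconds, whereas a fixed-interval retry would wait the same amount of time before every attempt.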

For instance, when a file is transferred or copied over SFTP, it takes time before the file is completely written. Instead of checking at a fixed interval, we can wait with exponential backoff until the transfer has finished.

To take advantage of exponential backoff, we can use the open-source library Polly. The Polly library supports .NET Framework 4.0 and higher and is also compatible with .NET Core.

To use Polly, we need to install the Polly package through the NuGet Package Manager in Visual Studio.

Polly

Here we will share only the concept: how to use the Polly library to check whether a file is available at a given path.

For that, we will create a function that checks whether the file can be opened at that path.

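using System;
using System.IO;
using Polly;

public static bool IsAvailableFile(string filePath)
{
    // Retry up to six times on IOException, waiting 2, 4, 8, 16, 32, then 64 seconds.
    Policy retryPolicy = Policy
        .Handle<IOException>()
        .WaitAndRetry(6, i => TimeSpan.FromSeconds(Math.Pow(2, i)));

    retryPolicy.Execute(() =>
    {
        // Opening the file throws an IOException while it is missing or still locked by the writer.
        using (FileStream stream = File.OpenRead(filePath))
        {
            // Nothing to read; opening the file successfully is the check.
        }
    });

    // Reached only if the file could be opened; otherwise the last IOException propagates.
    return true;
}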

In the above code, we pass the file path to the function, which tries to open the file to check whether it is fully written. If opening the file throws an IOException, the policy waits and then tries again, doubling the wait after each failed attempt (2, 4, 8, 16, 32, and 64 seconds).

Once the file can be opened, the function returns true, and we can use the file for further processing. If every retry fails, the last IOException propagates to the caller. The above code can be used anywhere as per our requirements.
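To see what such a wait-and-retry policy does conceptually, the following sketch writes the retry loop by hand without any library. It is a simplified illustration under stated assumptions, not Polly's implementation: the class `ManualRetry`, its `Execute` method, and the `delayFor` parameter are hypothetical names introduced here.

```csharp
using System;
using System.Threading;

// Hypothetical helper for illustration; Polly's real policies are more general.
public static class ManualRetry
{
    // Runs the action once and, on TException, retries up to maxRetries times,
    // sleeping for delayFor(attempt) before each retry (attempt is 1-based).
    public static void Execute<TException>(Action action, int maxRetries, Func<int, TimeSpan> delayFor)
        where TException : Exception
    {
        for (int attempt = 1; ; attempt++)
        {
            try
            {
                action();
                return; // success: stop retrying
            }
            catch (TException) when (attempt <= maxRetries)
            {
                // Handled failure with retries remaining: wait, then loop again.
                Thread.Sleep(delayFor(attempt));
            }
            // When no retries remain, the filter is false and the exception propagates.
        }
    }
}
```

A call such as `ManualRetry.Execute<IOException>(() => File.OpenRead(path).Dispose(), 6, i => TimeSpan.FromSeconds(Math.Pow(2, i)))` would then mirror the schedule used in the function above, where `path` stands for the file to check.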

Besides checking whether a file has been written, we can use the Polly library for much more. It lets us express the following policies in a fluent and thread-safe manner:

  • Retry
  • Circuit Breaker
  • Timeout
  • Bulkhead Isolation
  • Fallback

The code below shows how we can use HTTP retries with Polly integrated into IHttpClientFactory.

IHttpClientFactory has been available since .NET Core 2.1. It is recommended to use the latest .NET 5 packages from NuGet.

Configure a client with a Polly retry policy in Startup.cs

// ConfigureServices() - Startup.cs
services.AddHttpClient<IBasketService, BasketService>()
    .SetHandlerLifetime(TimeSpan.FromMinutes(5)) // Set lifetime to five minutes
    .AddPolicyHandler(GetRetryPolicy());

For a more modular approach, the HTTP retry policy can be defined in its own method, which is given below.

static IAsyncPolicy<HttpResponseMessage> GetRetryPolicy()
{
    return HttpPolicyExtensions
        .HandleTransientHttpError()
        .OrResult(msg => msg.StatusCode == System.Net.HttpStatusCode.NotFound)
        .WaitAndRetryAsync(6, retryAttempt => TimeSpan.FromSeconds(Math.Pow(2, retryAttempt)));
}

With Polly, we can define a retry policy with any number of retries and specify the actions to take when an HTTP exception occurs, such as logging the error.

In the case above, the policy is configured to retry up to six times with exponentially increasing delays, starting at two seconds (2, 4, 8, 16, 32, and 64 seconds).
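As a sketch of the logging action mentioned above, the retry policy can be given an `onRetry` callback that runs before each wait. This assumes the Polly and Polly.Extensions.Http NuGet packages are referenced; the class name `RetryPolicies`, the method name `GetLoggingRetryPolicy`, and the use of `Console.WriteLine` in place of a real logger are illustrative assumptions.

```csharp
using System;
using System.Net.Http;
using Polly;
using Polly.Extensions.Http;

// Hypothetical container class for illustration.
public static class RetryPolicies
{
    public static IAsyncPolicy<HttpResponseMessage> GetLoggingRetryPolicy()
    {
        return HttpPolicyExtensions
            .HandleTransientHttpError()
            .WaitAndRetryAsync(
                retryCount: 6,
                sleepDurationProvider: retryAttempt => TimeSpan.FromSeconds(Math.Pow(2, retryAttempt)),
                onRetry: (outcome, delay, retryAttempt, context) =>
                {
                    // outcome holds either the exception or the failed response.
                    string reason = outcome.Exception?.Message
                        ?? outcome.Result?.StatusCode.ToString();
                    Console.WriteLine($"Retry {retryAttempt} in {delay.TotalSeconds}s: {reason}");
                });
    }
}
```

In a real application, the callback would typically write to the application's logger rather than the console.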

A retry policy can itself hurt the system under high concurrency and at scale. During a partial outage, many clients retry on the same schedule, so their retries arrive in synchronized peaks. To smooth out these peaks, we can add a jitter strategy, which randomizes the delays, to the retry policy.

This improves the performance of the end-to-end system. The sample below shows a jitter strategy; the Backoff helper it uses comes from the Polly.Contrib.WaitAndRetry package.

var delay = Backoff.DecorrelatedJitterBackoffV2(medianFirstRetryDelay: TimeSpan.FromSeconds(1), retryCount: 5);
var retryPolicy = Policy
    .Handle<FooException>()
    .WaitAndRetryAsync(delay);
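The idea behind jitter can be sketched without any library. The example below uses plain "full jitter", a simpler technique than the decorrelated formula in the sample above: each wait is a random value between zero and the exponential ceiling. The class `JitteredBackoff` and its method are hypothetical names used only for illustration.

```csharp
using System;

// Hypothetical helper for illustration; this is "full jitter",
// not the algorithm behind DecorrelatedJitterBackoffV2.
public static class JitteredBackoff
{
    // Picks a random wait between zero and 2^retryAttempt seconds.
    public static TimeSpan DelayFor(int retryAttempt, Random random)
    {
        double ceilingSeconds = Math.Pow(2, retryAttempt);
        return TimeSpan.FromSeconds(random.NextDouble() * ceilingSeconds);
    }
}
```

Because every client draws its own random delay, clients that failed at the same moment are unlikely to retry at the same moment.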

Circuit Breaker

A circuit breaker is added to manage long-running transient failures. We can add a circuit breaker within the code to wrap the service and mark the circuit open to indicate that the service or component is not available or responding even after several tries.

In other words, if something keeps going wrong, we hit a panic button that prevents further attempts to repeat the operation.

This is typically used when we have an extremely unreliable dependency. In this case, we want to stop calling it altogether, as additional attempts to call it might worsen the situation. An example of this might be an overloaded database.
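The behavior described above can be sketched as a small state holder that counts consecutive failures and rejects calls while the circuit is open. This is a simplified, single-threaded illustration, not how Polly implements it; it omits the half-open state, and the class `SimpleCircuitBreaker`, its members, and the injected `clock` are hypothetical.

```csharp
using System;

// Hypothetical, single-threaded sketch for illustration only.
public class SimpleCircuitBreaker
{
    private readonly int _failuresBeforeBreaking;
    private readonly TimeSpan _durationOfBreak;
    private readonly Func<DateTime> _clock;
    private int _consecutiveFailures;
    private DateTime _openUntil = DateTime.MinValue;

    public SimpleCircuitBreaker(int failuresBeforeBreaking, TimeSpan durationOfBreak, Func<DateTime> clock)
    {
        _failuresBeforeBreaking = failuresBeforeBreaking;
        _durationOfBreak = durationOfBreak;
        _clock = clock;
    }

    // The circuit is open until the break duration has elapsed.
    public bool IsOpen => _clock() < _openUntil;

    public T Execute<T>(Func<T> action)
    {
        // While the circuit is open, fail fast without calling the dependency.
        if (IsOpen)
            throw new InvalidOperationException("Circuit is open; call rejected.");

        try
        {
            T result = action();
            _consecutiveFailures = 0; // a success ends the failure streak
            return result;
        }
        catch (Exception)
        {
            if (++_consecutiveFailures >= _failuresBeforeBreaking)
            {
                _openUntil = _clock() + _durationOfBreak; // break (open) the circuit
                _consecutiveFailures = 0;
            }
            throw; // the original exception still reaches the caller
        }
    }
}
```

Passing the clock in as a delegate, for example `() => DateTime.UtcNow`, keeps the sketch easy to exercise without waiting for real time to pass.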

Signature:

Policy
    .Handle<Exception>()
    .CircuitBreakerAsync(
        int exceptionsAllowedBeforeBreaking,
        TimeSpan durationOfBreak,
        Action<Exception, TimeSpan> onBreak,
        Action onReset);

Handle<Exception>: Same as with Retry policies. This specifies the type of exceptions the policy can handle.

CircuitBreakerAsync(…):

int exceptionsAllowedBeforeBreaking specifies how many exceptions in a row will break the circuit.

TimeSpan durationOfBreak specifies how long the circuit will remain broken.

Action<Exception, TimeSpan> onBreak is a delegate that allows you to perform some action (typically used for logging) when the circuit is broken.

Action onReset is a delegate that allows you to perform some action (again, typically for logging) when the circuit is reset.

In our example, we create a circuit-breaker policy that kicks in after one failure.

This value is chosen for demonstration; in a real-world scenario, you would tune it according to your requirements and the service you are attempting to call.
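A policy matching this description might be created as follows. This is a sketch that assumes Polly v7, where the async circuit-breaker type is `AsyncCircuitBreakerPolicy`; the one-minute break duration, the class name `CircuitBreakerPolicies`, and the console logging are assumptions made for illustration.

```csharp
using System;
using Polly;
using Polly.CircuitBreaker;

// Hypothetical container class for illustration.
public static class CircuitBreakerPolicies
{
    public static AsyncCircuitBreakerPolicy Create()
    {
        return Policy
            .Handle<Exception>()
            .CircuitBreakerAsync(
                exceptionsAllowedBeforeBreaking: 1,        // break after a single failure
                durationOfBreak: TimeSpan.FromMinutes(1),  // assumed value for the demo
                onBreak: (exception, breakDuration) =>
                    Console.WriteLine($"Circuit broken for {breakDuration.TotalSeconds}s: {exception.Message}"),
                onReset: () => Console.WriteLine("Circuit reset."));
    }
}
```

The returned policy would be stored in a field such as `_circuitBreakerPolicy` and shared across calls, since the circuit state lives in the policy instance.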

public async Task<string> GetGoodbyeMessage()
{
    try
    {
        Console.WriteLine($"Circuit State: {_circuitBreakerPolicy.CircuitState}");
        return await _circuitBreakerPolicy.ExecuteAsync<string>(async () =>
        {
            return await _messageRepository.GetGoodbyeMessage();
        });
    }
    catch (Exception ex)
    {
        return ex.Message;
    }
}

The call is wrapped in a try/catch block because, when a circuit-breaker policy is in a broken state, any further attempt to execute the action automatically throws a BrokenCircuitException.

Rather than properly handling the error, we just pass the error message back as the return value. Again, this is just for demonstration purposes, so we get to see what the error looks like.

When an exception first occurs, it triggers the circuit-breaker policy to break (open) the circuit, and the return value is the message of that original exception.

Any further attempt to call the service while the circuit is open fails with the BrokenCircuitException thrown by the circuit-breaker policy.

Conclusion

The Retry pattern, exponential backoff, and the circuit breaker are valuable tools in modern development. Their use will grow as cloud computing, microservices, and serverless architectures become more common.

Even though these techniques are important, they must be used judiciously. Whether they fit depends on the development environment, the application's requirements, and the behavior of the services being called. So, make sure to look at these aspects before using any of them.

While we are talking about the future-oriented development exercises, we at DEV IT understand the impact and importance of development on a business. When you want to build a future-ready solution, look no further. Let us know about your requirements, and we will help you create the right solution.