Make your application more resilient with retries

Resilience

Temporary hiccups occur from time to time and are difficult, if not impossible, to avoid entirely. For example, if you integrate with a REST service or use a database to manage your data, it is likely that at some point your application won’t be able to reach that service or database. Even Azure, which is considered a highly available service with almost no downtime, can only guarantee an uptime of 99.9%[1], which still allows for close to nine hours of downtime per year.

Dealing with temporary outages

So how do you prepare for this? Just because the database or a service is down for a couple of seconds doesn’t mean that your application has to fail. We are going to look at how you can make your application more resilient by using retries. Throughout the text, I will use a retry library that I wrote in Java, but the ideas are of course not limited to Java or to that library. There are plenty of libraries available that handle retries for you, and it is not that difficult to implement them yourself either. It is important to note, though, that even though you can use retries, it does not always make sense to do so. For example, retrying an operation that is not idempotent can repeat its side effects. Carefully evaluate your code and ask yourself whether it makes sense to have retries on a given function.

Before we begin:

  1. You can find the library that I use in my examples on my GitHub. I do not recommend using it in production code, as I wrote it alone in a day or two. Instead, look into something more robust that includes a circuit breaker, which prevents putting extra load on a service while it is trying to recover.
  2. Please note that I will not show you how to implement the different retry strategies. Instead, I will discuss them at a higher, more abstract level and cover what you should consider when deciding to use retries.
  3. If you are interested in more implementation details, I recommend checking out my code on GitHub.

Retries

First of all, we need to decide what type of retry strategy we want. Is it enough to try again after a fixed number of seconds? Or do you prefer something more robust, where the delay grows according to some sequence? A good example is a Fibonacci[2] strategy where you first wait 10ms, then 20ms, then 30ms, then 50ms, and so on. You will have to consider how quickly the delays grow and put a cap on how high they can go before giving up. This cap is extra important if the retries run synchronously and a user has to wait for them. There are plenty of different approaches you could take when it comes to retry strategies; the limit is only your imagination!
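To make the Fibonacci idea concrete, here is a minimal sketch of how the capped delays could be computed. The class name FibonacciBackoff and the parameters baseMillis and capMillis are my own assumptions for illustration; this is not the API of the library mentioned above.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: Fibonacci-style retry delays with an upper bound. */
final class FibonacciBackoff {

    private FibonacciBackoff() {}

    /**
     * Returns the delay (in milliseconds) to wait before each retry.
     * With baseMillis = 10 the sequence starts 10, 20, 30, 50, 80, ...
     * and no single delay exceeds capMillis.
     */
    static List<Long> delays(int maxRetries, long baseMillis, long capMillis) {
        List<Long> result = new ArrayList<>();
        long previous = baseMillis;
        long current = baseMillis;
        for (int i = 0; i < maxRetries; i++) {
            result.add(Math.min(current, capMillis));
            // Stop growing once the cap is reached, which also avoids overflow.
            if (current < capMillis) {
                long next = previous + current;
                previous = current;
                current = next;
            }
        }
        return result;
    }
}
```

Calling FibonacciBackoff.delays(6, 10, 60) yields 10, 20, 30, 50, 60, 60: the sequence grows until it hits the cap and then stays there. The number of retries is bounded separately, so the caller still gives up eventually.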

You should also consider whether it is possible to perform the retries asynchronously, so that you don’t have to worry about users waiting and can let the delays grow larger. This is not always applicable, but there are certainly cases where it is useful. For example, it works well for writing metrics, events, notifications and similar things where there is no need for the user to wait for the task to complete.
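As a rough sketch of the fire-and-forget case, the class below retries a task on a background scheduler so the caller never waits. BackgroundRetry and its constructor parameters are hypothetical names of my own, and the fixed delay is only there to keep the example short.

```java
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

/** Hypothetical sketch: retry a fire-and-forget task without blocking the caller. */
final class BackgroundRetry {

    private final ScheduledExecutorService scheduler;
    private final int maxAttempts;
    private final long delayMillis;

    BackgroundRetry(ScheduledExecutorService scheduler, int maxAttempts, long delayMillis) {
        this.scheduler = scheduler;
        this.maxAttempts = maxAttempts;
        this.delayMillis = delayMillis;
    }

    /** Returns immediately; the task runs, and is retried, on the scheduler's threads. */
    void submit(Runnable task) {
        scheduler.execute(() -> attempt(task, 1));
    }

    private void attempt(Runnable task, int attemptNumber) {
        try {
            task.run();
        } catch (RuntimeException e) {
            if (attemptNumber >= maxAttempts) {
                // Out of attempts: a real application would log or alert here.
                return;
            }
            // Schedule the next attempt instead of sleeping, so no thread is blocked.
            scheduler.schedule(() -> attempt(task, attemptNumber + 1),
                    delayMillis, TimeUnit.MILLISECONDS);
        }
    }
}
```

A scheduler can be created with Executors.newSingleThreadScheduledExecutor(), and a metric write could then be handed off with submit(...) while the request thread carries on. Since nobody is waiting, the delay could just as well follow a growing sequence.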

Lastly, you will probably want to support retries on functions with return values. For void functions, we can simply pass in a Runnable[3]. For actual return values, we can use a Supplier[4] from Java 8, where we provide a supplier method that is retried if a failure occurs. For asynchronous retries we can also use another neat Java 8 feature called CompletableFuture[5] as the return value from the retries. A CompletableFuture is non-blocking, and we can pass it a lambda or a method reference to execute when it completes.
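The three variants could look roughly like the sketch below. SimpleRetry and its method names are made up for this illustration and are not the API of my library; only Runnable, Supplier and CompletableFuture are the real Java 8 types.

```java
import java.util.concurrent.CompletableFuture;
import java.util.function.Supplier;

/** Hypothetical sketch of retries for void, value-returning and async operations. */
final class SimpleRetry {

    private SimpleRetry() {}

    /** Retries a value-returning operation, waiting delayMillis between attempts. */
    static <T> T call(Supplier<T> operation, int maxAttempts, long delayMillis) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        RuntimeException lastFailure = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return operation.get();
            } catch (RuntimeException e) {
                lastFailure = e;
                if (attempt < maxAttempts) {
                    sleep(delayMillis);
                }
            }
        }
        // Every attempt failed: rethrow the most recent failure.
        throw lastFailure;
    }

    /** Void variant: wraps the Runnable in a Supplier that returns null. */
    static void run(Runnable operation, int maxAttempts, long delayMillis) {
        call(() -> {
            operation.run();
            return null;
        }, maxAttempts, delayMillis);
    }

    /** Asynchronous variant: the retry loop runs on the common ForkJoinPool. */
    static <T> CompletableFuture<T> callAsync(Supplier<T> operation, int maxAttempts, long delayMillis) {
        return CompletableFuture.supplyAsync(() -> call(operation, maxAttempts, delayMillis));
    }

    private static void sleep(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while waiting to retry", e);
        }
    }
}
```

A caller can then chain a callback on the asynchronous variant, for example SimpleRetry.callAsync(supplier, 3, 100).thenAccept(System.out::println), and carry on without blocking. Note that this sketch sleeps on a pool thread between attempts; scheduling the next attempt on an executor instead would avoid tying up that thread.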

Summary

  • Use retries when possible. Retries are not difficult to implement, and they can make your application appear stable even when there are temporary outages in the components or services that you use.
  • Carefully consider the time between each retry.
  • Use asynchronous retries when it is possible and makes sense. A user is most likely not willing to wait many seconds without getting frustrated.

Try it yourself

If you are interested in trying out the retry mechanism that I’ve been using in my examples, feel free to grab the code or, if you have a Maven project, add it as a dependency.

If you are using Spring, there is also a much more feature-packed library available that makes dealing with retries easy. Check it out here: spring-retry


Sources
