What is Polly?


We’ve all experienced this at one point or another: You’ve written perfect, flawless code, and yet, when used in the wild, it doesn’t work. How could that be? What did you do wrong?

Usually, the answer is “nothing.” Some things are just out of your control. Sometimes the network is unreliable, the database is slow or someone else’s code fails. These errors are frustrating and not easy to resolve. How can you retry a request to a remote service that is unreliable? What if you need to re-authenticate before retrying? You’ve probably seen homemade solutions built from loops, try-catch blocks and if-else statements, but I’m here to show you there’s a better way.

Polly is a resilience framework for .NET. It is available as a .NET Standard library, so it can run on your web services, desktop apps, mobile apps and inside your containers: anywhere .NET can run.

In the past two years, Polly has been downloaded over 100 million times, and it’s easy to see why. With only a few lines of code, Polly can retry failed requests, cache previous responses, protect your resources, prevent you from making requests to broken services, terminate requests that are taking too long and return a default value when all else fails. It’s also thread-safe and works on sync and async calls.

Now that you understand why people are gravitating toward Polly, let’s run through the main features and how you can use them.

Retry

As the name suggests, the Retry policy lets you retry a request that failed because of an exception or because the called code returned an unexpected or bad result. It doesn’t wait before retrying, so be careful. If the problem that caused the request to fail is not likely to resolve itself almost immediately, retrying might not help; it might even make matters worse. The Retry policy lets you define how many retries should occur before it gives up.

Retrying in the event of an exception (up to three retries will be performed)
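A minimal sketch of such a policy, assuming Polly's v7-style `Policy` syntax. The flaky operation is a hypothetical stand-in that fails twice and then succeeds.

```csharp
using System;
using Polly;

// Retry up to three times on any exception (four attempts in total).
var retryPolicy = Policy
    .Handle<Exception>()
    .Retry(3);

int attempts = 0;

// Hypothetical flaky operation: throws on the first two attempts.
string result = retryPolicy.Execute(() =>
{
    attempts++;
    if (attempts < 3)
    {
        throw new InvalidOperationException("Simulated transient failure");
    }
    return "ok";
});

Console.WriteLine($"{result} after {attempts} attempts");
```

If the operation were still failing after the third retry, the last exception would be rethrown to the caller.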

Retrying if the response is false and you expected true
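One way this might look, again assuming the v7-style syntax. `HandleResult` takes a predicate that marks a returned value as a failure; the operation below is a made-up example that returns false once and then true.

```csharp
using System;
using Polly;

// Treat a returned 'false' as a failure and retry up to three times.
var retryPolicy = Policy
    .HandleResult<bool>(succeeded => succeeded == false)
    .Retry(3);

int attempts = 0;

// Hypothetical operation: returns false on the first call, true afterwards.
bool outcome = retryPolicy.Execute(() =>
{
    attempts++;
    return attempts >= 2;
});

Console.WriteLine($"Outcome: {outcome}, attempts: {attempts}");
```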


You might be wondering how this will help you if the error you get back is an authorization error. That’s where the onRetry delegate really comes in handy: it lets you execute any code before the retry is performed.

Reauthenticating before retrying
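A sketch of the idea, assuming the v7-style syntax. `RefreshToken` and `CallService` are hypothetical helpers, and an `UnauthorizedAccessException` stands in for whatever authorization error the real service reports.

```csharp
using System;
using Polly;

string token = "expired-token";

// Hypothetical helper: a real one would call an identity provider.
string RefreshToken() => "fresh-token";

// Hypothetical service call that rejects the expired token.
string CallService()
{
    if (token != "fresh-token")
    {
        throw new UnauthorizedAccessException("Simulated 401");
    }
    return "payload";
}

// Retry once; onRetry runs after the failure and before the next attempt.
var retryPolicy = Policy
    .Handle<UnauthorizedAccessException>()
    .Retry(1, onRetry: (exception, retryCount) =>
    {
        token = RefreshToken();
    });

string response = retryPolicy.Execute(() => CallService());

Console.WriteLine(response);
```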

Wait and Retry

Think of all the times your application failed due to some small transient fault in the network or infrastructure you depended on, only to resolve itself just a few moments later. The Wait and Retry policy lets you pause before retrying, a great feature for scenarios where all you need is a little time for the problem to resolve.
Just like the Retry policy, the Wait and Retry policy can handle exceptions and bad results in called code. It also takes one extra parameter: the delay to add before each retry. The delay can be defined in a variety of ways, such as a fixed period, an array of periods or a function that calculates the delay before each retry.

Waiting and retrying if the HTTP request returns a failure
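A minimal sketch, assuming the v7-style async syntax. To keep it runnable without a network, `FakeGetAsync` is a hypothetical stand-in for a call such as `httpClient.GetAsync(...)`, and the delays are deliberately short.

```csharp
using System;
using System.Net;
using System.Net.Http;
using System.Threading.Tasks;
using Polly;

// Retry up to three times, waiting 100 ms, 200 ms, then 300 ms.
var waitAndRetryPolicy = Policy
    .HandleResult<HttpResponseMessage>(response => !response.IsSuccessStatusCode)
    .WaitAndRetryAsync(3, retryAttempt => TimeSpan.FromMilliseconds(100 * retryAttempt));

int calls = 0;

// Hypothetical stand-in for an HTTP call: fails twice, then succeeds.
Task<HttpResponseMessage> FakeGetAsync()
{
    calls++;
    var status = calls < 3 ? HttpStatusCode.ServiceUnavailable : HttpStatusCode.OK;
    return Task.FromResult(new HttpResponseMessage(status));
}

HttpResponseMessage finalResponse =
    await waitAndRetryPolicy.ExecuteAsync(() => FakeGetAsync());

Console.WriteLine($"{(int)finalResponse.StatusCode} after {calls} calls");
```

The second argument is the delay function mentioned above; a common variation is an exponential backoff such as `TimeSpan.FromSeconds(Math.Pow(2, retryAttempt))`.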

Circuit Breaker

If a method you’re calling or a piece of infrastructure you depend on becomes very unreliable, the best thing to do might be to stop making requests to it for a moment. The Circuit Breaker policy lets you do this. Think of it as the circuit breaker in your home electrical system; if a fault is discovered, the circuit breaks. In the same way, if a resource you depend on has a fault, you break the circuit to it.
Polly offers two implementations of the circuit breaker: the Basic Circuit Breaker, which breaks when a defined number of consecutive faults occur, and the Advanced Circuit Breaker, which breaks when the proportion of faults within a time period reaches a threshold, provided enough requests were made during that period.

Basic Circuit Breaker example
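A sketch of a basic breaker, assuming the v7-style syntax. The first argument is the number of consecutive handled exceptions allowed before breaking and the second is how long the circuit stays open. `CallUnreliableService` is a hypothetical operation that always fails.

```csharp
using System;
using Polly;
using Polly.CircuitBreaker;

// Break after two consecutive exceptions and stay open for 60 seconds.
var breaker = Policy
    .Handle<Exception>()
    .CircuitBreaker(2, TimeSpan.FromSeconds(60));

// Hypothetical operation that always fails.
void CallUnreliableService() => throw new InvalidOperationException("Simulated fault");

// Two consecutive failures trip the breaker; each failure is still rethrown.
for (int i = 0; i < 2; i++)
{
    try { breaker.Execute(() => CallUnreliableService()); }
    catch (InvalidOperationException) { }
}

// While the circuit is open, calls are rejected without running the delegate.
bool rejected = false;
try { breaker.Execute(() => CallUnreliableService()); }
catch (BrokenCircuitException) { rejected = true; }

Console.WriteLine($"State: {breaker.CircuitState}, rejected: {rejected}");
```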


In the example above, the circuit breaks after two consecutive failures and stays open for 60 seconds. When you need a little more nuance, use the Advanced Circuit Breaker.

Advanced Circuit Breaker example
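A sketch of an advanced breaker, assuming the v7-style syntax and using named arguments so each setting is explicit. The operation passed to `Execute` is a trivial placeholder.

```csharp
using System;
using Polly;
using Polly.CircuitBreaker;

var advancedBreaker = Policy
    .Handle<Exception>()
    .AdvancedCircuitBreaker(
        failureThreshold: 0.01,                      // break at a 1% failure rate...
        samplingDuration: TimeSpan.FromSeconds(60),  // ...measured over 60 seconds...
        minimumThroughput: 1000,                     // ...once 1,000 calls were made
        durationOfBreak: TimeSpan.FromSeconds(10));  // stay open for 10 seconds

// Placeholder operation; with so little traffic the circuit stays closed.
string result = advancedBreaker.Execute(() => "ok");

Console.WriteLine($"{result}, state: {advancedBreaker.CircuitState}");
```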


In the above example, the circuit breaks for 10 seconds if at least 1% of requests fail in a 60-second window, with a minimum throughput of 1,000 requests. You might use the Advanced Circuit Breaker when you know that some percentage of requests will be lost but your application can tolerate it, or when you have bursty traffic and a few consecutive errors don’t indicate a serious fault.
For both Circuit Breakers, when the circuit breaks, all requests to the resource are rejected immediately and a BrokenCircuitException is thrown. After the defined period, the Circuit Breaker will allow one request through. This is considered a test. If the test request succeeds, the circuit returns to normal (closed) and all requests are allowed. But if the test request fails, the circuit remains open for another defined period before again transitioning to the test state.

Fallbacks

Sometimes a request is going to fail no matter how many times you retry. The Fallback policy lets you return some default or perform an action like paging an admin, scaling a system or restarting a service.
Fallbacks are generally used in combination with other policies like Retry or Wait and Retry inside a wrap. (See below.) In these instances, the retries occur first, and if the problem is not resolved, then the Fallback executes.

Fallback policy paging an administrator
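A minimal sketch, assuming the v7-style syntax. `PageAdministrator` is a hypothetical hook (a real one might call a paging service), and `CallBrokenService` is a made-up operation that always fails.

```csharp
using System;
using Polly;

bool adminPaged = false;

// Hypothetical notification hook.
void PageAdministrator()
{
    adminPaged = true;
    Console.WriteLine("Paging the administrator...");
}

// Hypothetical operation that always fails.
void CallBrokenService() => throw new InvalidOperationException("Service is down");

// On any exception, run the fallback action instead of rethrowing.
var fallbackPolicy = Policy
    .Handle<Exception>()
    .Fallback(() => PageAdministrator());

fallbackPolicy.Execute(() => CallBrokenService());
```

For calls that return a value, the generic form (for example `Policy<string>.Handle<Exception>().Fallback("default")`) returns a substitute result instead.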

Policy Wraps

When you want to use policies together, use a Policy Wrap. Wraps allow any number of policies to be chained together. In this example, the fallbackPolicy wraps the retryPolicy, which wraps the timeoutPolicy.

Example of Policy Wrap
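A sketch of such a wrap, assuming the v7-style syntax. Policies are passed outermost first, so the fallback only runs once the retries are exhausted. `AlwaysFails` is a hypothetical operation used to show the order of events.

```csharp
using System;
using Polly;

int attempts = 0;
bool fallbackRan = false;

var timeoutPolicy = Policy.Timeout(TimeSpan.FromSeconds(1));
var retryPolicy = Policy.Handle<Exception>().Retry(3);
var fallbackPolicy = Policy.Handle<Exception>().Fallback(() => fallbackRan = true);

// Outermost policy first: fallback wraps retry, which wraps timeout.
var policyWrap = Policy.Wrap(fallbackPolicy, retryPolicy, timeoutPolicy);

// Hypothetical operation that always fails.
void AlwaysFails()
{
    attempts++;
    throw new InvalidOperationException("Simulated fault");
}

policyWrap.Execute(() => AlwaysFails());

Console.WriteLine($"Attempts: {attempts}, fallback ran: {fallbackRan}");
```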

Timeout

Some tools like HttpClient already provide a timeout, but many don’t. The Timeout policy lets you (the caller) decide how long any request should take. If the request takes longer than specified, the policy will terminate the request and clean up resources using a cancellation token.

Timeout after one second if no response is received
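A minimal sketch, assuming the v7-style async syntax with the optimistic strategy, which relies on the delegate honoring the cancellation token it is given. The five-second `Task.Delay` is a hypothetical slow operation.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;
using Polly;
using Polly.Timeout;

// Cancel the delegate's token after one second.
var timeoutPolicy = Policy.TimeoutAsync(TimeSpan.FromSeconds(1), TimeoutStrategy.Optimistic);

bool timedOut = false;
try
{
    // Hypothetical slow call; it observes the token supplied by the policy.
    await timeoutPolicy.ExecuteAsync(
        ct => Task.Delay(TimeSpan.FromSeconds(5), ct),
        CancellationToken.None);
}
catch (TimeoutRejectedException)
{
    timedOut = true;
}

Console.WriteLine($"Timed out: {timedOut}");
```

Delegates that cannot observe a cancellation token would need `TimeoutStrategy.Pessimistic` instead.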

Cache

Polly’s Cache policy lets you store the results of a previous request in memory or on a distributed cache. If a duplicate request is made, Polly will return the stored result from the cache rather than hitting the underlying service a second time. This is especially useful when the requests are to remote systems. The Polly Cache supports multiple time-to-live (TTL) strategies, including relative, absolute, sliding and result. The result strategy is used in scenarios where the result of a request includes something like an auth token, which itself carries a TTL. The TTL from the result then determines how long the response is stored.

Example of caching a result in local memory for 10 seconds
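A sketch of an in-memory cache, assuming the v7-style syntax plus the separate Polly.Caching.Memory package, which supplies `MemoryCacheProvider`. The operation key `"customer-42"` and the returned value are made up; the key on the `Context` is what identifies a duplicate request.

```csharp
using System;
using Microsoft.Extensions.Caching.Memory;
using Polly;
using Polly.Caching.Memory;

var memoryCache = new MemoryCache(new MemoryCacheOptions());
var cacheProvider = new MemoryCacheProvider(memoryCache);

// Keep results for 10 seconds (a relative TTL).
var cachePolicy = Policy.Cache(cacheProvider, TimeSpan.FromSeconds(10));

int serviceCalls = 0;

// Hypothetical expensive lookup.
string FetchCustomer()
{
    serviceCalls++;
    return "customer data";
}

// Both executions use the same operation key, so the second is a cache hit.
string first = cachePolicy.Execute(context => FetchCustomer(), new Context("customer-42"));
string second = cachePolicy.Execute(context => FetchCustomer(), new Context("customer-42"));

Console.WriteLine($"Service calls: {serviceCalls}");
```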

Bulkhead Isolation

The Bulkhead Isolation policy limits the number of resources any part of your application can consume. Let’s say you have a web service that consumes multiple other web services. If one of those upstream services is unavailable, requests will start to back up on your service. Without intervention, your own service will degrade and may crash, leading to even more problems downstream.
The Bulkhead Isolation policy lets you control how your application consumes memory, CPU, threads, sockets, et cetera. Even if one part of your application can’t respond, the policy prevents this from bringing down the whole application.
You specify how many concurrent requests of a given type can execute and how many can be queued for execution. If all the execution and queue slots are in use, further requests of that type are rejected with a BulkheadRejectedException.

Bulkhead Isolation policy with three execution slots and six queue slots
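A sketch of this configuration, assuming the v7-style async syntax. Ten hypothetical slow calls are started at once: three should take the execution slots, six should queue, and the tenth should be rejected. The `gate` is only there to hold the calls open for the demonstration.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;
using Polly;
using Polly.Bulkhead;

// Three execution slots and six queue slots.
var bulkheadPolicy = Policy.BulkheadAsync(3, 6);

// Hypothetical slow call: blocks until the gate is released.
var gate = new TaskCompletionSource<bool>();
async Task SlowCall() { await gate.Task; }

// Start ten calls without awaiting them.
var tasks = new List<Task>();
for (int i = 0; i < 10; i++)
{
    tasks.Add(bulkheadPolicy.ExecuteAsync(() => SlowCall()));
}

// Let the accepted calls finish.
gate.SetResult(true);

int completed = 0;
int rejected = 0;
foreach (var task in tasks)
{
    try { await task; completed++; }
    catch (BulkheadRejectedException) { rejected++; }
}

Console.WriteLine($"Completed: {completed}, rejected: {rejected}");
```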


With features like these, it’s no surprise that Polly is being adopted at such a rapid pace. Now that you have the basics down, give it a whirl and see what a difference it can make in your own development cycles.


