HttpClient Pitfalls in .NET

Continuing the series of articles on "pitfalls", I cannot ignore System.Net.Http.HttpClient, which is used very often in practice but has several serious problems that may not be immediately visible.

A common problem in programming is that developers focus only on the functionality of a component and ignore its non-functional side, which can affect performance, scalability, recovery after failures, security, and so on. HttpClient, for example, looks like an elementary component, yet it raises several questions: how many parallel connections does it open to a server, how long do they live, and how does it behave if a DNS name it accessed earlier is switched to a different IP address? Let's try to answer these questions in this article.

Connection leakage
Limit of simultaneous server connections
Long-Lived Connections and DNS Caching

The first problem with HttpClient is a non-obvious leak of connections. Quite often I have come across code where a new client is created for every request:

public async Task<string> GetSomeText(Guid textId)
{
    using (var client = new HttpClient())
    {
        return await client.GetStringAsync($"http://someservice.com/api/v1/some-text/{textId}");
    }
}

Unfortunately, this approach wastes a lot of resources and is likely to exhaust the supply of available connections. To see the problem, just run the following code:

static void Main(string[] args)
{
    for(int i = 0; i < 10; i++)
    {
        using (var client = new HttpClient())
        {
            client.GetStringAsync("https://habr.com").Wait();
        }
    }
}

And upon completion, view the list of open connections via netstat:

PS C:\Development\Exercises> netstat -n | select-string -pattern "178.248.237.68"
  TCP 192.168.1.13:43684 178.248.237.68:443 TIME_WAIT
  TCP 192.168.1.13:43685 178.248.237.68:443 TIME_WAIT
  TCP 192.168.1.13:43686 178.248.237.68:443 TIME_WAIT
  TCP 192.168.1.13:43687 178.248.237.68:443 TIME_WAIT
  TCP 192.168.1.13:43689 178.248.237.68:443 TIME_WAIT
  TCP 192.168.1.13:43690 178.248.237.68:443 TIME_WAIT
  TCP 192.168.1.13:43691 178.248.237.68:443 TIME_WAIT
  TCP 192.168.1.13:43692 178.248.237.68:443 TIME_WAIT
  TCP 192.168.1.13:43693 178.248.237.68:443 TIME_WAIT
  TCP 192.168.1.13:43695 178.248.237.68:443 TIME_WAIT

Here, the -n switch is used to speed up the output, since otherwise netstat looks up a domain name for every IP address; 178.248.237.68 was the IP address of habr.com at the time of this writing.

So, we see that despite the using construct, and even though the program has already exited, the connections to the server are left “hanging”. They will stay that way for as long as specified in the registry value HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters\TcpTimedWaitDelay.

A question may arise at this point: how does .NET Core behave in such cases? On both Windows and Linux the result is exactly the same, because the connections are retained at the operating system level, not at the application level. TIME_WAIT is a special state that a socket enters after the application closes it; it is needed to handle packets that may still be in transit on the network. On Linux, the setting usually cited for this is /proc/sys/net/ipv4/tcp_fin_timeout (in seconds), which can of course be changed if necessary; strictly speaking, it governs the FIN_WAIT_2 state, while the TIME_WAIT duration itself is fixed at 60 seconds in the kernel.
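A common way to avoid this leak is to create one HttpClient and reuse it for every request instead of wrapping each call in using. A minimal sketch, assuming the same someservice.com endpoint as in the first snippet; the class name TextService is made up for illustration:

```csharp
using System;
using System.Net.Http;
using System.Threading.Tasks;

public class TextService
{
    // One instance for the lifetime of the application: its connections are
    // kept in a pool and reused instead of being closed after every request.
    private static readonly HttpClient Client = new HttpClient();

    public Task<string> GetSomeText(Guid textId)
    {
        // No using block here: disposing the client would close its connections.
        return Client.GetStringAsync($"http://someservice.com/api/v1/some-text/{textId}");
    }
}
```

Sharing a single instance removes the leak, but it makes the connection limit and the connection lifetime discussed in the rest of the article matter more.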

The second problem with HttpClient is a non-obvious limit on simultaneous connections to a server. Suppose you use the familiar .NET Framework 4.7 to develop a high-load service that calls other services over HTTP. The potential connection leak has been taken into account, so the same HttpClient instance is used for all requests. What could go wrong?

The problem can be easily seen by running the following code:

static void Main(string[] args)
{
    var client = new HttpClient();
    var tasks = new List<Task>();
    for (var i = 0; i < 10; i++)
    {
        tasks.Add(SendRequest(client, "http://slowwly.robertomurray.co.uk/delay/5000/url/https://habr.com"));
    }
    Task.WaitAll(tasks.ToArray());
}
private static async Task SendRequest(HttpClient client, string url)
{
    var response = await client.GetAsync(url);
    Console.WriteLine($"Received response {response.StatusCode} from {url}");
}

The resource specified in the link allows you to delay the server’s response for a specified time, in this case 5 seconds.

After running the code above, it is easy to see that only 2 responses arrive every 5 seconds, although 10 simultaneous requests were created. The reason is that HTTP in the classic .NET Framework goes, among other things, through a special class, System.Net.ServicePointManager, which controls various aspects of HTTP connections. This class has a DefaultConnectionLimit property that specifies how many simultaneous connections can be created to each domain. Historically, the default value of this property is 2.

If we add the following line at the very beginning of the sample above

ServicePointManager.DefaultConnectionLimit = 5;

then the example execution will speed up noticeably, since requests will be executed in batches of 5.

Before moving on to how this works in .NET Core, a little more should be said about ServicePointManager. The property above sets the default number of connections used for subsequent connections to any domain. In addition, the parameters can be controlled for each domain name individually through the ServicePoint class:

var delayServicePoint = ServicePointManager.FindServicePoint(new Uri("http://slowwly.robertomurray.co.uk"));
delayServicePoint.ConnectionLimit = 3;
var habrServicePoint = ServicePointManager.FindServicePoint(new Uri("https://habr.com"));
habrServicePoint.ConnectionLimit = 5;

After this code runs, any interaction with Habr through the same HttpClient instance will use up to 5 simultaneous connections, and interaction with the “slowwly” site up to 3.

There is another interesting nuance here: the connection limit for local addresses (localhost) is int.MaxValue by default. Just look at the output of this code, run without setting DefaultConnectionLimit first:

var habrServicePoint = ServicePointManager.FindServicePoint(new Uri("https://habr.com"));
Console.WriteLine(habrServicePoint.ConnectionLimit);
var localServicePoint = ServicePointManager.FindServicePoint(new Uri("http://localhost"));
Console.WriteLine(localServicePoint.ConnectionLimit);

Now let's move on to .NET Core. Although ServicePointManager still exists in the System.Net namespace, it does not affect the behavior of HttpClient in .NET Core. Instead, HTTP connection parameters can be controlled using HttpClientHandler (or SocketsHttpHandler, which we will talk about later):

static void Main(string[] args)
{
    var handler = new HttpClientHandler();
    handler.MaxConnectionsPerServer = 2;
    var client = new HttpClient(handler);
    var tasks = new List<Task>();
    for (int i = 0; i < 10; i++)
    {
        tasks.Add(SendRequest(client, "http://slowwly.robertomurray.co.uk/delay/5000/url/https://habr.com"));
    }
    Task.WaitAll(tasks.ToArray());
    Console.ReadLine();
}
private static async Task SendRequest(HttpClient client, string url)
{
    var response = await client.GetAsync(url);
    Console.WriteLine($"Received response {response.StatusCode} from {url}");
}

The example above behaves exactly like the initial example for the classic .NET Framework: it establishes only 2 connections at a time. But if you remove the line that sets MaxConnectionsPerServer, the number of simultaneous connections will be much higher, since the default value of this property in .NET Core is int.MaxValue.
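The effect of MaxConnectionsPerServer can also be observed without a remote delay service. The sketch below is only an illustration under a few assumptions: it starts a local HttpListener on a hypothetical free port (5055), makes every response take 300 ms, and counts how many requests the server sees in flight at once. The names ConnectionLimitDemo and MeasurePeakAsync are made up for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Net;
using System.Net.Http;
using System.Threading.Tasks;

public static class ConnectionLimitDemo
{
    // Sends `requests` parallel GETs to a local listener and returns the largest
    // number of requests the server saw in flight at the same time.
    public static async Task<int> MeasurePeakAsync(int maxConnections, int requests, string prefix)
    {
        var listener = new HttpListener();
        listener.Prefixes.Add(prefix);
        listener.Start();

        var gate = new object();
        int inFlight = 0, peak = 0;

        async Task HandleAsync(HttpListenerContext context)
        {
            lock (gate) { inFlight++; peak = Math.Max(peak, inFlight); }
            await Task.Delay(300);                 // simulate a slow endpoint
            lock (gate) { inFlight--; }
            context.Response.StatusCode = 200;
            context.Response.Close();              // send an empty response
        }

        // Accept exactly `requests` requests and handle them concurrently.
        var server = Task.Run(async () =>
        {
            var handlers = new List<Task>();
            for (var i = 0; i < requests; i++)
            {
                var context = await listener.GetContextAsync();
                handlers.Add(HandleAsync(context));
            }
            await Task.WhenAll(handlers);
        });

        try
        {
            var handler = new HttpClientHandler
            {
                MaxConnectionsPerServer = maxConnections,
                UseProxy = false                   // talk to localhost directly
            };
            using (var client = new HttpClient(handler))
            {
                var calls = Enumerable.Range(0, requests).Select(_ => client.GetAsync(prefix)).ToList();
                await Task.WhenAll(calls);
            }
            await server;
            return peak;
        }
        finally
        {
            listener.Stop();
        }
    }

    public static async Task Main()
    {
        var peak = await MeasurePeakAsync(2, 6, "http://localhost:5055/");
        Console.WriteLine($"Peak concurrent requests seen by the server: {peak}");
    }
}
```

With the limit set to 2, the reported peak should not exceed 2; raising the first argument should let the peak grow accordingly.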

Now let's consider the third non-obvious problem with the default settings, which can be no less critical than the previous two: long-lived connections and DNS caching. When a connection to a remote server is established, the domain name is first resolved to the corresponding IP address, and the resulting address is then cached for some time to speed up subsequent connections. In addition, to save resources, the connection is usually not closed after each request but kept open for a long time.

Imagine that the system we are developing must keep working, without a forced restart, when the server it talks to moves to another IP address, for example when traffic is switched to another data center because the current one has failed. Even if the persistent connection is dropped because of the failure in the first data center (which may not happen quickly), the DNS cache will not let our system react to the change promptly. The same applies to addresses that are load-balanced through DNS round-robin.

In the classic .NET Framework, this behavior can be controlled through ServicePointManager and ServicePoint (all the parameters below take values in milliseconds):

ServicePointManager.DnsRefreshTimeout - specifies how long the resolved IP address is cached for each domain name; the default value is 2 minutes (120000).
ServicePoint.ConnectionLeaseTimeout - specifies how long a connection can be kept open. By default there is no time limit and any connection can be kept indefinitely, since this parameter equals -1. Setting it to 0 causes each connection to be closed immediately after the request completes.
ServicePoint.MaxIdleTime - specifies after how long a period of inactivity a connection is closed. Inactivity means that no data is transmitted over the connection. The default value is 100 seconds (100000).

To make these parameters easier to understand, let's combine them in one scenario. Suppose that DnsRefreshTimeout and MaxIdleTime are left unchanged at 120 and 100 seconds, respectively, and ConnectionLeaseTimeout is set to 60 seconds. The application establishes only one connection, over which it sends a request every 10 seconds.

With such settings, the connection will be closed every 60 seconds (ConnectionLeaseTimeout), even though it periodically transfers data. Closing and re-creating will occur in such a way as not to interfere with the correct execution of requests - if the time has expired and the request is still being executed, the connection will be closed after the request is completed. Each time the connection is re-established, the corresponding IP address will first be taken from the cache, and only if the lifetime of its resolution has expired (120 seconds), the system will send a request to the DNS server.

The MaxIdleTime parameter in this scenario will not play a role, since the connection is not inactive for more than 10 seconds.
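The timing of this scenario can be traced with a small simulation. This is a simplified model, not the real ServicePoint implementation: it assumes that a connection is replaced on the first request sent at or after the moment its lease expires, and that DNS is queried only when the cached entry is at least DnsRefreshTimeout old. The names LeaseTimeline and Simulate are made up for this example.

```csharp
using System;
using System.Collections.Generic;

public static class LeaseTimeline
{
    // All arguments are in seconds. Returns one line per newly created connection.
    public static List<string> Simulate(int leaseSeconds, int dnsRefreshSeconds,
                                        int requestIntervalSeconds, int totalSeconds)
    {
        var events = new List<string>();
        int connectionOpenedAt = -1;   // -1 means "no connection yet"
        int dnsResolvedAt = -1;        // -1 means "nothing cached yet"

        for (int t = 0; t <= totalSeconds; t += requestIntervalSeconds)
        {
            bool needConnection = connectionOpenedAt < 0 || t - connectionOpenedAt >= leaseSeconds;
            if (!needConnection) continue;   // the request reuses the open connection

            bool needDns = dnsResolvedAt < 0 || t - dnsResolvedAt >= dnsRefreshSeconds;
            if (needDns) dnsResolvedAt = t;
            connectionOpenedAt = t;
            events.Add($"t={t}s: new connection, " + (needDns ? "DNS query" : "IP from cache"));
        }
        return events;
    }

    public static void Main()
    {
        // ConnectionLeaseTimeout = 60 s, DnsRefreshTimeout = 120 s, a request every 10 s.
        foreach (var line in Simulate(60, 120, 10, 240))
            Console.WriteLine(line);
    }
}
```

In this model a new connection appears at 0, 60, 120, 180 and 240 seconds, and only every second one (0, 120, 240) triggers a DNS query; the others take the address from the cache. Real timings will drift somewhat, since requests take time and the lease is checked only when a request is sent.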

The optimal ratio of these parameters strongly depends on the specific situation and non-functional requirements:

If the IP addresses behind the domain name your application uses are not expected to change transparently, and the cost of network connections must be kept to a minimum, the default settings look like a good option.
If you need to switch between IP addresses in case of failures, you can set DnsRefreshTimeout to 0 and ConnectionLeaseTimeout to a non-negative value that suits you. The exact value depends on how quickly you need to switch to another IP. Obviously, you want the fastest possible reaction to a failure, but you need to find a value that gives an acceptable switching time without degrading throughput and response time through overly frequent re-creation of connections.
If you need an even quicker reaction to IP address changes, for example with round-robin balancing via DNS, you can try setting both DnsRefreshTimeout and ConnectionLeaseTimeout to 0, but this is extremely wasteful: for every request the DNS server is queried first, and then the connection to the target node is established again.
There may be situations where setting ConnectionLeaseTimeout to 0 with a non-zero DnsRefreshTimeout is useful, but I cannot come up with a suitable scenario. Logically, it means that a connection is re-created for each request, while IP addresses are taken from the cache whenever possible.

Below is an example of the code with which you can observe the behavior of the parameters described above:

var client = new HttpClient();
ServicePointManager.DnsRefreshTimeout = 120000;
var habrServicePoint = ServicePointManager.FindServicePoint(new Uri("https://habr.com"));
habrServicePoint.MaxIdleTime = 100000;
habrServicePoint.ConnectionLeaseTimeout = 60000;
while (true)
{
    client.GetAsync("https://habr.com").Wait();
    Thread.Sleep(10000);
}

While the test program is running, you can run netstat in a loop via PowerShell to monitor the connections that it establishes.

It should also be said how to manage these parameters in .NET Core. Settings from ServicePointManager, as in the case of ConnectionLimit, will not work. .NET Core has a special type of HTTP handler, SocketsHttpHandler, that implements two of the three parameters described above:

var handler = new SocketsHttpHandler();
handler.PooledConnectionLifetime = TimeSpan.FromSeconds(60); // Analog of ConnectionLeaseTimeout
handler.PooledConnectionIdleTimeout = TimeSpan.FromSeconds(100); // Analog of MaxIdleTime
var client = new HttpClient(handler);

There is no parameter in .NET Core that controls how long DNS records are cached. Tests show that no caching takes place: whenever a new connection is created, DNS resolution is performed again. So, to work correctly when the requested domain name can switch between different IP addresses, it is enough to set PooledConnectionLifetime to the desired value.
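Putting the pieces together for .NET Core, one possible arrangement is a single shared client whose handler limits the connection lifetime. This is a sketch rather than a recommended configuration: the class name SharedHttp, the helper CreateHandler and the one-minute lifetime are assumptions chosen for illustration.

```csharp
using System;
using System.Net.Http;

public static class SharedHttp
{
    // Hypothetical helper: builds a handler whose pooled connections are
    // retired after the given lifetime.
    public static SocketsHttpHandler CreateHandler(TimeSpan connectionLifetime)
    {
        return new SocketsHttpHandler
        {
            // A connection older than this is not reused for new requests;
            // its replacement is opened after a fresh DNS lookup.
            PooledConnectionLifetime = connectionLifetime
        };
    }

    // One client for the whole application, so connections are pooled, not leaked.
    public static readonly HttpClient Client =
        new HttpClient(CreateHandler(TimeSpan.FromMinutes(1)));
}
```

Callers would then use SharedHttp.Client everywhere instead of creating their own instances; the lifetime is the knob that trades switching speed against the cost of reconnecting.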

On top of that, it should be said that these problems did not go unnoticed by the developers at Microsoft, so starting with .NET Core 2.1 there is an HTTP client factory that solves some of them: https://docs.microsoft.com/en-us/dotnet/standard/microservices-architecture/implement-resilient-applications/use-httpclientfactory-to-implement-resilient-http-requests. In addition to managing the lifetime of connections, the new component makes it possible to create typed clients and offers some other useful things. That article and the links in it contain enough information and examples on using HttpClientFactory, so I will not go into its details here.
