Web Application Caching and Performance
Caching improves the performance of web applications by reusing previously stored data, such as responses to network requests or the results of calculations. Thanks to the cache, the next time a client asks for the same data, the server can serve the request faster. Caching is an effective architectural pattern because most programs access the same data and instructions repeatedly. The technique is present at every level of a computing system: processors, hard drives, servers, and browsers all have caches.
Nick Karnik, the author of the article we are publishing in translation today, discusses the role of caching in web application performance, looking at caching mechanisms at various levels, starting from the lowest. He focuses on where data can be cached rather than on how caching works.
We believe that understanding caching systems, each of which contributes to how quickly an application responds, will broaden a web developer's horizons and help them build fast and reliable systems.
CPU cache
Let's start our discussion of caches at the lowest level: the processor. The processor cache is very fast memory that acts as a buffer between the processor (CPU) and random access memory (RAM). It stores the data and instructions that are accessed most often, so the processor can reach them almost instantly.
Processors also have a special kind of memory, the processor registers: a small store that offers extremely fast data access. Registers are the fastest memory a processor can work with; they sit as close as possible to its other components and have a very small capacity. Registers are sometimes called level-zero cache (L0 cache; L is short for Level).
In addition, processors have access to several more levels of cache memory: up to four, called the first-, second-, third-, and fourth-level caches (L1 - L4 cache). Whether the registers count as level zero or level one depends on the architecture of the processor and the motherboard. The system architecture also determines where the cache memory of each level physically sits: on the processor or on the motherboard.
Memory structure in some newer CPUs
Hard drive cache
Hard disk drives (HDDs), used for permanent data storage, are quite slow compared with RAM, which is designed for short-term storage. It should be noted, however, that permanent storage is getting faster as solid-state drives (SSDs) become more widespread.
In long-term storage systems, the disk cache (also called the disk buffer or cache buffer) is memory built into the hard disk that acts as a buffer between the processor and the physical disk.
Hard disk cache
Disk caches work on the assumption that when something is written to or read from the disk, that data is likely to be accessed again in the near future.
About the speed of hard drives and RAM
The difference between temporary storage in RAM and permanent storage on a hard disk shows up in access speed, in the cost of the media, and in their proximity to the processor.
RAM responds in tens of nanoseconds, while a hard drive needs tens of milliseconds. That is a difference of six orders of magnitude!
One millisecond equals one million nanoseconds
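The arithmetic behind the "six orders of magnitude" figure can be checked in a few lines. The concrete latencies below (10 ns and 10 ms) are assumed round numbers picked from the ranges given in the text, not measurements.

```python
# Illustrative latencies based on the text: "tens of nanoseconds" for RAM
# and "tens of milliseconds" for a hard disk. The exact values are assumptions.
NS_PER_MS = 1_000_000  # one millisecond is one million nanoseconds

ram_latency_ns = 10                # assumed: 10 ns per RAM access
disk_latency_ns = 10 * NS_PER_MS   # assumed: 10 ms per hard disk access

ratio = disk_latency_ns // ram_latency_ns
# For an exact power of ten, the number of digits minus one is the exponent.
orders_of_magnitude = len(str(ratio)) - 1

print(f"disk is {ratio:,} times slower than RAM (10^{orders_of_magnitude})")
```

With these assumed values the ratio is 1,000,000, that is, 10^6.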
Simple web server
Now that we have discussed the role of caching in the basic mechanisms of computer systems, let's look at an example that applies caching concepts to the interaction between a client, represented by a web browser, and a server that sends data in response to the client's requests. We start with a simple web server that reads data from the hard drive when responding to a request. Assume there are no dedicated caching systems between the client and the server. Here is how it looks.
A simple web server
In this setup the client talks to the server directly, and the server handles each request on its own: it reads data from the hard disk and sends it to the client. Even so, a cache is still involved, since the disk's buffer is used when reading from the disk.
On the first request, the hard drive checks its cache, finds nothing there, and reports a so-called "cache miss". The data is then read from the disk itself and placed in the cache, on the assumption that it may be needed again.
On subsequent requests for the same data, the cache lookup succeeds - a so-called "cache hit". The response is served from the disk buffer until the data there is overwritten, at which point another request for the same data results in a cache miss again.
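The miss, hit, and overwrite sequence described above can be sketched with a toy model. The `DiskBuffer` class, its capacity of two blocks, and the least-recently-used eviction rule are all assumptions made for illustration; real disk firmware may use a different policy.

```python
from collections import OrderedDict


class DiskBuffer:
    """Toy model of a disk buffer: a small LRU cache in front of slow storage."""

    def __init__(self, storage, capacity):
        self.storage = storage        # dict standing in for the physical disk
        self.capacity = capacity      # how many blocks fit in the buffer
        self.buffer = OrderedDict()   # block id -> data, oldest entry first
        self.hits = 0
        self.misses = 0

    def read(self, block_id):
        if block_id in self.buffer:
            self.hits += 1
            self.buffer.move_to_end(block_id)   # mark as most recently used
            return self.buffer[block_id]
        self.misses += 1
        data = self.storage[block_id]           # slow path: read the disk itself
        self.buffer[block_id] = data
        if len(self.buffer) > self.capacity:
            self.buffer.popitem(last=False)     # overwrite the oldest entry
        return data


disk = DiskBuffer({"a": b"index.html", "b": b"style.css", "c": b"app.js"}, capacity=2)
disk.read("a")   # miss: the buffer is empty
disk.read("a")   # hit: served from the buffer
disk.read("b")   # miss
disk.read("c")   # miss, and "a" is evicted to make room
disk.read("a")   # miss again: "a" was overwritten
print(disk.hits, disk.misses)
```

The run ends with one hit and four misses, the last miss being the one caused by the overwrite.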
Database caching
Let's make the example more complex by adding a database. Database queries can be slow and resource-intensive, since the database server has to do some computation to produce a response. If queries are repeated, caching them with the database's own tools helps reduce response time. Caching is also useful when several machines work with the same database and run the same queries.
A simple web server with a database
Most database servers ship with caching enabled and tuned to sensible defaults. However, there are many settings that can be adjusted so that the database subsystem better fits the needs of a particular application.
Web server response caching
Let's develop the example further. The web server, previously treated as a single entity, is now split into two parts. One of them, the web server proper, handles interaction with clients and talks to a server application, which in turn works with the storage systems. The web server can be configured to cache responses, so it does not have to keep sending similar requests to the server application. Likewise, the main application can cache parts of its own responses to resource-intensive database queries or to frequently requested files.
Response Cache and Application Cache
Web server responses are cached in RAM. The application cache can be stored either locally, in memory, or on a dedicated caching server running a data store such as Redis, which keeps its data in RAM.
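As a rough illustration of an application cache kept locally in memory, here is a minimal sketch of the pattern often called cache-aside. The `ApplicationCache` class and the `load_user_from_db` function are hypothetical names invented for this example; a plain dictionary stands in for the cache, where a shared caching server could be used instead.

```python
class ApplicationCache:
    """Cache-aside helper: look in the cache first, fall back to the loader."""

    def __init__(self, loader):
        self.loader = loader   # slow function, e.g. a database query
        self.store = {}        # in-process cache; a shared server could replace it

    def get(self, key):
        if key not in self.store:
            self.store[key] = self.loader(key)   # miss: load and remember
        return self.store[key]

    def invalidate(self, key):
        self.store.pop(key, None)   # call after the underlying data changes


db_calls = []


def load_user_from_db(user_id):
    """Hypothetical slow query; records each call so the effect is visible."""
    db_calls.append(user_id)
    return {"id": user_id, "name": f"user-{user_id}"}


cache = ApplicationCache(load_user_from_db)
cache.get(1)
cache.get(1)          # served from memory, no second query
cache.invalidate(1)
cache.get(1)          # reloaded after invalidation
print(len(db_calls))
```

Three lookups result in only two calls to the loader. The explicit `invalidate` step matters: without it the application would keep serving stale data after a write.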
Memoization of functions
Now let's look at optimizing the server application through memoization. This is a form of caching used to speed up resource-intensive functions. The full computation for a given set of inputs is performed only once; subsequent calls with the same inputs immediately return the previously computed result. Memoization is implemented with so-called “lookup tables” that store keys and values. The keys correspond to the function's inputs, and the values to the results the function returns for those inputs.
Memoizing a function using a lookup table
Memoization is a common technique for improving program performance. However, it may not help much with resource-intensive functions that are rarely called, or with functions that are already fast without it.
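A lookup table of this kind can be sketched as a small decorator. The `memoize` helper and the Fibonacci function below are illustrative choices, not taken from the original article; Python's standard library offers `functools.lru_cache` for the same purpose.

```python
import functools


def memoize(func):
    """Cache results of `func` in a lookup table keyed by its arguments."""
    table = {}   # key: tuple of arguments, value: computed result

    @functools.wraps(func)
    def wrapper(*args):
        if args not in table:
            table[args] = func(*args)   # compute once
        return table[args]              # reuse on every later call

    wrapper.table = table   # exposed only so the example can inspect it
    return wrapper


@memoize
def fib(n):
    """Naive recursive Fibonacci; exponential time without memoization."""
    return n if n < 2 else fib(n - 1) + fib(n - 2)


print(fib(50))
print(len(fib.table))
```

The call returns 12586269025 and leaves 51 entries in the table, one for each input from 0 to 50. This sketch only works for hashable positional arguments, and the table grows without bound, which is one reason production code usually limits the cache size.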
Browser Caching
Now let's move to the client side and talk about caching in browsers. Every browser has an implementation of an HTTP cache (also called a web cache), designed to temporarily store resources fetched from the Internet, such as HTML pages, JavaScript files, and images.
This cache is used when the server response includes correctly configured HTTP headers that tell the browser whether and for how long it may cache the response.
This is a very useful technology that benefits everyone involved in the data exchange:
- The user experience improves, since resources load very quickly from the local cache. The response time does not include the round-trip time (RTT) between client and server, because the request never goes out to the network.
- The load on the server application and other server components responsible for processing requests is reduced.
- Some network capacity is freed up for other Internet users, and money is saved on traffic costs.
Browser Caching
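To make the role of the HTTP headers more concrete, here is a deliberately simplified sketch of how a cache might decide whether a stored response can be reused. It looks only at the `max-age` and `no-store` directives of the `Cache-Control` header; real browsers also handle `Expires`, `no-cache`, validators such as `ETag`, and heuristics. The function names are hypothetical.

```python
def parse_max_age(cache_control):
    """Return the max-age value in seconds, or None if reuse is not allowed."""
    directives = [d.strip().lower() for d in cache_control.split(",")]
    if "no-store" in directives:
        return None
    for directive in directives:
        if directive.startswith("max-age="):
            return int(directive.split("=", 1)[1])
    return None   # no explicit lifetime: this sketch treats it as not cacheable


def is_fresh(cache_control, age_seconds):
    """True if a response stored `age_seconds` ago may be reused as-is."""
    max_age = parse_max_age(cache_control)
    return max_age is not None and age_seconds < max_age


header = "public, max-age=3600"
print(is_fresh(header, 120))      # stored two minutes ago
print(is_fresh(header, 7200))     # stored two hours ago
print(is_fresh("no-store", 0))    # the server forbids storing the response
```

The three calls print True, False, and False: a one-hour lifetime covers a two-minute-old copy but not a two-hour-old one, and `no-store` rules out reuse entirely.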
Caching and proxies
In computer networks, proxies can be dedicated hardware or applications. They act as intermediaries between clients and the servers that store the data these clients need. Caching is one of the tasks they perform. Let's look at the different types of proxies.
▍ Gateways
A gateway is a proxy server that forwards incoming requests or outgoing responses without modifying them. Such proxies are also called tunneling proxies, web proxies, or application-level proxies. They are usually shared, for example by all clients behind the same firewall, which makes them well suited to caching requests.
▍ Forward proxies
A forward proxy (often called simply a proxy server) is usually installed on the client side. A web browser configured to use a forward proxy sends its outgoing requests to that proxy, which then forwards them to the target server on the Internet. One advantage of forward proxies is that they protect client data (although for anonymity on the Internet, a VPN is the safer choice).
▍ Web accelerators
A web accelerator is a proxy server that reduces site access time. It does so by requesting in advance the documents that clients are likely to need in the near future. Such servers can also compress documents, speed up encryption operations, reduce the quality and size of images, and so on.
▍ Reverse proxies
A reverse proxy is usually a server located in the same place as the web server it works with. Reverse proxies are designed to prevent direct access to servers on private networks. They are used to balance load across several internal servers and to provide SSL authentication or request caching. Such proxies perform server-side caching, helping the main servers handle a large number of requests.
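The combination of load balancing and server-side caching can be sketched with a toy reverse proxy. Everything here is hypothetical: the `ReverseProxy` class, the round-robin rotation, and the backends modeled as plain functions. A real reverse proxy works with network connections and honours cache headers.

```python
import itertools


class ReverseProxy:
    """Toy reverse proxy: caches responses and rotates between backends."""

    def __init__(self, backends):
        self.backends = itertools.cycle(backends)   # round-robin order
        self.cache = {}                             # path -> response

    def handle(self, path):
        if path in self.cache:
            return self.cache[path]                 # answered without a backend
        backend = next(self.backends)
        response = backend(path)
        self.cache[path] = response
        return response


def make_backend(name, log):
    """Hypothetical internal server that records the paths it served."""
    def backend(path):
        log.append((name, path))
        return f"{path} rendered by {name}"
    return backend


log = []
proxy = ReverseProxy([make_backend("app-1", log), make_backend("app-2", log)])
for path in ["/home", "/about", "/home", "/contact"]:
    proxy.handle(path)
print(log)
```

Of the four requests, only three reach the internal servers: the repeated request for "/home" is answered from the cache, and the remaining ones alternate between "app-1" and "app-2".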
▍ Edge caching
Reverse proxies are located close to the servers. There is also an approach in which caching servers are placed as close as possible to the consumers of the data. This is known as edge caching, and content delivery networks (CDNs) are its typical example. For instance, when you visit a popular website and download some static data, that data gets cached. Every subsequent user who requests the same data receives it from the caching server until its caching period expires. To decide whether cached data is still current, these servers rely on the origin servers that hold the source data.
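The idea of a caching period can be sketched as a small time-to-live (TTL) cache sitting in front of an origin server. The `EdgeCache` class, the 60-second TTL, and the injected clock are assumptions made so the example runs instantly; a real CDN node takes its lifetimes from the origin's response headers.

```python
class EdgeCache:
    """Toy CDN node: serves cached copies until their TTL runs out."""

    def __init__(self, fetch_from_origin, ttl_seconds, clock):
        self.fetch_from_origin = fetch_from_origin
        self.ttl_seconds = ttl_seconds
        self.clock = clock     # injected so the example needs no real waiting
        self.entries = {}      # url -> (expires_at, body)

    def get(self, url):
        now = self.clock()
        entry = self.entries.get(url)
        if entry is not None and now < entry[0]:
            return entry[1]                     # still within its caching period
        body = self.fetch_from_origin(url)      # expired or never seen
        self.entries[url] = (now + self.ttl_seconds, body)
        return body


origin_requests = []


def origin(url):
    """Hypothetical origin server that records every request it receives."""
    origin_requests.append(url)
    return f"contents of {url}"


current_time = [0]   # fake clock, in seconds
edge = EdgeCache(origin, ttl_seconds=60, clock=lambda: current_time[0])

edge.get("/logo.png")      # first visitor: fetched from the origin
edge.get("/logo.png")      # next visitor: served from the edge
current_time[0] = 61
edge.get("/logo.png")      # caching period expired: fetched again
print(len(origin_requests))
```

Three visitors generate only two requests to the origin; the second fetch happens because the cached copy has outlived its 60-second period.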
Proxies in the infrastructure of data exchange between the client and server
Summary
In this article, we looked at the various levels of caching involved when a client and a server exchange information. Web applications cannot respond to user actions instantly, in part because those actions require exchanging data with the application's servers and performing some computation before a response is sent. The time needed to deliver data from the server to the client includes the time to locate the data on disk, network latency, request queuing, bandwidth throttling, and much more. Since all of this can happen on many machines between the client and the server, these delays can add up and seriously increase the overall response time.
A properly configured caching system can significantly improve overall server performance. Caches reduce the delays that inevitably arise when data travels over the network, help save network traffic, and as a result shorten the time the browser needs to display a page requested from the server.
Dear readers! What caching technologies do you use in your projects?