HTTP/1.1 and HTTP/2: What’s the Difference?

Tram Ho

The other day, in a client meeting, an upgrade from HTTP/1.1 to HTTP/2 came up. Since I knew nothing about this topic, I did some research and found a very useful document, so I decided to translate it and post it on Viblo.


The Hypertext Transfer Protocol, or HTTP, is an application protocol that has been the de facto standard for communication on the World Wide Web since its invention in 1989. From the release of HTTP/1.1 in 1997 until recently, there were few revisions to the protocol. But in 2015, a reimagined version called HTTP/2 was introduced, offering several methods to reduce latency, especially when dealing with mobile platforms and server-intensive graphics and video. HTTP/2 has since grown in popularity, with some estimates suggesting that about a third of websites in the world support it. In this changing landscape, web developers can benefit from understanding the technical differences between HTTP/1.1 and HTTP/2, allowing them to make informed and efficient decisions about evolving best practices.

After reading this article, you will understand the key differences between HTTP/1.1 and HTTP/2, with a focus on the technical changes HTTP/2 has adopted to achieve a more efficient Web protocol.


To contextualize the specific changes HTTP/2 has made to HTTP/1.1, let’s first look at the historical development and basic behavior of each.

HTTP/1.1

Developed by Timothy Berners-Lee in 1989 as a communication standard for the World Wide Web, HTTP is a top-level application protocol that exchanges information between a client computer and a local or remote web server. In this process, a client sends a text-based request to the server by calling a method like GET or POST. In response, the server sends a resource such as an HTML page back to the client.

For example, let’s say you are visiting a website at a particular domain. When you navigate to its URL, your computer’s web browser sends an HTTP request as a text-based message, similar to the one shown here:

This request uses the GET method, which asks for data from the host server listed after Host:. In response to this request, the web server returns an HTML page to the requesting client, along with any images, stylesheets, or other resources called for in the HTML. Note that not all resources are returned to the client in the first call for data. Requests and responses go back and forth between the server and the client until the web browser has received all the resources needed to render the contents of the HTML page on your screen.
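A request of the kind described above can be sketched in a few lines of Python. This is only an illustration: the host `www.example.com`, the path `/index.html`, and the helper name `build_get_request` are assumptions, not values taken from the original article.

```python
def build_get_request(host: str, path: str = "/") -> bytes:
    """Build a minimal HTTP/1.1 GET request as plain-text bytes."""
    lines = [
        f"GET {path} HTTP/1.1",    # request line: method, target, version
        f"Host: {host}",           # the Host header names the target server
        "Connection: keep-alive",  # ask to keep the TCP connection open
        "",                        # blank line terminates the header section
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


# Hypothetical host and path, used only for illustration.
request = build_get_request("www.example.com", "/index.html")
print(request.decode("ascii"))
```

The point of the sketch is that an HTTP/1.1 message is human-readable text: a request line, header lines, and a blank line, each terminated by CRLF.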

You can think of this exchange of requests and responses as a single application layer of the Internet protocol stack, sitting on top of the transport layer (usually using the Transmission Control Protocol, or TCP) and the network layer (using the Internet Protocol, or IP):

There is a lot to discuss about the lower levels of this stack, but to gain a high-level understanding of HTTP/2 you only need to know this abstracted layer model and where HTTP fits into it.

With this basic overview of HTTP/1.1 out of the way, we can now move on to the early development of HTTP/2.

HTTP/2

HTTP/2 started out as the SPDY protocol, developed primarily at Google with the aim of reducing web page load latency using techniques like compression, multiplexing, and prioritization. This protocol served as the template for HTTP/2 when httpbis, the Hypertext Transfer Protocol working group of the IETF (Internet Engineering Task Force), put the standard together, culminating in the publication of HTTP/2 in May 2015. From the outset, many browsers supported this standardization effort, including Chrome, Opera, Internet Explorer, and Safari. Due in part to this browser support, the protocol has seen a significant adoption rate since 2015, with especially high rates among new websites.

From a technical point of view, one of the most important features distinguishing HTTP/1.1 from HTTP/2 is the binary framing layer, which can be thought of as part of the application layer in the Internet protocol stack. As opposed to HTTP/1.1, which keeps all requests and responses in plain text format, HTTP/2 uses the binary framing layer to encapsulate all messages in binary format, while still maintaining HTTP semantics, such as verbs, methods, and headers. An application-level API will still create messages in the conventional HTTP formats, but the underlying layer will then convert these messages to binary. This ensures that web applications created before HTTP/2 can continue to function normally when interacting with the new protocol.
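To make the idea of binary framing concrete, here is a minimal sketch of the fixed 9-byte header that precedes every HTTP/2 frame: a 24-bit payload length, an 8-bit type, an 8-bit flags field, and a reserved bit followed by a 31-bit stream identifier. The function names are illustrative assumptions and not part of any library.

```python
import struct

FRAME_TYPE_DATA = 0x0
FRAME_TYPE_HEADERS = 0x1


def pack_frame(frame_type: int, flags: int, stream_id: int, payload: bytes) -> bytes:
    """Prefix a payload with the 9-byte HTTP/2 frame header."""
    length = len(payload)
    if length >= 2 ** 24:
        raise ValueError("payload too large for a 24-bit length field")
    header = struct.pack(
        ">BHBBI",
        length >> 16,            # top 8 bits of the 24-bit length
        length & 0xFFFF,         # low 16 bits of the length
        frame_type,
        flags,
        stream_id & 0x7FFFFFFF,  # reserved high bit left as 0
    )
    return header + payload


def unpack_frame_header(data: bytes):
    """Return (length, type, flags, stream_id) from the first 9 bytes."""
    hi, lo, frame_type, flags, stream_id = struct.unpack(">BHBBI", data[:9])
    return (hi << 16) | lo, frame_type, flags, stream_id & 0x7FFFFFFF


# A DATA frame on stream 1 with flag 0x1 (END_STREAM) set.
frame = pack_frame(FRAME_TYPE_DATA, 0x1, 1, b"hello")
print(frame.hex())
print(unpack_frame_header(frame))
```

Unlike the text request of HTTP/1.1, nothing here is delimited by line breaks: the receiver reads nine bytes, learns the payload length and the owning stream, and then reads exactly that many bytes.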

Converting messages to binary allows HTTP/2 to try new approaches to data delivery that are not available in HTTP/1.1, a contrast that is at the root of the practical differences between the two protocols. The next section will look at the delivery model of HTTP/1.1, followed by the new models made possible by HTTP/2.

Delivery Models

As mentioned in the previous section, HTTP/1.1 and HTTP/2 share semantics, ensuring that the requests and responses traveling between the server and the client in both protocols reach their destinations as traditionally formatted messages with headers and bodies, using familiar methods like GET and POST. But while HTTP/1.1 transfers these as plain text messages, HTTP/2 encodes them into binary, allowing for significantly different delivery model possibilities. In this section, we will first briefly look at how HTTP/1.1 tries to optimize efficiency with its delivery model and the problems that arise from this, followed by the advantages of the binary framing layer of HTTP/2 and a description of how it prioritizes requests.

HTTP/1.1 – Pipelining and Head-of-Line Blocking

The first response the client receives to an HTTP GET request is usually not the fully rendered page. Instead, it contains links to the additional resources needed by the requested page. The client discovers that rendering the page in full requires these additional resources from the server only after it has downloaded the page. Therefore, the client has to make additional requests to retrieve these resources. In HTTP/1.0, the client had to break and re-create the TCP connection with every new request, which was costly in terms of both time and resources.

HTTP/1.1 addresses this problem by introducing persistent connections and pipelining. With persistent connections, HTTP/1.1 assumes that a TCP connection should be kept open unless it is directly told to close. This allows the client to send multiple requests along the same connection without waiting for a response to each one, significantly improving the performance of HTTP/1.1 over HTTP/1.0.

Unfortunately, there is a natural bottleneck to this optimization strategy. Since multiple data packets cannot pass each other when traveling to the same destination, there are situations in which a request at the head of the queue that cannot retrieve its required resource will block all the requests behind it. This is known as head-of-line (HOL) blocking, and it is a significant problem when optimizing connection efficiency in HTTP/1.1. Adding separate, parallel TCP connections can alleviate this problem, but there are limits to the number of concurrent TCP connections possible between a client and a server, and each new connection requires significant resources.
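The effect of head-of-line blocking can be shown with a small model. The sketch below assumes that each response becomes ready at some time on the server, but that a pipelined connection must deliver responses in request order; the timings are made-up numbers for illustration.

```python
def pipelined_delivery(ready_times):
    """Delivery times when responses must leave in request order."""
    delivered, previous = [], 0.0
    for ready in ready_times:
        previous = max(previous, ready)  # cannot overtake the response ahead
        delivered.append(previous)
    return delivered


# Hypothetical timings (seconds): the first resource is slow to produce.
slow_first = [5.0, 0.1, 0.2]
print(pipelined_delivery(slow_first))

# The same resources with the slow one requested last.
slow_last = [0.1, 0.2, 5.0]
print(pipelined_delivery(slow_last))
```

With the slow resource at the head of the queue, the two fast responses are held back until it completes, even though they were ready almost immediately.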

These issues were at the forefront of the minds of the HTTP/2 developers, who proposed using the aforementioned binary framing layer to fix them, a topic you will learn more about in the next section.

HTTP/2 – The Advantages of the Binary Framing Layer

In HTTP/2, the binary framing layer encodes requests and responses and cuts them up into smaller packets of information, greatly increasing the flexibility of data transfer.

Let’s take a closer look at how this works. In contrast to HTTP/1.1, which must use multiple TCP connections to reduce the effect of HOL blocking, HTTP/2 establishes a single connection object between the two machines. Within this connection there are multiple streams of data. Each stream consists of multiple messages in the familiar request/response format. Finally, each of these messages is split into smaller units called frames:

At the most granular level, the communication channel consists of a series of binary-encoded frames, each tagged to a particular stream. The identifying tags allow the connection to interleave these frames during transfer and reassemble them at the other end. The interleaved requests and responses can run in parallel without blocking the messages behind them, a process known as multiplexing. Multiplexing resolves the head-of-line blocking problem in HTTP/1.1 by ensuring that no message has to wait for another to finish. This also means that servers and clients can send concurrent requests and responses, allowing for greater control and more efficient connection management.
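The interleaving and reassembly described above can be sketched as follows. Frames are modeled as simple `(stream_id, chunk)` tuples rather than real HTTP/2 frames, and the message contents, frame size, and function names are assumptions chosen for illustration.

```python
from itertools import zip_longest


def split_into_frames(stream_id, message, frame_size):
    """Cut one message into (stream_id, chunk) frames."""
    return [(stream_id, message[i:i + frame_size])
            for i in range(0, len(message), frame_size)]


def interleave(*frame_lists):
    """Round-robin frames from several streams onto one connection."""
    wire = []
    for group in zip_longest(*frame_lists):
        wire.extend(frame for frame in group if frame is not None)
    return wire


def reassemble(wire):
    """Rebuild each message by grouping chunks on their stream ID."""
    messages = {}
    for stream_id, chunk in wire:
        messages[stream_id] = messages.get(stream_id, b"") + chunk
    return messages


# Two hypothetical responses sharing a single connection.
wire = interleave(
    split_into_frames(1, b"<html>...</html>", 4),
    split_into_frames(3, b"body{margin:0}", 4),
)
print(wire)
print(reassemble(wire))
```

Because every frame carries its stream ID, the receiver can rebuild both messages correctly even though their frames arrive mixed together on the wire.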

Because multiplexing allows the client to construct multiple streams in parallel, these streams only need to use a single TCP connection. Having a single persistent connection per origin improves upon HTTP/1.1 by reducing the memory and processing footprint throughout the network. This results in better network and bandwidth utilization and thus reduces the overall operating cost.

A single TCP connection also improves the performance of the HTTPS protocol, since the client and the server can reuse the same secured session for multiple requests and responses. In HTTPS, during the TLS or SSL handshake, both parties agree on the use of a single key throughout the session. If the connection breaks, a new session starts, requiring a newly generated key for further communication. Therefore, maintaining a single connection can greatly reduce the resources required for HTTPS performance. Note that, although the HTTP/2 specification does not make the use of the TLS layer mandatory, many major browsers only support HTTP/2 over HTTPS.

Although the multiplexing inherent in the binary framing layer solves certain HTTP/1.1 problems, multiple streams waiting for the same resource can still cause performance issues. However, the design of HTTP/2 takes this into account by using stream prioritization, a topic we will discuss in the next section.

HTTP/2 – Stream Prioritization

Stream prioritization not only solves the possible problem of requests competing for the same resource, but also allows developers to customize the relative weight of requests to better optimize application performance. In this section, we will break down this prioritization process to provide more detail on how you can take advantage of this feature of HTTP/2.

As you know by now, the binary framing layer organizes messages into parallel streams of data. When a client sends concurrent requests to a server, it can prioritize the responses it is requesting by assigning a weight between 1 and 256 to each stream. A higher number indicates a higher priority. In addition, the client also states each stream’s dependency on another stream by specifying the ID of the stream on which it depends. If the parent identifier is omitted, the stream is considered to be dependent on the root stream. This is illustrated in the following figure:

In the illustration, the channel contains six streams, each with a unique ID and associated with a specific weight. Stream 1 does not have a parent ID associated with it and is by default associated with the root node. All other streams have a parent ID marked. The resource allocation for each stream is based on the weight it holds and the dependencies it requires. For example, streams 5 and 6, which in the figure have been assigned the same weight and the same parent stream, will have the same priority for resource allocation.

The server uses this information to create a dependency tree, allowing the server to determine the order in which requests will retrieve their data. Based on the streams in the previous figure, the dependency tree would look like this:

In this dependency tree, stream 1 depends on the root stream and no other stream derives from the root, so all available resources are allocated to stream 1 ahead of the other streams. Since the tree indicates that stream 2 depends on the completion of stream 1, stream 2 will not proceed until the stream 1 task is completed. Now, let’s look at streams 3 and 4. Both of these streams depend on stream 2. As in the case of stream 1, stream 2 will get all the available resources ahead of streams 3 and 4. After stream 2 completes its task, streams 3 and 4 will get the resources; these are split in a 2:4 ratio as indicated by their weights, resulting in a larger share of the resources for stream 4. Finally, when stream 3 finishes, streams 5 and 6 will get the available resources in equal parts. This can happen before stream 4 has completed its task, even though stream 4 receives a larger share of the resources; streams at a lower level are allowed to start as soon as the streams they depend on at an upper level have finished.
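The allocation walk-through above can be approximated with a short model. This is a sketch under stated assumptions, not how any particular server implements prioritization: node 0 stands for the root, streams 5 and 6 are taken to depend on stream 3 as the text implies, the weights of streams 3 and 4 follow the 2:4 ratio, and all other weights are invented for illustration.

```python
def is_active(tree, finished, node):
    """A subtree still needs resources if the node or any descendant is unfinished."""
    if node not in finished:
        return True
    return any(is_active(tree, finished, child) for child in tree.get(node, []))


def allocate(tree, weights, finished, node=0, share=1.0):
    """Split `share` among the active children of `node` in proportion to weight.
    A finished stream hands its share down to its own dependents."""
    shares = {}
    children = [c for c in tree.get(node, []) if is_active(tree, finished, c)]
    total = sum(weights[c] for c in children)
    for child in children:
        child_share = share * weights[child] / total
        if child in finished:
            shares.update(allocate(tree, weights, finished, child, child_share))
        else:
            shares[child] = child_share
    return shares


# Parent -> dependents; node 0 is the root.
tree = {0: [1], 1: [2], 2: [3, 4], 3: [5, 6]}
# Weights for streams 3 and 4 follow the text; the others are assumed.
weights = {1: 1, 2: 1, 3: 2, 4: 4, 5: 1, 6: 1}

for finished in [set(), {1}, {1, 2}, {1, 2, 3}]:
    print(sorted(finished), allocate(tree, weights, finished))
```

In this model, once streams 1 and 2 are done, stream 3 receives one third and stream 4 two thirds of the resources; when stream 3 finishes, its third is divided equally between streams 5 and 6 while stream 4 keeps its share.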

As an application developer, you can set the weights in your requests based on your needs. For example, you can assign a lower priority to loading a high-resolution image after serving a thumbnail image on the web page. By providing this means of weight assignment, HTTP/2 gives developers greater control over web page rendering. The protocol also allows the client to change dependencies and reallocate weights at run time in response to user interaction. However, it’s important to note that a server may change the assigned priorities on its own if a certain stream is blocked from accessing a particular resource.

Buffer Overflow

In any TCP connection between two machines, both the client and the server have a certain amount of buffer space available to hold incoming requests that have not yet been processed. These buffers offer the flexibility to account for numerous or particularly large requests, as well as the uneven speeds of downstream and upstream connections.

However, there are situations in which a buffer is not enough. For example, the server might be pushing large amounts of data at a rate that the client application cannot cope with due to a limited buffer size or lower bandwidth. Likewise, when a client uploads a large image or a video to the server, the server buffer can overflow, causing some additional packets to be lost.

To avoid buffer overflows, a flow control mechanism must prevent the sender from overwhelming the receiver with data. This section will provide an overview of how HTTP/1.1 and HTTP/2 use different versions of this mechanism to handle flow control according to their different delivery models.

HTTP/1.1

In HTTP/1.1, flow control relies on the underlying TCP connection. When this connection is initiated, both the client and the server establish their buffer sizes using their system default settings. If the receiver’s buffer is partially filled with data, it tells the sender its receive window, i.e., the amount of available space that remains in its buffer. This receive window is advertised in a signal known as an ACK packet, which is the data packet the receiver sends to acknowledge that it received the opening signal. If this advertised receive window size is zero, the sender will send no more data until the client clears its internal buffer and then requests to resume data transmission. It is important to note here that receive windows based on the underlying TCP connection can only implement flow control at either end of the connection.

Because HTTP/1.1 relies on the transport layer to avoid buffer overflows, each new TCP connection requires a separate flow control mechanism. HTTP/2, however, multiplexes streams within a single TCP connection, and has to implement flow control in a different way.

HTTP/2

HTTP/2 multiplexes streams of data within a single TCP connection. As a result, receive windows at the level of the TCP connection are not sufficient to regulate the delivery of individual streams. HTTP/2 solves this problem by allowing the client and the server to implement their own flow controls, rather than relying on the transport layer. The application layer communicates the available buffer space, allowing the client and the server to set the receive window at the level of the multiplexed streams. This fine-scale flow control can be modified or maintained after the initial connection via a WINDOW_UPDATE frame.
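Per-stream flow control can be pictured as a credit counter that the sender spends and the receiver tops up. The class below is a simplified sketch: `StreamWindow` and its methods are illustrative names, the `window_update` method only mimics the effect of a WINDOW_UPDATE frame, and the default of 65,535 bytes mirrors the protocol's default initial window size.

```python
class StreamWindow:
    """Tracks how many bytes a sender may still send on one stream."""

    def __init__(self, initial_window: int = 65_535):
        self.available = initial_window

    def send(self, data: bytes) -> bytes:
        """Send as much as the window allows and return what was sent."""
        chunk = data[: self.available]
        self.available -= len(chunk)
        return chunk

    def window_update(self, increment: int) -> None:
        """Receiver grants more credit, as a WINDOW_UPDATE frame would."""
        if increment <= 0:
            raise ValueError("increment must be positive")
        self.available += increment


# A deliberately tiny window so the blocking behavior is visible.
window = StreamWindow(initial_window=10)
first = window.send(b"x" * 25)       # only 10 bytes fit
blocked = window.send(b"abc")        # window exhausted: nothing is sent
window.window_update(5)              # receiver has freed some buffer space
resumed = window.send(b"abcdefgh")   # 5 more bytes may now go out
print(len(first), len(blocked), resumed)
```

Each stream would hold its own counter of this kind, which is what lets a receiver slow down one stream without stalling the others on the same connection.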

Since this method controls data flow at the level of the application layer, the flow control mechanism does not have to wait for a signal to reach its ultimate destination before adjusting the receive window. Intermediary nodes can use the flow control settings information to determine their own resource allocations and modify them accordingly. In this way, each intermediary server can implement its own custom resource strategy, allowing for greater connection efficiency.

This flexibility in flow control can be advantageous when creating appropriate resource strategies. For example, the client can fetch the first scan of an image, display it to the user, and allow the user to preview it while fetching more critical resources. Once the client has fetched these critical resources, the browser will resume retrieving the rest of the image. Deferring the implementation of flow control to the client and the server can thus improve the perceived performance of web applications.

In terms of flow control and the stream prioritization mentioned in the previous section, HTTP/2 provides a more granular level of control that opens up the possibility of greater optimization. The next section will explain another method unique to the protocol that can enhance a connection in a similar way: predicting resource requests with server push.

Predicting Resource Requests

In a typical web application, the client sends a GET request and receives a page in HTML, usually the index page of the site. While examining the index page contents, the client may discover that it needs to fetch additional resources, such as CSS and JavaScript files, in order to fully render the page. The client determines that it needs these additional resources only after it receives the response to its initial GET request, and therefore must make additional requests to fetch these resources and finish putting the page together. These additional requests ultimately increase the connection load time.

However, there are solutions to this problem: since the server knows in advance that the client will request additional files, the server can save the client time by sending these resources to the client before it asks for them. HTTP/1.1 and HTTP/2 have different strategies for doing this, each described in the next section.

HTTP/1.1 – Resource Inlining

In HTTP/1.1, if the developer knows in advance which additional resources the client will need to render the page, they can use a technique called resource inlining to include the required resource directly within the HTML document that the server sends in response to the initial GET request. For example, if a client needs a specific CSS file to render a page, inlining that CSS file gives the client the resource it needs before it asks for it, reducing the total number of requests the client has to send.
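As a rough illustration of resource inlining, the following sketch replaces a stylesheet link with the stylesheet's contents. The page markup, the file name `style.css`, and the helper `inline_css` are all hypothetical; a real build tool would parse the HTML rather than rely on exact string matching.

```python
def inline_css(html: str, href: str, css: str) -> str:
    """Replace a <link> to `href` with an inline <style> block."""
    link_tag = f'<link rel="stylesheet" href="{href}">'
    return html.replace(link_tag, f"<style>{css}</style>")


# Hypothetical page and stylesheet, used only for illustration.
page = (
    '<html><head><link rel="stylesheet" href="style.css"></head>'
    "<body>Hi</body></html>"
)
css = "body{margin:0}"

inlined = inline_css(page, "style.css", css)
print(inlined)
```

After inlining, the browser no longer needs a second request for `style.css`, but the stylesheet is now part of this one HTML document and travels with it every time the page is served.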

But there are a few problems with resource inlining. Including a resource in the HTML document is a viable solution for smaller, text-based resources, but larger files in non-text formats can greatly increase the size of the HTML document. This can ultimately decrease the connection speed and negate the advantage originally gained from using this technique. Also, since the inlined resources are no longer separate from the HTML document, there is no mechanism for the client to decline a resource it already has, or to place a resource in its cache. If multiple pages require the resource, each new HTML document will have the same resource inlined in its code, leading to larger HTML documents and longer load times than if the resource had simply been cached in the beginning.

Therefore, a major drawback of resource inlining is that the client cannot separate the resource from the document. A finer level of control is needed to optimize the connection, a need that HTTP/2 seeks to meet with server push.

HTTP/2 – Server Push

In HTTP/2, this process starts when the server sends a PUSH_PROMISE frame to notify the client that it is going to push a resource. This frame includes only the header of the message, and lets the client know ahead of time which resource the server will push. If it already has the resource cached, the client can decline the push by sending a RST_STREAM frame in response. The PUSH_PROMISE frame also saves the client from sending a duplicate request to the server, since it knows which resource the server is going to push.
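The client's side of this exchange can be modeled as a small decision function. This is only a sketch of the logic described above: the function name, the `cached` and `promised` sets, and the `"ACCEPT"` return value are assumptions for illustration (accepting a push does not correspond to a dedicated frame), while `RST_STREAM` is the frame named in the text.

```python
def handle_push_promise(cached: set, promised: set, path: str) -> str:
    """Decide how a client reacts to a PUSH_PROMISE for `path`."""
    if path in cached:
        return "RST_STREAM"  # already cached: decline the pushed stream
    promised.add(path)       # remember it so no duplicate request is sent
    return "ACCEPT"          # illustrative label: let the push proceed


def needs_request(cached: set, promised: set, path: str) -> bool:
    """The client requests a resource only if it is neither cached nor promised."""
    return path not in cached and path not in promised


# Hypothetical client state.
cached = {"/logo.png"}
promised = set()

print(handle_push_promise(cached, promised, "/logo.png"))
print(handle_push_promise(cached, promised, "/style.css"))
print(needs_request(cached, promised, "/style.css"))
print(needs_request(cached, promised, "/app.js"))
```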

It’s important to note here that the emphasis of server push is on client control. If a client needs to adjust the priority of server push, or even disable it, it can send a SETTINGS frame at any time to modify this HTTP/2 feature.

While this feature has a lot of potential, server push is not always the answer to optimizing your web application. For example, some web browsers cannot always cancel pushed requests, even if the client already has the resource cached. If the client mistakenly allows the server to send a duplicate resource, the server push can use up the connection unnecessarily. In the end, server push should be used at the developer’s discretion. For more on how to use server push strategically and optimize web applications, see the PRPL pattern developed by Google. To learn more about the possible issues with server push, see Jake Archibald’s blog post HTTP/2 push is tougher than I thought.

Data Compression

A common method of optimizing web applications is to use compression algorithms to reduce the size of HTTP messages that travel between the client and the server. HTTP/1.1 and HTTP/2 both use this strategy, but there are implementation problems in the former that prohibit compressing the entire message. The following section will discuss why this is the case and how HTTP/2 can provide a solution.

HTTP/1.1

Programs like gzip have long been used to compress the data sent in HTTP messages, especially to reduce the size of CSS and JavaScript files. However, the header component of a message is always sent as plain text. Although each header is quite small, the burden of this uncompressed data weighs more and more heavily on the connection as more requests are made, particularly penalizing complicated, API-heavy web applications that require many different resources and therefore many different resource requests. Additionally, the use of cookies can sometimes make headers much larger, increasing the need for some kind of compression.
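The asymmetry described above, a compressed body behind an uncompressed header, can be seen in a small sketch using Python's standard `gzip` module. The stylesheet content and the header values are made up for illustration.

```python
import gzip

# Hypothetical, highly repetitive stylesheet body.
body = b"body { margin: 0; padding: 0; }\n" * 50
compressed_body = gzip.compress(body)

# In HTTP/1.1 the header section stays plain text even when the body is gzipped.
headers = (
    "HTTP/1.1 200 OK\r\n"
    "Content-Type: text/css\r\n"
    "Content-Encoding: gzip\r\n"
    f"Content-Length: {len(compressed_body)}\r\n"
    "\r\n"
).encode("ascii")

response = headers + compressed_body

print("body bytes:", len(body))
print("compressed body bytes:", len(compressed_body))
print("header bytes (sent uncompressed):", len(headers))
```

The body shrinks considerably, but the header bytes are transmitted in full with every response, which is the overhead that grows as the number of requests increases.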

To deal with this bottleneck, HTTP/2 uses HPACK compression to shrink the size of headers, a topic discussed in more detail in the next section.

HTTP/2

One of the themes that has come up again and again in HTTP/2 is its ability to use the binary framing layer to exhibit greater control over finer detail. The same is true when it comes to header compression. HTTP/2 can split headers from their data, resulting in a header frame and a data frame. The HTTP/2-specific compression program HPACK can then compress this header frame. This algorithm can encode the header metadata using Huffman coding, thereby greatly reducing its size. Additionally, HPACK can keep track of previously conveyed metadata fields and further compress them according to a dynamically altered index shared between the client and the server. For example, take the following two requests:
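The indexing idea can be illustrated with a toy model, assuming two hypothetical requests that differ only in their path. This is not the real HPACK wire format (it omits the static table, Huffman coding, and the actual index numbering); the class `HeaderTable` and all header values are invented for illustration.

```python
class HeaderTable:
    """Toy stand-in for HPACK's dynamic table (not the real wire format)."""

    def __init__(self):
        self.entries = []

    def encode(self, headers):
        """Emit an index for fields seen before, a literal otherwise."""
        out = []
        for field in headers:
            if field in self.entries:
                out.append(self.entries.index(field))  # send a small index
            else:
                self.entries.append(field)
                out.append(field)                      # send the literal once
        return out

    def decode(self, encoded):
        """Rebuild the header list, growing the table the same way."""
        headers = []
        for item in encoded:
            if isinstance(item, int):
                headers.append(self.entries[item])
            else:
                self.entries.append(item)
                headers.append(item)
        return headers


# Hypothetical requests: only the :path field differs.
request_1 = [(":method", "GET"), (":scheme", "https"),
             (":authority", "example.com"), (":path", "/index.html"),
             ("user-agent", "demo-client")]
request_2 = [(":method", "GET"), (":scheme", "https"),
             (":authority", "example.com"), (":path", "/style.css"),
             ("user-agent", "demo-client")]

client, server = HeaderTable(), HeaderTable()
wire_1 = client.encode(request_1)
wire_2 = client.encode(request_2)
print(wire_2)
print(server.decode(wire_1) == request_1, server.decode(wire_2) == request_2)
```

In this model the second request sends only the new path as a literal; every repeated field is replaced by a small index that both sides resolve against their synchronized tables.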



Source : Viblo