How a web browser works (with pictures)

Short description

This article explains how web browsers work, starting with navigation and DNS lookup, followed by establishing a TCP connection and, if HTTPS is used, a TLS connection. Once a connection is established, the browser fetches server resources using HTTP requests and responses, and then parses HTML and CSS to create the Document Object Model (DOM) tree and the CSS Object Model (CSSOM) respectively. Finally, the browser executes JavaScript using the JavaScript engine and renders the page by building a render tree and then performing layout, painting, and layering and compositing. Every time there is a change in the DOM, the affected rendering stages are repeated.

Illustration courtesy of Growtika

Browsers have become a part of our daily life. But have you ever wondered how they actually work?

This article will unveil the magic that hides behind the scenes of web browsers.

Let’s get started!

1. Navigation

Navigation is the first step in loading a web page. It occurs when a user enters a URL in the address bar or clicks on a link.

DNS lookup

The first step is to find the IP address where the resources are hosted. This is done using a DNS lookup.

A DNS (Domain Name System) server maps a website's hostname (e.g. www.example.com) to its corresponding IP address. It holds a database of domain names and the public IP addresses they correspond to.

For example, if you visit www.example.com, the DNS server will return its IP address (such as 93.184.216.34).
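The lookup described above can be sketched with Python's standard `socket` module, which asks the operating system's resolver for the addresses of a hostname. This is only an illustration of the idea, not how a browser implements it; the `resolve` helper name is an assumption made for this example.

```python
import socket


def resolve(hostname: str) -> list:
    """Return the IP addresses the system resolver reports for hostname."""
    infos = socket.getaddrinfo(hostname, None, proto=socket.IPPROTO_TCP)
    addresses = []
    for _family, _type, _proto, _canonname, sockaddr in infos:
        ip = sockaddr[0]
        if ip not in addresses:  # drop duplicates, keep resolver order
            addresses.append(ip)
    return addresses


# A numeric address needs no network lookup and resolves to itself.
print(resolve("127.0.0.1"))
```

With network access, `resolve("www.example.com")` would return whatever addresses that hostname currently maps to.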

TCP three-way handshake

The next step is to establish a TCP connection with the server. This is done using a three-way TCP handshake.

Translator’s note. TCP (Transmission Control Protocol) is a standard that defines how to establish and maintain a network connection through which applications can exchange data reliably.

  1. First, the client sends a request to open a connection with the server using a SYN (SYNchronize) packet.

  2. Then the server responds with a SYN-ACK (SYNchronize-ACKnowledge) packet to confirm the request and invite the client to open the connection.

  3. Finally, the client sends an ACK packet to the server to acknowledge the server's response, and the connection is established.
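The three steps above can be modelled as a small simulation. This is a sketch that only tracks the flags and the sequence and acknowledgement numbers; the `Segment` class and the initial sequence numbers are illustrative assumptions, and in practice the operating system performs the handshake when a socket connects.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Segment:
    sender: str
    flags: str
    seq: int
    ack: Optional[int] = None


def three_way_handshake(client_isn: int, server_isn: int) -> list:
    """Return the three segments exchanged to open a connection."""
    # 1. Client asks to open a connection and announces its sequence number.
    syn = Segment("client", "SYN", seq=client_isn)
    # 2. Server acknowledges the SYN (ack = client seq + 1) and sends its own number.
    syn_ack = Segment("server", "SYN-ACK", seq=server_isn, ack=syn.seq + 1)
    # 3. Client acknowledges the server's number (ack = server seq + 1).
    ack = Segment("client", "ACK", seq=syn.seq + 1, ack=syn_ack.seq + 1)
    return [syn, syn_ack, ack]


for segment in three_way_handshake(client_isn=1000, server_isn=5000):
    print(segment)
```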

TLS handshake

If the website uses HTTPS (Hypertext Transfer Protocol Secure), the next step is to establish a TLS connection via a TLS handshake.

Translator’s note. TLS (Transport Layer Security) encrypts data transmitted over the Internet so that eavesdroppers and hackers cannot see what you are transmitting, which is especially useful for private and confidential information such as passwords, credit card numbers and personal correspondence.

At this stage, several more messages are exchanged between the browser and the server. The steps below follow the classic RSA key exchange used with TLS 1.2; TLS 1.3 shortens the handshake and no longer uses this key exchange.

  1. Client hello: the browser sends a message to the server with the TLS version and the set of cipher suites it supports, as well as a string of random bytes called the “client random”.

Translator’s note. The “client random” (like the “server random” below) is a string of 32 bytes. According to the TLS 1.2 specification, the first 4 bytes are the current date and time, and the remaining 28 bytes are randomly generated.

  2. Server hello and certificate: in response to the client hello, the server sends its own hello message containing its SSL certificate, the cipher suite it has chosen and a string of random bytes called the “server random”.

  3. Authentication: the browser verifies the SSL certificate received from the server against the certificate authority that issued it. In this way, the browser can be sure that the server really is who it claims to be.

  4. Premaster secret: the browser sends another string of randomly generated bytes, called the “premaster secret”, encrypted with the public key from the server’s SSL certificate. The premaster secret can only be decrypted with the server’s private key.

  5. Private key used: the server decrypts the premaster secret with its private key.

  6. Session keys created: the browser and the server each generate the same session keys from the “client random”, the “server random” and the “premaster secret”.

  7. Client finished: the browser sends a message to the server saying that its side of the handshake is complete.

  8. Server finished: the server sends a message to the client saying that its side of the handshake is complete.

  9. Secure symmetric encryption achieved: the handshake is complete, and further communication is encrypted with the session keys.
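To make the key-creation step more concrete, here is a minimal sketch of how both sides can derive the same 48-byte master secret from the two randoms and the premaster secret, following the SHA-256 based pseudorandom function of TLS 1.2. The function names are illustrative assumptions, and a real handshake goes on to expand this secret into the actual encryption keys, which is omitted here.

```python
import hashlib
import hmac
import os
import struct
import time


def make_hello_random() -> bytes:
    """32-byte hello random in the TLS 1.2 layout: 4-byte Unix time + 28 random bytes."""
    return struct.pack("!I", int(time.time()) & 0xFFFFFFFF) + os.urandom(28)


def p_sha256(secret: bytes, seed: bytes, length: int) -> bytes:
    """Expand secret and seed into `length` bytes by chaining HMAC-SHA256."""
    out = b""
    a = seed  # A(0)
    while len(out) < length:
        a = hmac.new(secret, a, hashlib.sha256).digest()  # A(i) = HMAC(secret, A(i-1))
        out += hmac.new(secret, a + seed, hashlib.sha256).digest()
    return out[:length]


def master_secret(premaster: bytes, client_random: bytes, server_random: bytes) -> bytes:
    """Derive the 48-byte master secret shared by browser and server."""
    seed = b"master secret" + client_random + server_random
    return p_sha256(premaster, seed, 48)


client_random = make_hello_random()
server_random = make_hello_random()
# Premaster secret as sent with RSA key exchange: 2 version bytes + 46 random bytes.
premaster = b"\x03\x03" + os.urandom(46)

# Both sides hold the same three inputs, so they compute the same secret.
browser_side = master_secret(premaster, client_random, server_random)
server_side = master_secret(premaster, client_random, server_random)
print(browser_side == server_side)
```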

Now you can start sending requests to the server and receiving data from it.

2. Fetching resources

After the connection is established, the browser can start fetching (that is, downloading) resources from the server.

HTTP request

If you already have experience in web development, you’ve probably come across the concept of HTTP requests.

HTTP requests are used to fetch server resources. Each request must specify a URL and a request method (for example GET, POST, PUT or DELETE). The browser also adds headers to the request to provide additional information.

The first request to the server is usually a GET request to retrieve an HTML file.

HTTP response

The server then responds with an appropriate HTTP response. The response contains a status code, headers, and body.
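As a rough illustration of the request and response described in this section, the sketch below builds the text of a simple HTTP/1.1 GET request and splits a response into status code, headers and body. The helper names, the chosen headers and the sample response are assumptions made for the example; real browsers send many more headers and handle far more cases.

```python
def build_get_request(host: str, path: str = "/") -> bytes:
    """Build a minimal HTTP/1.1 GET request."""
    lines = [
        f"GET {path} HTTP/1.1",  # method, path and protocol version
        f"Host: {host}",         # required header in HTTP/1.1
        "Accept: text/html",
        "Connection: close",
        "",                      # blank line ends the header section
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


def parse_response(raw: bytes):
    """Split a raw HTTP response into (status code, headers, body)."""
    head, _, body = raw.partition(b"\r\n\r\n")
    status_line, *header_lines = head.decode("iso-8859-1").split("\r\n")
    _version, code, _reason = status_line.split(" ", 2)
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return int(code), headers, body


# Hypothetical response, written by hand for this example.
sample = b"HTTP/1.1 200 OK\r\nContent-Type: text/html\r\n\r\n<p>Hello World!</p>"
status, headers, body = parse_response(sample)
print(build_get_request("www.example.com").decode("ascii"))
print(status, headers, body)
```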

3. HTML parsing

Now for the main part. After the browser has received the HTML file, it parses it to create a DOM (Document Object Model) tree.

This is done by the browser engine, which is the core of the browser (e.g. Gecko for Firefox, WebKit for Safari, Blink for Chrome).

Here is an example HTML file:

<!DOCTYPE html>
<html>
  <head>
    <title>Page Title</title>
  </head>
  <body>
    <p>Hello World!</p>
  </body>
</html>

Tokenization

The first step towards displaying a web page is to tokenize the HTML file. Tokenization is the process of breaking a string of characters into chunks that are meaningful to the browser, called tokens.

Tokens are the basic building blocks of the DOM tree.
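A simplified view of tokenization can be sketched with Python's built-in `html.parser` module. The token names used here (`StartTag`, `EndTag`, `Character`, `DOCTYPE`) and the `tokenize` helper are assumptions for illustration; a real browser tokenizer follows a much more detailed state machine and emits text character by character.

```python
from html.parser import HTMLParser


class Tokenizer(HTMLParser):
    """Collects a flat list of tokens instead of building a tree."""

    def __init__(self):
        super().__init__()
        self.tokens = []

    def handle_decl(self, decl):
        self.tokens.append(("DOCTYPE", decl))

    def handle_starttag(self, tag, attrs):
        self.tokens.append(("StartTag", tag))

    def handle_endtag(self, tag):
        self.tokens.append(("EndTag", tag))

    def handle_data(self, data):
        if data.strip():  # skip whitespace between tags
            self.tokens.append(("Character", data.strip()))


def tokenize(html: str) -> list:
    tokenizer = Tokenizer()
    tokenizer.feed(html)
    tokenizer.close()
    return tokenizer.tokens


for token in tokenize("<body><p>Hello World!</p></body>"):
    print(token)
```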

Building a DOM tree

Tree construction is the process of converting tokens into nodes and linking them into a tree-like structure called the DOM tree.

The DOM tree is a tree-like data structure whose nodes represent the elements and text of an HTML document.
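One common way to turn tags into a tree is to keep a stack of currently open elements, which the following sketch does on top of Python's `html.parser`. The `Node` class, the `build_dom` helper and the short list of void elements are assumptions for this example; the error recovery that real browsers perform on malformed HTML is left out.

```python
from dataclasses import dataclass, field
from html.parser import HTMLParser

VOID_ELEMENTS = {"br", "hr", "img", "input", "link", "meta"}  # never have children


@dataclass
class Node:
    name: str
    text: str = ""
    children: list = field(default_factory=list)


class TreeBuilder(HTMLParser):
    def __init__(self):
        super().__init__()
        self.root = Node("#document")
        self.open_elements = [self.root]  # stack; the last item is the current parent

    def handle_starttag(self, tag, attrs):
        node = Node(tag)
        self.open_elements[-1].children.append(node)
        if tag not in VOID_ELEMENTS:
            self.open_elements.append(node)

    def handle_endtag(self, tag):
        # Close the nearest open element with this name (index 0 is the document).
        for i in range(len(self.open_elements) - 1, 0, -1):
            if self.open_elements[i].name == tag:
                del self.open_elements[i:]
                break

    def handle_data(self, data):
        if data.strip():
            self.open_elements[-1].children.append(Node("#text", data.strip()))


def build_dom(html: str) -> Node:
    builder = TreeBuilder()
    builder.feed(html)
    builder.close()
    return builder.root


def dump(node: Node, depth: int = 0) -> None:
    print("  " * depth + node.name + (f": {node.text}" if node.text else ""))
    for child in node.children:
        dump(child, depth + 1)


dump(build_dom("<html><head><title>Page Title</title></head>"
               "<body><p>Hello World!</p></body></html>"))
```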

Note: if the page requires any external resources, they are handled as follows:

  1. Non-blocking resources are loaded in parallel. Example: images.

  2. Deferred resources are loaded in parallel, but are executed after the DOM tree has been constructed. Example: scripts with the defer attribute. CSS files are likewise downloaded without pausing the parser, although the page is not rendered until they have been processed.

  3. Blocking resources are loaded and executed sequentially, pausing the parser. Example: scripts without the defer or async attribute.

4. Parsing CSS

The browser also parses the CSS, from style elements and linked CSS files, to create the CSSOM (CSS Object Model).

The process is analogous to building the DOM tree: the CSS text is tokenized, and the tokens are used to construct the CSSOM.
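As a toy illustration of turning CSS text into a structured model, the sketch below parses flat rules into selector and declaration pairs. The `parse_css` helper is an assumption for this example and handles only simple rules; a real CSSOM is a richer object tree and also deals with nesting, at-rules, comments and the cascade.

```python
import re


def parse_css(css: str) -> list:
    """Parse flat CSS rules into a list of (selector, declarations) pairs."""
    rules = []
    # Each match is "selector { property: value; ... }" without nested braces.
    for selector, block in re.findall(r"([^{}]+)\{([^{}]*)\}", css):
        declarations = {}
        for declaration in block.split(";"):
            if ":" in declaration:
                prop, _, value = declaration.partition(":")
                declarations[prop.strip()] = value.strip()
        rules.append((selector.strip(), declarations))
    return rules


for selector, declarations in parse_css("body { margin: 0; } p { color: red; font-size: 16px }"):
    print(selector, declarations)
```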

5. JavaScript execution

As mentioned earlier, when the parser encounters a blocking script, it pauses construction of the DOM tree until the script has been loaded and executed. A deferred script, by contrast, is loaded in parallel and executed after the DOM tree has been fully built.

Regardless of when the script runs, it is processed by the JavaScript engine, which, like the browser engine, depends on the browser being used (for example, V8 in Chrome, SpiderMonkey in Firefox and JavaScriptCore in Safari).

JIT compilation

Assuming that you are already familiar with the concepts of interpreters and compilers, let’s talk about the JavaScript engine.

The JavaScript engine uses a hybrid approach called JIT (just-in-time) compilation.

In a compiled language such as C, compilation is done ahead of time, that is, before the code is executed. With JIT compilation, code is compiled while the program is running, just before it is needed.
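The hybrid idea can be imitated in a few lines: start by interpreting, count how often a function runs, and compile it once it becomes "hot". In this sketch the expression format, the `JitFunction` class and the threshold of 3 calls are all assumptions for illustration, and "compiling" here means producing a Python function with `eval`, not machine code as a real JavaScript engine would.

```python
HOT_THRESHOLD = 3  # assumed value; real engines use their own heuristics


def interpret(expr, x):
    """Slow path: walk the expression tree on every call."""
    if expr == "x":
        return x
    if isinstance(expr, (int, float)):
        return expr
    op, left, right = expr
    a, b = interpret(left, x), interpret(right, x)
    return a + b if op == "+" else a * b


def to_source(expr) -> str:
    """Translate the expression tree into Python source code."""
    if expr == "x":
        return "x"
    if isinstance(expr, (int, float)):
        return repr(expr)
    op, left, right = expr
    return f"({to_source(left)} {op} {to_source(right)})"


class JitFunction:
    def __init__(self, expr):
        self.expr = expr
        self.calls = 0
        self.compiled = None

    def __call__(self, x):
        if self.compiled is not None:
            return self.compiled(x)  # fast path: already compiled
        self.calls += 1
        if self.calls >= HOT_THRESHOLD:
            # The function is "hot": compile it once, reuse it from now on.
            self.compiled = eval(f"lambda x: {to_source(self.expr)}")
        return interpret(self.expr, x)


f = JitFunction(("+", ("*", "x", 2), 1))  # represents x * 2 + 1
results = [f(i) for i in range(5)]
print(results, f.compiled is not None)
```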

6. Rendering

Finally, it’s time to render the page. For rendering, the browser uses the DOM tree and CSSOM.

Building a render tree

The first step is to build a render tree. The render tree combines the DOM tree with the styles from the CSSOM and contains only the elements that are visible on the page. For example, elements with display: none are left out.
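A minimal sketch of this filtering step is shown below. It assumes each node already carries its computed style as a dictionary, and the `Node` class and the list of non-visual tags are illustrative assumptions rather than a browser's actual data structures.

```python
from dataclasses import dataclass, field

NON_VISUAL = {"head", "title", "meta", "link", "script", "style"}  # never drawn


@dataclass
class Node:
    name: str
    style: dict = field(default_factory=dict)  # assumed: styles already resolved
    children: list = field(default_factory=list)


def build_render_tree(node: Node):
    """Return a copy of the tree that keeps only visible nodes."""
    if node.name in NON_VISUAL or node.style.get("display") == "none":
        return None  # the node and its whole subtree are skipped
    visible = []
    for child in node.children:
        rendered = build_render_tree(child)
        if rendered is not None:
            visible.append(rendered)
    return Node(node.name, dict(node.style), visible)


dom = Node("html", children=[
    Node("head", children=[Node("title")]),
    Node("body", children=[
        Node("p", {"color": "red"}),
        Node("div", {"display": "none"}, [Node("span")]),
    ]),
])
render_tree = build_render_tree(dom)
print([child.name for child in render_tree.children])
print([child.name for child in render_tree.children[0].children])
```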

Layout

The next stage is layout: calculating the exact size and position of each element in the render tree.

This stage happens every time we change something in the DOM that affects the layout of the page, even partially.

Examples of situations where the position of elements is recalculated:

  1. Adding or removing elements in the DOM

  2. Resizing the browser window

  3. Changing the width, height, or position of an element
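To show why such changes force a recalculation, here is a heavily simplified block layout sketch in which elements are stacked vertically and every child's position depends on what comes before it. The `Box` class and the fixed `content_height` values are assumptions for the example; real layout also handles inline content, margins, floats, flexbox and much more.

```python
from dataclasses import dataclass, field


@dataclass
class Box:
    name: str
    content_height: int = 0  # assumed: height of the box's own content, in px
    children: list = field(default_factory=list)
    x: int = 0
    y: int = 0
    width: int = 0
    height: int = 0


def layout(box: Box, x: int, y: int, width: int) -> int:
    """Position box at (x, y), stack its children vertically, return its height."""
    box.x, box.y, box.width = x, y, width
    cursor = y + box.content_height
    for child in box.children:
        cursor += layout(child, x, cursor, width)  # next child starts below this one
    box.height = cursor - y
    return box.height


body = Box("body", children=[Box("p", content_height=20), Box("p", content_height=30)])
layout(body, 0, 0, 800)
print([(child.name, child.y) for child in body.children], body.height)

# Adding an element shifts everything after it, so layout has to run again.
body.children.insert(0, Box("h1", content_height=40))
layout(body, 0, 0, 800)
print([(child.name, child.y) for child in body.children], body.height)
```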

Painting

Once the browser knows which nodes are visible and where they sit in the viewport, it is time to draw them on the screen. This stage is also known as rasterization: the browser converts each element calculated in the layout stage into actual pixels on the screen.

Just like layout, this stage is repeated every time we change the appearance of an element in the DOM, even partially.

Examples of situations where a repaint occurs:

  1. Changing the outline of an element

  2. Changing the opacity or visibility of an element

  3. Changing the background color of an element
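The rasterization idea from this section can be sketched by filling a small grid of character "pixels" from a list of rectangles. The rectangle format `(x, y, width, height, fill)` and the `paint` helper are assumptions for this example; a real browser rasterizes text, borders, images and more.

```python
def paint(boxes, width: int, height: int) -> list:
    """Rasterize (x, y, w, h, fill) rectangles into rows of character 'pixels'."""
    canvas = [[" "] * width for _ in range(height)]
    for x, y, w, h, fill in boxes:
        # Clip each rectangle to the canvas, then fill its pixels.
        for row in range(max(y, 0), min(y + h, height)):
            for col in range(max(x, 0), min(x + w, width)):
                canvas[row][col] = fill
    return ["".join(row) for row in canvas]


# Changing only the fill of a box (for example its background color)
# requires painting again, but the positions from layout stay the same.
for line in paint([(0, 0, 8, 1, "#"), (2, 1, 4, 2, "o")], width=8, height=3):
    print(line)
```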

Layering and compositing

The final stage is compositing the layers. The browser does this to optimize the rendering process.

Compositing is a technique of dividing parts of a page into layers, drawing them separately, and then assembling the page from them in a separate thread, called the compositor thread. When sections of a document are drawn in different layers that overlap each other, compositing is necessary to ensure that they are drawn in the correct order and their content is displayed correctly.
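The ordering problem that compositing solves can be sketched as merging already painted layers from back to front. In this example a layer is a `(z_index, bitmap)` pair and a space counts as a transparent pixel; both conventions and the `composite` helper are assumptions made for illustration, and the threading and GPU work of a real compositor are not modelled.

```python
def composite(layers, width: int, height: int) -> list:
    """Merge painted layers into one frame; higher z-index layers end up on top."""
    frame = [[" "] * width for _ in range(height)]
    for _z_index, bitmap in sorted(layers, key=lambda layer: layer[0]):
        for row in range(height):
            for col in range(width):
                pixel = bitmap[row][col]
                if pixel != " ":  # a space means "transparent"
                    frame[row][col] = pixel
    return ["".join(row) for row in frame]


background = (0, ["bbb ", "bbb "])
popup = (1, ["  XX", "  XX"])
# The order in the list does not matter; the z-index decides what is on top.
for line in composite([popup, background], width=4, height=2):
    print(line)
```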

Note: DOM updates that trigger layout and painting are resource-intensive operations, and the cost is especially noticeable on low-end devices. It is therefore important to minimize how often they are triggered.

That’s all! 🎉

Thanks for reading.
