
How do browsers work? The science behind rendering a webpage


When it comes to software that is both powerful and versatile, few programs come close to the web browser. Be it a desktop with an x86 processor or a smartphone built on the ARM architecture, browsers perform well on almost any hardware you use. They are so capable that they can serve as the basis of an entire operating system, and Chrome OS is a prime example of this.

Browsers are a work of art, but have you ever wondered what goes on behind the scenes, from the moment you enter a query to the moment the browser returns the result? In this article, we will look at how a browser works and how it renders webpages in a matter of seconds.

It all begins with requests and the Networking layer

When you visit a website on the Internet, your browser connects to a remote computer (the web server) and requests the resources it needs to paint the page. This might look trivial, but under the hood, your browser does a great deal of work to find the website and render it on your screen.

To render a webpage, the first thing your browser needs to do is find the remote server that hosts the website. To do this, it looks up the IP address for the domain name in the URL you entered in the address bar. This IP address identifies the web server on the network, and once the browser has it, it can make requests to the server to get data.

[Figure: An overview of DNS resolution]

To find the IP address, the browser performs DNS resolution. It first checks its own cache, which may already hold the address if you have visited the site recently; the operating system keeps a similar cache. If no cached entry is found, the request goes to a DNS resolver, usually run by your ISP or by a public provider such as Google or Cloudflare, which finds the IP address for the domain name.
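The cache-first lookup described above can be sketched in a few lines of Python. This is only an illustration: the CachingResolver class and the system_resolver helper are hypothetical names, and a real browser cache also expires entries after a time limit, which is left out here.

```python
import socket
from typing import Callable, Dict


def system_resolver(hostname: str) -> str:
    """Ask the operating system's resolver, which may query a DNS server."""
    # getaddrinfo returns tuples of (family, type, proto, canonname, sockaddr);
    # for IPv4 the sockaddr is (address, port).
    infos = socket.getaddrinfo(hostname, None, family=socket.AF_INET)
    return infos[0][4][0]


class CachingResolver:
    """Hypothetical cache-first lookup, loosely mirroring a browser's DNS cache."""

    def __init__(self, resolve: Callable[[str], str] = system_resolver) -> None:
        self._resolve = resolve
        self._cache: Dict[str, str] = {}

    def lookup(self, hostname: str) -> str:
        if hostname in self._cache:
            # Cache hit: answer immediately, no network traffic needed.
            return self._cache[hostname]
        # Cache miss: ask the resolver and remember the answer.
        address = self._resolve(hostname)
        self._cache[hostname] = address
        return address
```

Calling lookup() twice for the same hostname only reaches the resolver once; the second answer comes straight from the cache.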

Once your browser has the IP address of the website you are looking for, the networking layer of your browser gets to work. It tries to create a connection between your device and the server so that data can transfer between the two devices. To create this connection, the networking layer uses sockets. A socket is an endpoint for communication, and two devices on a network are connected by pairing each one's IP address with a designated port.

Now that the networking layer has connected the two devices and data packets can travel between them, it performs the next important task in communication on the Internet: encryption.

To encrypt the data, the networking layer performs a TLS handshake between the two communicating devices. Once the handshake is complete, all the data travelling between the devices is encrypted and cannot be read by any third party.

The TLS handshake is only performed when data is transferred over HTTPS. With plain HTTP, only the TCP handshake that opens the connection takes place, and that handshake does not encrypt anything. You should therefore never submit sensitive data over an HTTP connection, as anyone who can observe the traffic can read it.
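As a rough sketch of the steps above, the Python below opens a socket and adds TLS only when the URL uses HTTPS. The function names plan_connection and open_channel are made up for this example, and the ports 443 and 80 are the usual defaults for HTTPS and HTTP.

```python
import socket
import ssl
from typing import Tuple
from urllib.parse import urlsplit


def plan_connection(url: str) -> Tuple[str, int, bool]:
    """Return (host, port, use_tls) for a URL, using the scheme's default port."""
    parts = urlsplit(url)
    use_tls = parts.scheme == "https"
    port = parts.port or (443 if use_tls else 80)
    return parts.hostname or "", port, use_tls


def open_channel(url: str, timeout: float = 5.0) -> socket.socket:
    """Open a TCP connection and, for HTTPS only, wrap it in TLS."""
    host, port, use_tls = plan_connection(url)
    # The TCP handshake happens here, for both HTTP and HTTPS.
    sock = socket.create_connection((host, port), timeout=timeout)
    if use_tls:
        # The default context verifies the server's certificate.
        context = ssl.create_default_context()
        # The TLS handshake happens here; everything sent afterwards is encrypted.
        sock = context.wrap_socket(sock, server_hostname=host)
    return sock
```

Note that open_channel() needs a working network connection, while plan_connection() can be tried on its own.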

After setting up a communication channel between the two devices, the networking layer sends a request to the server for the resources. For a webpage, this is an HTTP or HTTPS request that asks the web server for the HTML file containing the information the browser needs to render the page. Once the server receives the request, it sends the HTML document back to the browser as a stream of bytes over the channel established by the networking layer.
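To make the request and response a little more concrete, here is a minimal sketch that assumes the text-based HTTP/1.1 format. The helper names build_get_request and split_response are invented for this example, and real browsers send many more headers than shown.

```python
from typing import Dict, Tuple


def build_get_request(host: str, path: str = "/") -> bytes:
    """Build the bytes of a very small HTTP/1.1 GET request."""
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {host}",
        "Accept: text/html",
        "Connection: close",
        "",  # the two empty strings produce the blank line
        "",  # that marks the end of the headers
    ]
    return "\r\n".join(lines).encode("ascii")


def split_response(raw: bytes) -> Tuple[int, Dict[str, str], bytes]:
    """Split a raw HTTP/1.1 response into status code, headers and body."""
    head, _, body = raw.partition(b"\r\n\r\n")
    status_line, *header_lines = head.decode("iso-8859-1").split("\r\n")
    status = int(status_line.split(" ")[1])
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return status, headers, body
```

The body returned by split_response() is still just bytes, which is exactly what the rendering engine receives next.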

[Figure: The network layer sends requests to servers and gives data to the rendering engine]

Finally, the browser has the resources it needs to render a webpage, but they arrive as raw bytes and must be converted into something that looks like a webpage. To do this, the browser uses its rendering engine.


Getting meaning out of bits using the Rendering Engine

Now that the networking layer has made requests to the webserver and received all the data the browser needs, the rendering engine comes into the picture.

The main job of the rendering engine is to translate the bits of data into a form that can be used by the browser to create a webpage. To understand how the rendering engine works, it is essential to understand all the parts that make up a website.

  • HTML (Hyper Text Markup Language) is used to define the structure of a webpage.
  • CSS (Cascading Style Sheets) is used to direct the browser on how each element on the website is supposed to look.
  • JavaScript adds interactivity to the site and handles user inputs, clicks and any processing the website may need.

The rendering engine uses parsers to convert bits of data into meaningful information which can be used by the browser to render a webpage. The rendering engine has two different parsers, one for HTML and one for CSS. Let’s look at how the HTML parser works to get an idea of the parsing process.


HTML parsing

The HTML parser takes bytes of data as input and creates a logical representation of the HTML document in the device's memory. This representation is known as the DOM (Document Object Model), and it stores the HTML data as a tree.

To create the DOM, the HTML parser performs several steps, which can be described as follows:

  • Character decoding: the parser converts the bytes it gets from the networking layer into characters.
  • Tokenization: the parser finds the tokens, such as opening and closing tags, in the stream of characters. These tokens help the browser determine the structure of the data.
  • Node creation: after identifying the tokens and the information contained in them, the browser creates nodes in memory to hold this data.
  • DOM creation: the parser links the nodes hierarchically to create a DOM representation of the received bytes.
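The four steps can be sketched with the HTMLParser class from Python's standard library. The Node and TreeBuilder classes are hypothetical, and the sketch assumes tidy HTML in which every tag is closed explicitly; a real browser parser also copes with unclosed and malformed markup.

```python
from html.parser import HTMLParser
from typing import List, Optional


class Node:
    """A node in memory that holds one element or one piece of text."""

    def __init__(self, tag: str, text: str = "", parent: Optional["Node"] = None) -> None:
        self.tag = tag
        self.text = text
        self.parent = parent
        self.children: List["Node"] = []


class TreeBuilder(HTMLParser):
    """Receives tokens from HTMLParser and links nodes into a tree."""

    def __init__(self) -> None:
        super().__init__()
        self.root = Node("#document")
        self._current = self.root

    def handle_starttag(self, tag, attrs):
        node = Node(tag, parent=self._current)  # node creation
        self._current.children.append(node)     # link the node into the tree
        self._current = node

    def handle_endtag(self, tag):
        if self._current.parent is not None:
            self._current = self._current.parent

    def handle_data(self, data):
        if data.strip():
            self._current.children.append(Node("#text", data.strip(), self._current))


def parse_html(raw: bytes, encoding: str = "utf-8") -> Node:
    text = raw.decode(encoding)  # character decoding: bytes -> characters
    builder = TreeBuilder()
    builder.feed(text)           # tokenization drives the handle_* callbacks
    builder.close()
    return builder.root


def outline(node: Node, depth: int = 0) -> List[str]:
    """Return the tree as indented lines, one per node."""
    label = repr(node.text) if node.tag == "#text" else node.tag
    lines = ["  " * depth + label]
    for child in node.children:
        lines.extend(outline(child, depth + 1))
    return lines
```

Printing "\n".join(outline(parse_html(b"<html><body><h1>Hi</h1></body></html>"))) shows the hierarchy, with each level indented under its parent.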

The HTML document that the browser receives contains links to CSS files. The networking layer fetches these files and passes them to the CSS parser. This parser creates the CSSOM (CSS Object Model), which defines how each element in the DOM is supposed to be styled.


Creating the rendering tree and layout for the webpage

Once the DOM has been created and the CSS parser has finished parsing the CSS file, the rendering engine uses a style engine to combine the CSSOM and the DOM. The result is a rendering tree, which holds the structure and style of the webpage to be rendered. The rendering tree contains only visible nodes; elements the user cannot see on screen, such as those styled with display: none, are left out.
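Here is a simplified sketch of how a DOM and a set of styles could be joined into a rendering tree. It assumes DOM nodes are plain dictionaries and that styles are looked up by tag name only; real style engines match full CSS selectors and apply the cascade, so treat build_render_tree as a hypothetical stand-in.

```python
from typing import Dict, Optional

# Hypothetical stand-in for the CSSOM: tag name -> declared properties.
Styles = Dict[str, Dict[str, str]]


def build_render_tree(node: dict, styles: Styles) -> Optional[dict]:
    """Attach styles to each DOM node and drop invisible subtrees."""
    style = styles.get(node["tag"], {})
    if style.get("display") == "none":
        return None  # invisible: this node and its children are left out
    children = []
    for child in node.get("children", []):
        rendered = build_render_tree(child, styles)
        if rendered is not None:
            children.append(rendered)
    return {"tag": node["tag"], "style": style, "children": children}
```

For a body that contains an h1, a script and a p, with the script styled as display: none, the resulting tree keeps only the h1 and the p.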

After creating the rendering tree, the rendering engine starts the layout process. This process takes into account the size of the screen area available to the page and works out where each element should be placed. It also calculates the size of each element that is going to be rendered and its position relative to other elements.
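A toy version of layout, assuming every element is a simple block that spans the available width and stacks below the previous one. The Box class, the layout_blocks function and the 8-pixel margin are assumptions made for this sketch; real layout handles many more cases.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Box:
    x: int
    y: int
    width: int
    height: int


def layout_blocks(heights: List[int], viewport_width: int, margin: int = 8) -> List[Box]:
    """Stack block elements from top to bottom inside the available width."""
    boxes = []
    y = margin
    for height in heights:
        boxes.append(Box(x=margin, y=y, width=viewport_width - 2 * margin, height=height))
        y += height + margin  # the next block starts below this one
    return boxes
```

Changing viewport_width changes the width of every box, which is why resizing the browser window triggers layout again.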

Now that the rendering engine has all the information about the webpage in a format that our system can understand, it can begin to render the page in the browser.


Painting the canvas and compositing the webpage on the screen

Once the rendering engine has completed the layout process, it needs to paint each pixel on the screen according to the layout created from the rendering tree. This process is known as rasterization. It can be done on the CPU, but because it involves highly repetitive processing, it can be offloaded to the GPU for better results.

The painting operation occurs in a layered format, and the rendering engine creates multiple layers of elements to create the webpage. This layered structure helps the browser to make changes faster when the user interacts with the webpage.

Once all the layers have been created, the rendering engine sends this information to the user interface, which displays the webpage on the screen. This process is known as compositing the webpage and is the last step performed by the rendering engine.
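Painting in layers and compositing them can be imitated on a tiny grid of characters instead of pixels. The layer format used here, a tuple of (z, x, y, width, height, fill), is made up for this sketch.

```python
from typing import List, Tuple

# Hypothetical layer format: (z, x, y, width, height, fill character).
Layer = Tuple[int, int, int, int, int, str]


def composite(layers: List[Layer], width: int, height: int, background: str = ".") -> List[str]:
    """Paint layers from the lowest z to the highest onto one canvas."""
    canvas = [[background] * width for _ in range(height)]
    for z, x, y, w, h, fill in sorted(layers, key=lambda layer: layer[0]):
        for row in range(y, min(y + h, height)):
            for col in range(x, min(x + w, width)):
                canvas[row][col] = fill  # higher layers overwrite lower ones
    return ["".join(row) for row in canvas]
```

Because each layer is kept separately, moving one of them only means compositing again rather than repainting every element.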

Critical Rendering Path of a website

This sequence of steps, from bytes of data to pixels on the screen, is known as the critical rendering path, and it is one of the main factors in the performance of any webpage you visit on the Internet.

Now that the rendering engine has rendered the website in the browser, you might be wondering why we did not use JavaScript anywhere. This is because JavaScript is handled separately: its job is to make changes to the DOM that add interactivity to the website.


Adding Interactivity to websites with JavaScript

After the rendering engine has finished rendering the website on the user interface, the user can see the website, but it is not interactive yet. This means that if there is a button on the webpage that shows a prompt to the user, it will not work until JavaScript comes into the picture.

JavaScript can also change the DOM that the rendering engine created, and can even create new DOM nodes and attach them to it. This JavaScript code is run by a virtual machine inside the browser known as the JavaScript engine.

The JavaScript engine and the DOM do not share the same memory and are independent entities. That said, the JavaScript engine can interact with the DOM, and its code runs when a certain event occurs on the page. This separation lets the rendering engine display the page while the JavaScript engine responds to events as they happen.
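The idea of code that sleeps until an event occurs and then changes the DOM can be sketched as follows. This is Python rather than JavaScript, and the Element class with its add_event_listener and dispatch methods is a hypothetical imitation of the pattern, not the browser's actual API.

```python
from typing import Callable, Dict, List


class Element:
    """Hypothetical DOM element that can hold event listeners."""

    def __init__(self, tag: str) -> None:
        self.tag = tag
        self.children: List["Element"] = []
        self._listeners: Dict[str, List[Callable[["Element"], None]]] = {}

    def add_event_listener(self, event: str, handler: Callable[["Element"], None]) -> None:
        self._listeners.setdefault(event, []).append(handler)

    def dispatch(self, event: str) -> None:
        # Run every handler registered for this event, in order.
        for handler in self._listeners.get(event, []):
            handler(self)


def show_prompt(button: Element) -> None:
    # The handler changes the tree by attaching a new node.
    button.children.append(Element("dialog"))
```

Until dispatch("click") is called, the button has no children; afterwards, the handler has attached a new dialog node to it.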

JavaScript is at the heart of most modern websites and is responsible for processing user inputs and sending them to the remote server that runs the website. Because JavaScript is a scripting language that the browser's JavaScript engine executes itself, the same website can behave in a similar fashion on both smartphones and desktops.

In the early days of the Web, all browsers did was display webpages, and there was not much JavaScript involved. The remote server did most of the processing, and the JavaScript engine did not do a lot on a webpage. Because of this, a lot of information had to travel between the server and the browser. Such an architecture was fine when pages were not so complex and interactive.

That said, the modern web could not run on the same architecture, as it would make websites really slow. Therefore, the browser and the remote server have to work together to offer the best user experience. The browser is no longer responsible only for showing web pages; it also processes lots of data, and the JavaScript engine does that work.


Decoding the JavaScript engine

JavaScript was created by Brendan Eich in 1995 in just 10 days. It first shipped in Netscape Navigator 2 and was designed as a scripting language that could be interpreted in the browser itself.

As JavaScript was designed to be processed by an interpreter inside the web browser, it was not compiled ahead of time into machine code for a particular CPU. This made the language extremely versatile, as the same script could run on any device with a browser.

That said, this versatility came with a trade-off: slow performance. To fix this problem, JavaScript engines adopted JIT (just-in-time) compilers, which made the language much faster. JIT compilation made JavaScript fast enough that it now also runs on the servers that host websites.

Now that we are familiar with the role of JavaScript in running a website, we can get into the nitty-gritty of how the JavaScript engine works.

How does the JavaScript engine work?

Just as the networking layer fetches HTML and CSS in the form of bytes for the rendering engine, it also fetches JavaScript code and gives it to the JavaScript engine.

Once the engine has the JavaScript code, it sends it to a parser that creates an Abstract Syntax Tree (AST). This tree is a logical representation of the JavaScript code, and it is handed to a compiler. The compiler converts the tree into an intermediate language (bytecode), which the interpreter then runs one instruction at a time.
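The same parse, compile and interpret pipeline can be observed with Python's own tools, used here purely as an analogy. A JavaScript engine has its own parser and its own bytecode format, and the run_pipeline function below is a hypothetical helper written for this sketch.

```python
import ast
import dis
from typing import Dict


def run_pipeline(source: str, variables: Dict[str, int]) -> Dict[str, object]:
    """Parse source into an AST, compile it to bytecode, then interpret it."""
    tree = ast.parse(source)                   # parser: source text -> AST
    code = compile(tree, "<example>", "exec")  # compiler: AST -> bytecode
    namespace = dict(variables)
    exec(code, namespace)                      # interpreter: runs the bytecode
    opnames = [instruction.opname for instruction in dis.get_instructions(code)]
    return {
        "root_kind": type(tree.body[0]).__name__,  # kind of the first AST node
        "opnames": opnames,                        # names of the bytecode instructions
        "namespace": namespace,                    # variables after execution
    }
```

Running it on "total = price * quantity + shipping" gives an Assign node at the root of the tree and a short list of bytecode instruction names, whose exact contents depend on the Python version.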

This interpreted execution is used when the code in the script does not perform repetitive tasks (like looping). If the code loops extensively, the engine tries to optimise it by compiling it into machine code that the CPU executes directly. That machine code runs much faster than the interpreted version.

To create the machine code, the JavaScript engine uses an optimising compiler. This compiler takes the bytecode generated in the previous step and converts it into device-specific machine code.
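The switch from interpreted code to optimised code can be imitated with a call counter. This is only a sketch of the idea: the TieredFunction class and its hot_threshold parameter are hypothetical, and the "optimised" version here is simply a second, faster Python function standing in for compiled machine code.

```python
from typing import Callable


class TieredFunction:
    """Runs a slow version first and switches to a fast one once the code is hot."""

    def __init__(self, slow: Callable, fast: Callable, hot_threshold: int = 3) -> None:
        self._slow = slow
        self._fast = fast
        self._hot_threshold = hot_threshold
        self.calls = 0
        self.optimised = False

    def __call__(self, *args):
        self.calls += 1
        if not self.optimised and self.calls > self._hot_threshold:
            # The code has run often enough to be worth optimising.
            self.optimised = True
        return (self._fast if self.optimised else self._slow)(*args)


def sum_to_slow(n: int) -> int:
    # Stand-in for the interpreted version: a plain loop.
    total = 0
    for i in range(n + 1):
        total += i
    return total


def sum_to_fast(n: int) -> int:
    # Stand-in for the optimised version: same result, less work.
    return n * (n + 1) // 2
```

Both versions must return the same results, so the switch is invisible to the script apart from the speed-up.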

Once the engine has the optimised machine code, it can run the script at blazing-fast speeds, using machine code for the heavily repeated parts and the interpreter for the rest.

Looking into the future

Although browsers are very powerful now, innovations keep coming to speed up the browsing experience even further. One such innovation is WebAssembly, a low-level binary format that is used alongside JavaScript to make code execution even faster.

Not only this, browsers are catching up with the advancements in machine learning and artificial intelligence. With libraries like TensorFlow coming to JavaScript, browsers are bound to get smarter in the future, further enhancing the user experience they offer.


Nischay Khanna

A tech enthusiast, driven by curiosity. A bibliophile who loves to travel. An Engineering graduate who loves to code and write about new technologies. Can't sustain without coffee.
