What Happens When You Type a URL: How Browsers Work

Marcus Thorne · 8 min read

You type https://news.ycombinator.com and press Enter. About half a second later, a fully formatted page appears. This feels like one thing happening, but it’s actually a precisely orchestrated sequence of network requests, cryptographic handshakes, code parsing, and rendering calculations.

Understanding this sequence gives you a useful mental model for web performance, debugging, and how the web actually works — not as magic, but as a comprehensible system.

Step 1: DNS Resolution

Before your browser can contact any server, it needs to know that server’s IP address. You typed news.ycombinator.com — a human-readable name. The internet routes data by IP address. The translation happens through DNS (Domain Name System).

Your browser checks its own cache first: have you visited this site recently? If so, the IP is saved locally and this step is instant. If not:

  1. Your browser asks your operating system’s DNS resolver
  2. The resolver asks your configured DNS server (usually your ISP’s, or a public one such as Google’s 8.8.8.8)
  3. If that server doesn’t know the answer, it queries the DNS hierarchy — root servers, then the .com nameservers, then Hacker News’s own nameservers
  4. The IP address comes back and is cached for future requests

For an uncached lookup, this whole process typically takes 1-100 milliseconds, depending on how far up the hierarchy the query has to travel. For a returning visitor with a cached result, the cost is effectively zero.
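The cache-then-lookup behavior described above can be sketched in a few lines. This is a toy model, not how any particular browser implements it: the `CachingResolver` class and its fixed 60-second TTL are assumptions for illustration (real DNS records carry their own TTL), and `system_lookup` simply delegates to the operating system's resolver.

```python
import socket
import time
from typing import Callable, Dict, Tuple


def system_lookup(hostname: str) -> str:
    """Ask the operating system's resolver for one address of hostname."""
    infos = socket.getaddrinfo(hostname, 443, proto=socket.IPPROTO_TCP)
    return infos[0][4][0]  # the sockaddr tuple starts with the IP address string


class CachingResolver:
    """Toy cache in front of a resolver; the fixed TTL is an assumption."""

    def __init__(self, lookup: Callable[[str], str] = system_lookup,
                 ttl_seconds: float = 60.0) -> None:
        self._lookup = lookup
        self._ttl = ttl_seconds
        self._cache: Dict[str, Tuple[str, float]] = {}

    def resolve(self, hostname: str) -> str:
        now = time.monotonic()
        cached = self._cache.get(hostname)
        if cached is not None and cached[1] > now:
            return cached[0]  # cache hit: no network traffic at all
        ip = self._lookup(hostname)  # cache miss: go through the full lookup
        self._cache[hostname] = (ip, now + self._ttl)
        return ip
```

Calling `CachingResolver().resolve("news.ycombinator.com")` twice in a row would hit the network only the first time; the second call is answered from the dictionary.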

Step 2: TCP Connection

With an IP address in hand, your browser establishes a connection to the server using TCP (Transmission Control Protocol). TCP is the reliable transport layer of the internet: it delivers data in order, detects lost packets, and retransmits them.

Establishing a TCP connection requires a three-way handshake:

  1. Your browser sends a SYN packet to the server (“I’d like to connect”)
  2. The server responds with a SYN-ACK (“Acknowledged, ready to connect”)
  3. Your browser sends an ACK (“Acknowledged, connected”)

This handshake adds one round-trip time (RTT) of latency before any data is sent. If a round trip to the server takes 50ms, establishing the connection costs 50ms before the browser can send anything.
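One way to see this cost is to time how long a connection takes to open. The sketch below uses Python's standard `socket` module; the function name `time_tcp_connect` is made up for illustration. Passing an IP address rather than a hostname keeps DNS resolution out of the measurement.

```python
import socket
import time


def time_tcp_connect(host: str, port: int, timeout: float = 5.0) -> float:
    """Return the seconds spent connecting, which spans SYN, SYN-ACK, and ACK."""
    start = time.perf_counter()
    # create_connection returns once the kernel has completed the handshake.
    with socket.create_connection((host, port), timeout=timeout):
        elapsed = time.perf_counter() - start
    return elapsed
```

The result should land close to one round trip to the server, plus a small amount of local overhead.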

Step 3: TLS Handshake

Since you typed https://, the connection needs to be encrypted. After the TCP handshake, a TLS handshake happens:

  1. Your browser sends its supported TLS versions and cipher suites
  2. The server responds with its certificate (proving its identity) and its chosen cipher suite
  3. Your browser verifies the certificate against trusted Certificate Authorities
  4. Both sides exchange cryptographic keys
  5. Encrypted communication begins

In TLS 1.3 (the current standard), this handshake takes only one round trip — a significant improvement over older versions that required two. Still, by this point you’ve used 2-3 round trips before a single byte of the actual webpage has been requested.
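The certificate check in step 3 is something most TLS libraries do by default. As a rough illustration using Python's standard `ssl` module (the helper name `make_client_context` is an assumption, and the minimum version chosen here is just one reasonable policy):

```python
import ssl


def make_client_context() -> ssl.SSLContext:
    """Client-side TLS settings: verify the certificate chain and the hostname."""
    context = ssl.create_default_context()  # loads the system's trusted CA certificates
    # Refuse anything older than TLS 1.2; TLS 1.3 is negotiated when both sides support it.
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    return context


context = make_client_context()
print(context.verify_mode == ssl.CERT_REQUIRED)  # the certificate must validate
print(context.check_hostname)                    # the name on the certificate must match
```

The handshake itself would run when a connected socket is wrapped with `context.wrap_socket(sock, server_hostname="news.ycombinator.com")`.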

This is why reducing physical distance to servers matters so much. With a 200ms round trip, the handshakes alone cost 400-600ms. With a CDN edge node 10ms away, they cost 20-30ms: 20x faster.
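The arithmetic behind those figures is simple enough to write down. This sketch assumes the model used in this article: one round trip for TCP, plus one (TLS 1.3) or two (older TLS) round trips for encryption. The function name is illustrative.

```python
def handshake_cost_ms(rtt_ms: float, tls_round_trips: int) -> float:
    """Latency spent on connection setup: one RTT for TCP plus the TLS round trips."""
    tcp_round_trips = 1
    return rtt_ms * (tcp_round_trips + tls_round_trips)


for label, rtt in [("distant origin", 200), ("CDN edge", 10)]:
    tls13 = handshake_cost_ms(rtt, tls_round_trips=1)  # TLS 1.3
    tls12 = handshake_cost_ms(rtt, tls_round_trips=2)  # older TLS versions
    print(f"{label}: {tls13:.0f}-{tls12:.0f} ms")
```

Because the cost scales linearly with RTT, cutting the round trip from 200ms to 10ms cuts setup latency by the same factor of 20.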

Step 4: The HTTP Request

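Now the browser sends an HTTP GET request for the URL. Written out as text, it looks like this (HTTP/2 transmits the same fields in a binary, header-compressed form):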

GET / HTTP/2
Host: news.ycombinator.com
Accept: text/html,application/xhtml+xml
Accept-Encoding: gzip, deflate, br
Accept-Language: en-US,en;q=0.9
Cookie: [any stored cookies for this domain]
User-Agent: Mozilla/5.0 [browser info]

HTTP/2 (used by most modern sites) adds multiplexing: multiple requests can be in flight over the same connection at once, without waiting for earlier responses. This is a major improvement over HTTP/1.1, which in practice handles only one request at a time per connection.

HTTP/3 (newer, still rolling out) goes further, replacing TCP with QUIC, a UDP-based transport that handles packet loss more gracefully and eliminates TCP’s head-of-line blocking.
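To make the request format concrete, here is a small sketch that serializes a GET request in HTTP/1.1's plain-text wire format, which is the easiest version to read. The helper `build_get_request` and its default header set are assumptions for illustration; a real browser sends more headers than this.

```python
from typing import Dict, Optional


def build_get_request(host: str, path: str = "/",
                      extra_headers: Optional[Dict[str, str]] = None) -> bytes:
    """Serialize a GET request in HTTP/1.1 text form."""
    headers = {"Host": host, "Accept": "text/html", "Accept-Encoding": "gzip"}
    headers.update(extra_headers or {})
    lines = [f"GET {path} HTTP/1.1"]
    lines += [f"{name}: {value}" for name, value in headers.items()]
    # Each line ends with CRLF, and an empty line marks the end of the headers.
    return ("\r\n".join(lines) + "\r\n\r\n").encode("ascii")


print(build_get_request("news.ycombinator.com").decode("ascii"))
```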

Step 5: The Server Responds

The server receives the request and sends back a response. For many modern sites, “the server” isn’t a single machine — it’s a load balancer distributing requests across many servers, backed by caches, databases, and other services.

The response includes:

  • Status code: 200 (OK), 301 (redirect), 404 (not found), 500 (server error), etc.
  • Headers: Content type, caching instructions, security policies
  • Body: The HTML document

HTTP/2 200 OK
Content-Type: text/html; charset=utf-8
Content-Encoding: gzip
Cache-Control: private, max-age=0

The HTML body is typically compressed (gzip or Brotli), shrinking it by 60-80% for transmission.
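The effect of compression is easy to try with Python's standard `gzip` module. The markup below is a made-up, highly repetitive stand-in for a real page, so it shrinks far more than a typical document would; treat the exact ratio as an artifact of the demo input.

```python
import gzip

# Repetitive markup standing in for a real HTML document (an assumption for the demo).
html = ('<tr class="athing"><td>story title</td></tr>\n' * 200).encode("utf-8")

compressed = gzip.compress(html)        # roughly what the server does before sending
restored = gzip.decompress(compressed)  # what the browser does on receipt

saving = 1 - len(compressed) / len(html)
print(f"{len(html)} bytes -> {len(compressed)} bytes ({saving:.0%} smaller)")
```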

Step 6: Parsing HTML

The browser starts reading the HTML before it’s fully downloaded — a key optimization called streaming parsing. As bytes arrive, the browser builds the DOM (Document Object Model): a tree structure representing the page.

<html>
  <head>
    <title>Hacker News</title>
    <link rel="stylesheet" href="/news.css">
  </head>
  <body>
    <table>
      <tr class="athing">...</tr>
    </table>
  </body>
</html>

That markup becomes a tree:

Document
└── html
    ├── head
    │   ├── title: "Hacker News"
    │   └── link: news.css
    └── body
        └── table
            └── tr.athing

As the parser encounters external resources — stylesheets, scripts, images — it requests them immediately, running many downloads in parallel.
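A simplified version of this process can be written with Python's standard `html.parser` module. The `Node` and `TreeBuilder` classes are illustrative names, the sketch assumes well-formed markup, and it skips the error-recovery rules a real browser parser applies.

```python
from html.parser import HTMLParser
from typing import List, Optional


class Node:
    def __init__(self, tag: str, parent: Optional["Node"] = None) -> None:
        self.tag = tag
        self.parent = parent
        self.children: List["Node"] = []


class TreeBuilder(HTMLParser):
    """Builds a simplified element tree and records stylesheet URLs as they are seen."""

    VOID_TAGS = {"link", "meta", "img", "br", "hr", "input"}  # elements with no closing tag

    def __init__(self) -> None:
        super().__init__()
        self.root = Node("Document")
        self.current = self.root
        self.stylesheets: List[str] = []

    def handle_starttag(self, tag, attrs):
        node = Node(tag, self.current)
        self.current.children.append(node)
        attributes = dict(attrs)
        if tag == "link" and attributes.get("rel") == "stylesheet":
            # A browser would start fetching this resource right away.
            self.stylesheets.append(attributes.get("href", ""))
        if tag not in self.VOID_TAGS:
            self.current = node  # descend: later tags become children of this one

    def handle_endtag(self, tag):
        if tag not in self.VOID_TAGS and self.current.parent is not None:
            self.current = self.current.parent  # climb back up (assumes matched tags)


def outline(node: Node, depth: int = 0) -> List[str]:
    """Return the tree as indented lines, one per element."""
    lines = ["  " * depth + node.tag]
    for child in node.children:
        lines += outline(child, depth + 1)
    return lines


builder = TreeBuilder()
# feed() accepts partial input, so chunks can be parsed as they arrive from the network.
builder.feed('<html><head><title>Hacker News</title><link rel="stylesheet" href="/news.css">')
builder.feed('</head><body><table><tr class="athing"></tr></table></body></html>')
print("\n".join(outline(builder.root)))
print(builder.stylesheets)
```

Note that the stylesheet URL is known after the first chunk, before the body has even arrived, which is what lets the download start early.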

Step 7: Render-Blocking Resources

This is where performance problems often live.

CSS is render-blocking: the browser won’t render any content until it has downloaded and parsed the stylesheets the page links to, because it needs the styles to know what anything looks like. If your stylesheet is 500KB and on a slow server, the user sees a blank white screen until it loads.

JavaScript is, by default, parser-blocking, which also delays rendering: when the parser encounters a <script> tag, it stops parsing HTML, downloads the script, executes it, and only then continues. A 200KB JavaScript file on the critical path delays everything behind it.

Modern best practices attack this:

  • <link rel="preload">: tells the browser about critical resources early, so it starts downloading before the parser encounters them
  • async attribute on scripts: the download doesn’t block parsing; the script executes as soon as it arrives, in no guaranteed order
  • defer attribute on scripts: the download doesn’t block parsing; scripts execute in document order after the HTML is parsed
  • CSS bundling and minification: fewer requests, smaller files
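The difference between the three script-loading modes can be summarized as a small decision function. This is a descriptive sketch for classic external scripts only; the function name and the wording of the returned values are made up for illustration.

```python
from typing import Dict


def script_loading_mode(attributes: Dict[str, str]) -> Dict[str, object]:
    """Describe how a classic external <script> is handled, given its attributes."""
    if "async" in attributes:
        # Fetched in parallel; runs as soon as it arrives, in no guaranteed order.
        return {"blocks_parser_during_download": False, "runs": "as soon as downloaded"}
    if "defer" in attributes:
        # Fetched in parallel; runs after parsing finishes, in document order.
        return {"blocks_parser_during_download": False, "runs": "after HTML is parsed"}
    # Default: the parser stops until the script is downloaded and executed.
    return {"blocks_parser_during_download": True, "runs": "immediately, parser paused"}


print(script_loading_mode({"src": "/app.js"}))
print(script_loading_mode({"src": "/app.js", "defer": ""}))
```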

Step 8: CSSOM Construction

While HTML is parsed into the DOM, CSS is parsed into the CSSOM (CSS Object Model) — a parallel tree of style rules.

The browser needs both the DOM and CSSOM before it can display anything. Only when both are available does it build the Render Tree: a combination that represents which elements need to be displayed and what they look like.

Elements with display: none are excluded from the Render Tree — they’re in the DOM but don’t affect rendering.
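As a rough sketch of how the two trees combine, the code below walks a tiny DOM and drops any subtree styled with display: none. It is heavily simplified: styles are looked up by tag name only, with no selectors, cascade, or inheritance, and the `DomNode` and `RenderNode` types are invented for the example.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class DomNode:
    tag: str
    children: List["DomNode"] = field(default_factory=list)


@dataclass
class RenderNode:
    tag: str
    style: Dict[str, str]
    children: List["RenderNode"] = field(default_factory=list)


def build_render_tree(node: DomNode,
                      styles: Dict[str, Dict[str, str]]) -> Optional[RenderNode]:
    """Pair each DOM node with its style, skipping display: none subtrees."""
    style = styles.get(node.tag, {})
    if style.get("display") == "none":
        return None  # the node stays in the DOM but produces nothing to render
    rendered = RenderNode(node.tag, style)
    for child in node.children:
        child_render = build_render_tree(child, styles)
        if child_render is not None:
            rendered.children.append(child_render)
    return rendered


dom = DomNode("body", [DomNode("h1"), DomNode("aside", [DomNode("p")]), DomNode("p")])
styles = {"h1": {"color": "black"}, "aside": {"display": "none"}}
tree = build_render_tree(dom, styles)
print([child.tag for child in tree.children])
```

The hidden aside and the paragraph inside it are both absent from the result, while the original `dom` object is unchanged.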

Step 9: Layout

With the Render Tree, the browser performs layout (also called reflow): calculating the exact position and size of every element on the screen.

This is more complex than it sounds. Elements affect each other: a wide image can push text down. Flexbox and Grid layouts have to distribute available space among their items, which can take several passes. Text needs to be measured. Layout traverses the entire render tree, and for complex pages it can be a significant computation.

Layout is triggered again whenever anything changes the size or position of elements: resizing the window, adding elements via JavaScript, changing font sizes.
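A toy example shows why one change can ripple through the page. The sketch below stacks block boxes vertically, which is only the simplest case of layout; the `Box` type and `layout_blocks` function are assumptions for illustration, and content heights are taken as already known.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Box:
    name: str
    height: int  # content height in pixels, assumed already known
    x: int = 0
    y: int = 0
    width: int = 0


def layout_blocks(boxes: List[Box], viewport_width: int) -> int:
    """Stack boxes top to bottom at full width; return the total page height."""
    cursor_y = 0
    for box in boxes:
        box.x, box.y, box.width = 0, cursor_y, viewport_width
        cursor_y += box.height  # everything below shifts when this height changes
    return cursor_y


boxes = [Box("header", 60), Box("image", 300), Box("text", 120)]
layout_blocks(boxes, viewport_width=800)
print(boxes[2].y)  # position of the text block

boxes[1].height = 500  # the image grows, for example after it finishes loading
layout_blocks(boxes, viewport_width=800)  # layout has to run again
print(boxes[2].y)  # the text block has been pushed down
```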

Step 10: Painting and Compositing

Painting converts the laid-out Render Tree into actual pixels: drawing text, colors, borders, shadows, images. This happens in layers. Elements that can move independently of the rest of the page (such as position: fixed headers or animated elements) get their own compositing layer.

Compositing combines these layers on the GPU and sends the final image to the display.

Modern browsers try to handle animations and scrolling entirely on the GPU, without re-running layout or paint. CSS properties like transform and opacity can be animated this way — they’re handled entirely by the compositor. This is why animating transform: translateX() is smooth while animating left or margin causes jank: the latter triggers layout recalculation every frame.
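The relationship between a CSS property and the work it triggers can be sketched as a lookup. The groupings below are a simplification and an assumption: real browsers differ in the details, and transform and opacity only skip layout and paint when the element has its own compositing layer.

```python
from typing import List

# Simplified groupings; actual behavior varies by browser and by context.
LAYOUT_PROPERTIES = {"left", "top", "width", "height", "margin", "font-size"}
PAINT_PROPERTIES = {"color", "background-color", "box-shadow"}
COMPOSITOR_PROPERTIES = {"transform", "opacity"}


def stages_triggered(css_property: str) -> List[str]:
    """Return the rendering stages that rerun when this property changes."""
    if css_property in COMPOSITOR_PROPERTIES:
        return ["composite"]
    if css_property in PAINT_PROPERTIES:
        return ["paint", "composite"]
    # Layout properties, and anything unknown, are treated as the expensive case.
    return ["layout", "paint", "composite"]


for prop in ("transform", "background-color", "left"):
    print(prop, "->", stages_triggered(prop))
```

An animation that changes left reruns all three stages on every frame, while one that changes transform reruns only the last.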

The Full Timeline

URL entered
  → DNS lookup (0-100ms)
  → TCP handshake (1 RTT)
  → TLS handshake (1-2 RTTs)
  → HTTP request sent
  → Server processes request
  → First byte arrives (TTFB)
  → HTML parsed, resources discovered
  → CSS + JS downloaded in parallel
  → DOM + CSSOM complete
  → Layout → Paint → Composite
  → Page visible to user
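The network portion of this timeline can be turned into a back-of-the-envelope estimate of time to first byte. The function below is a sketch under this article's model of a fresh connection; its name and the sample numbers are assumptions, not measurements.

```python
def estimate_ttfb_ms(dns_ms: float, rtt_ms: float,
                     tls_round_trips: int, server_ms: float) -> float:
    """Rough time to first byte on a fresh connection, following the timeline."""
    tcp_handshake = rtt_ms                    # 1 RTT
    tls_handshake = rtt_ms * tls_round_trips  # 1 RTT for TLS 1.3, 2 for older versions
    request_and_response = rtt_ms             # request travels out, first byte travels back
    return dns_ms + tcp_handshake + tls_handshake + request_and_response + server_ms


# Hypothetical inputs: 20ms DNS lookup, 50ms RTT, TLS 1.3, 100ms of server work.
print(estimate_ttfb_ms(dns_ms=20, rtt_ms=50, tls_round_trips=1, server_ms=100))
```

With these inputs, three of the five terms scale with RTT, which is why moving the server closer to the user tends to pay off more than shaving a few milliseconds off server processing.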

Performance optimization is essentially about shrinking each of these steps: fewer DNS lookups and handshakes (keep connections alive), fewer and shorter round trips (HTTP/2, CDNs), less to download (compression, code splitting), less to render (efficient CSS, avoiding layout thrash).


The next time a page feels slow, you now have a map of where to look. The next time someone asks why CDNs matter, or why render-blocking CSS is a problem, or why a mobile user in another country has a different experience — you know the answer. Every millisecond is earned or lost at a specific, identifiable step in this chain.


Written by Marcus Thorne

Software analysis and cybersecurity tips

A former software engineer, Marcus transitioned into tech journalism to explain complex digital concepts in simple terms.
