Every engineer has seen this question in interviews. Most give a vague hand-wavy answer about "DNS and TCP." Here's what's actually going on — every layer, every handshake, no padding.
The browser's first move: parse and cache check
Before a single packet leaves your machine, the browser parses the URL into its parts: protocol (https), host (example.com), path (/products), and any query params. Then it goes hunting for a cached answer — in order:
- Browser memory cache — did you visit this 2 seconds ago?
- OS DNS cache — `nscd` or the Windows DNS client resolver cache
- Hosts file — `/etc/hosts` on Linux/Mac, `C:\Windows\System32\drivers\etc\hosts` on Windows
Only on a cache miss does it go to the network. For a first-ever visit to a site, you're going all the way down.
DNS: mapping a name to an IP
DNS is a distributed, hierarchical key-value store. Your machine doesn't know where example.com lives — it asks a recursive resolver (usually your ISP's, or 8.8.8.8 if you've configured Google's). The resolver then does the heavy lifting:
- Asks a root nameserver (`a.root-servers.net`, etc.) — "who owns `.com`?"
- Root returns the address of the TLD nameserver for `.com`
- Resolver asks the TLD NS — "who owns `example.com`?"
- TLD NS returns the authoritative nameserver for `example.com`
- Authoritative NS returns the actual A record — an IPv4 address like `93.184.216.34`
The entire chain is cached at each layer with a TTL. A low TTL (60s) means DNS changes propagate fast but every request pays the lookup cost. A high TTL (86400s = 24h) is cheap on queries but slow to update.
On a warm cache, this whole dance is skipped. On a cold one, it adds 20–120ms.
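The client side of this lookup is one call into the OS resolver, which consults its cache before going to the network. A minimal sketch using the standard library (`localhost` is used only because it resolves everywhere via the hosts file):

```python
import socket
import time

def resolve(hostname):
    """Ask the OS resolver (cache first, then the configured
    recursive resolver) for the host's IPv4 A records."""
    start = time.perf_counter()
    # getaddrinfo returns (family, type, proto, canonname, sockaddr) tuples
    infos = socket.getaddrinfo(hostname, 443,
                               family=socket.AF_INET,
                               type=socket.SOCK_STREAM)
    elapsed_ms = (time.perf_counter() - start) * 1000
    ips = sorted({sockaddr[0] for *_, sockaddr in infos})
    print(f"{hostname} -> {ips} ({elapsed_ms:.1f} ms)")
    return ips

resolve("localhost")  # hosts-file hit: no network involved
```

Running this twice against a real hostname makes the cold/warm difference from the timings above directly visible.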
TCP: the connection contract
You have an IP. Now you need a reliable byte stream. TCP handles that with a three-way handshake:
```
Client → Server: SYN      (seq=100)
Server → Client: SYN-ACK  (seq=200, ack=101)
Client → Server: ACK      (ack=201)
```

This round trip establishes sequence numbers on both sides so every segment can be tracked, reordered, and retransmitted if lost. The handshake alone costs one RTT — on a cross-continental connection that's 150–200ms before you've sent a single byte of HTTP.
HTTP/2 and HTTP/3 reduce this cost in different ways. HTTP/2 multiplexes many requests over one connection, so the handshake is paid once rather than per request. HTTP/3 over QUIC goes further: it skips the TCP handshake entirely and combines connection establishment with TLS in a single round trip.
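The handshake cost is easy to observe: `connect()` doesn't return until the three-way exchange completes, so timing it approximates one RTT. A self-contained sketch against a local listener (loopback, so the number will be tiny; point it at a remote host to see real RTTs):

```python
import socket
import time

def measure_handshake(host, port):
    """Time a TCP connect; connect() returns once the three-way
    handshake (SYN, SYN-ACK, ACK) completes, so this is ~1 RTT."""
    start = time.perf_counter()
    with socket.create_connection((host, port), timeout=5):
        pass  # connection established; we never send any HTTP bytes
    return (time.perf_counter() - start) * 1000

# Local stand-in server: the kernel completes the handshake from the
# listen backlog even though we never call accept().
server = socket.socket()
server.bind(("127.0.0.1", 0))   # port 0: OS picks a free port
server.listen(1)
_, port = server.getsockname()
rtt_ms = measure_handshake("127.0.0.1", port)
print(f"handshake took {rtt_ms:.2f} ms")
server.close()
```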
TLS: encrypting the channel
For HTTPS (which is everything now), a TLS handshake happens on top of TCP. TLS 1.3, which is standard today, takes one round trip:
- ClientHello — client sends supported cipher suites and a key share
- ServerHello + Certificate + Finished — server picks cipher, proves identity with its cert, sends encrypted Finished
- Client Finished — confirms and the encrypted channel is open
Older TLS 1.2 needed two round trips. TLS 1.3 also supports 0-RTT resumption — if you've connected before, you can send application data in the very first packet. Slight security trade-off (replay attacks), but most CDNs use it for performance.
Certificate validation involves checking the cert chain up to a trusted CA root, and optionally querying OCSP for revocation status. Certificate transparency logs are checked too — Chrome requires it.
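Python's `ssl` module exposes these same validation steps. `create_default_context()` configures a client the way a browser behaves: verify the chain against trusted CA roots and check the hostname. A sketch (the commented-out connection assumes network access, so it isn't run here):

```python
import ssl

# A client-side TLS context with browser-like defaults: chain
# verification against the system CA store plus hostname checking.
ctx = ssl.create_default_context()
ctx.minimum_version = ssl.TLSVersion.TLSv1_2  # refuse anything older

print(ctx.check_hostname)   # hostname must match the certificate
print(ctx.verify_mode)      # certificate chain must validate

# The actual one-RTT TLS 1.3 handshake wraps a connected TCP socket:
# import socket
# with socket.create_connection(("example.com", 443)) as tcp:
#     with ctx.wrap_socket(tcp, server_hostname="example.com") as tls:
#         print(tls.version(), tls.cipher())  # e.g. TLSv1.3 + AES-GCM
```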
HTTP request: what you actually send
With an encrypted TCP connection open, the browser sends an HTTP request:
```
GET /products HTTP/1.1
Host: example.com
Connection: keep-alive
Accept: text/html,application/xhtml+xml
Accept-Encoding: gzip, br
Cookie: session=abc123; pref=dark
Cache-Control: max-age=0
```

A few things worth noting here. `Host` is mandatory in HTTP/1.1 — it's how virtual hosting works (one server, many domains). `Accept-Encoding: br` means the client supports Brotli compression, which is 15–25% better than gzip on text. And cookies travel with every request to the same origin — a major reason third-party cookies are a privacy concern.
With HTTP/2, this request is binary-framed and multiplexed — multiple requests can be in-flight over a single connection with no head-of-line blocking at the HTTP layer.
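The request above can be reproduced with the standard library's `http.client`. To keep the round trip self-contained, this sketch spins up a tiny local server as a stand-in for the origin (handler and body are illustrative):

```python
import http.client
import http.server
import threading

class Handler(http.server.BaseHTTPRequestHandler):
    """Stand-in origin: answers every GET with a small HTML body."""
    def do_GET(self):
        body = b"<html>ok</html>"
        self.send_response(200)
        self.send_header("Content-Type", "text/html; charset=UTF-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)
    def log_message(self, *args):  # silence per-request logging
        pass

server = http.server.HTTPServer(("127.0.0.1", 0), Handler)
threading.Thread(target=server.serve_forever, daemon=True).start()
port = server.server_address[1]

# Same headers the article shows; explicit Host demonstrates the
# virtual-hosting selector (http.client would otherwise fill it in).
conn = http.client.HTTPConnection("127.0.0.1", port)
conn.request("GET", "/products", headers={
    "Host": "example.com",
    "Accept": "text/html,application/xhtml+xml",
    "Accept-Encoding": "gzip, br",
    "Cookie": "session=abc123; pref=dark",
})
resp = conn.getresponse()
print(resp.status, resp.reason)
html = resp.read()
conn.close()
server.shutdown()
```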
Server-side: what happens in the black box
The request hits the server. In a typical production setup, it first lands at a load balancer (nginx, HAProxy, AWS ALB) which distributes traffic across a fleet of app servers. The load balancer also terminates TLS so backend servers don't carry that overhead.
The app server (Node.js, Rails, Django, Go — whatever) receives the HTTP request and runs your business logic. That usually means:
- Validating session/auth (JWT check, cookie lookup)
- Querying a database (Postgres, MySQL) for dynamic data
- Hitting a cache layer (Redis, Memcached) to avoid the DB entirely on hot paths
- Calling downstream microservices or external APIs
A cache hit on Redis for a hot endpoint can take the DB out of the path entirely — going from 50ms to 1ms for data retrieval. This is where the biggest performance wins usually live.
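The pattern behind that win is cache-aside: check the cache, fall through to the database on a miss, then populate the cache. A sketch with a plain dict standing in for Redis and a slow function for Postgres (all names are illustrative, not a real client API):

```python
import time

cache = {}          # stand-in for Redis: key -> (stored_at, value)
CACHE_TTL_S = 60.0  # expire entries after a minute

def query_db(product_id):
    """Simulate a ~50 ms database round trip."""
    time.sleep(0.05)
    return f"product:{product_id}"

def get_product(product_id):
    """Cache-aside read: ~1 ms on a hit, ~50 ms on a miss."""
    key = f"product:{product_id}"
    hit = cache.get(key)
    if hit and time.monotonic() - hit[0] < CACHE_TTL_S:
        return hit[1]                      # hot path: DB never touched
    value = query_db(product_id)           # cold path
    cache[key] = (time.monotonic(), value) # populate for next time
    return value

get_product("42")            # cold read, pays the DB cost
start = time.perf_counter()
warm_value = get_product("42")  # warm read, served from cache
warm_ms = (time.perf_counter() - start) * 1000
print(f"warm read: {warm_ms:.3f} ms")
```

With a real Redis you'd also set a TTL on the key (`SETEX`) so stale data ages out server-side instead of in application code.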
HTTP response: what comes back
The server returns:
```
HTTP/1.1 200 OK
Content-Type: text/html; charset=UTF-8
Content-Encoding: br
Cache-Control: public, max-age=3600
ETag: "abc123"
Strict-Transport-Security: max-age=31536000

<html>...
```

Key headers here: `Cache-Control` tells the browser how long to cache this. `ETag` lets the browser send a conditional `If-None-Match` on the next request — if unchanged, the server returns a `304 Not Modified` with no body. HSTS tells the browser to never connect over plain HTTP again. The body is typically Brotli-compressed HTML.
Browser rendering: turning bytes into pixels
The HTML arrives and the browser's rendering pipeline kicks in:
- HTML Parser → DOM — bytes into a tree of nodes
- CSS Parser → CSSOM — stylesheets into a cascade-resolved rule tree
- Render tree — DOM + CSSOM merged; only visible nodes
- Layout (Reflow) — every node gets a position and size on the viewport
- Paint — rasterize each layer into pixel bitmaps
- Compositing — GPU assembles layers, handles transforms and opacity
A synchronous `<script>` halts HTML parsing while it downloads and executes (the script may mutate the DOM the parser is building), which is why `<script defer>` and `<script async>` matter. CSS in `<link>` blocks rendering until the stylesheet is downloaded and parsed — render-blocking by design.
First Contentful Paint (FCP) is when the first pixel appears. Largest Contentful Paint (LCP) is the main performance metric for real user experience. Both are dominated by how fast the server responds and how heavy the critical rendering path is.
Numbers that matter
| Phase | Typical time (warm cache, same continent) |
|---|---|
| DNS lookup | 0ms (cached) to 100ms |
| TCP handshake | 1× RTT (10–150ms) |
| TLS handshake (1.3) | 1× RTT |
| HTTP request + response | 1× RTT + server time |
| Browser rendering | 50–300ms |
| Total TTFB | ~80–400ms |
A fast site hits FCP under 1 second end-to-end. CDN edge nodes sitting close to users, HTTP/3, aggressive caching, and a lean rendering path are the main levers.
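The table above is just addition. A quick budget calculation, assuming the 60 ms RTT figure used later for 4G and an assumed 50 ms of server time and 40 ms cold DNS lookup:

```python
# Cold vs warm TTFB from the per-phase costs in the table.
rtt_ms = 60     # assumed mobile/4G round-trip time
dns_ms = 40     # cold recursive lookup (within the 20-120 ms range)
server_ms = 50  # assumed server processing time

tcp = rtt_ms                 # three-way handshake: 1 RTT
tls = rtt_ms                 # TLS 1.3 handshake: 1 RTT
http = rtt_ms + server_ms    # request/response: 1 RTT + server time

cold_ttfb = dns_ms + tcp + tls + http
warm_ttfb = http             # DNS cached, connection kept alive
h3_ttfb = dns_ms + rtt_ms + http  # QUIC: transport + TLS in one RTT

print(f"cold: {cold_ttfb} ms, warm: {warm_ttfb} ms, cold HTTP/3: {h3_ttfb} ms")
```

Connection reuse alone removes two round trips here, which is exactly the lever the failure-points list below keeps returning to.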
Where things go wrong (common failure points)
- DNS TTL too low — every user pays full DNS lookup time on every visit
- No HSTS preloading — first request goes over HTTP, redirects to HTTPS, wastes a round trip
- Render-blocking JS — sync scripts halt HTML parsing; defer everything you can
- No connection reuse — HTTP/1.1 without keep-alive opens a new TCP+TLS per request
- Uncached DB queries on hot paths — Redis exists for a reason
- Large uncompressed assets — Brotli your text, serve WebP, lazy-load below-the-fold images
Key takeaways
The journey from URL to rendered page is a layered protocol stack — DNS, TCP, TLS, HTTP, server logic, rendering engine — each adding latency that compounds. The systems that feel instant have typically eliminated every redundant round trip: DNS cached at the edge, TLS 1.3 with 0-RTT, HTTP/3, CDN-cached assets, aggressive browser caching, and server-side Redis for hot data.
Every millisecond of TTFB you cut translates directly to user perception. On mobile over 4G with 60ms RTT, a two-round-trip savings is 120ms — the difference between "fast" and "noticeable."
