Traffic Cops of the Internet
When you pursue Horizontal Scaling, you need a way to distribute incoming traffic across your fleet of servers. A Load Balancer (LB) sits between your users and your servers, acting as the single entry point. Users connect to www.example.com (the LB's IP), and the LB transparently routes their packets to healthy backend machines.
Real-World Example: GitHub
GitHub receives billions of Git operations per day. Behind github.com, an HAProxy load balancer distributes requests across hundreds of backend servers. When GitHub's engineering team deploys new code, they use the load balancer to gradually shift traffic from old servers to new ones—if the new code has a bug, only 1% of users are affected, and the rollback is instant. This pattern is called a canary deployment.
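The traffic split behind a canary deployment can be sketched as a weighted random choice. This is a minimal illustration, not GitHub's actual implementation; the pool names and the 1% fraction are assumptions for the example.

```python
import random

def choose_pool(canary_fraction: float = 0.01) -> str:
    """Route a small, configurable fraction of requests to the canary pool.

    'canary' holds the newly deployed servers; 'stable' holds the old ones.
    If the canary misbehaves, setting canary_fraction back to 0 is the
    instant rollback described above.
    """
    return "canary" if random.random() < canary_fraction else "stable"
```

In practice the fraction is ramped up gradually (1% → 10% → 50% → 100%) as the new code proves itself healthy.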
L4 vs L7 Load Balancing
Layer 4 (Transport Layer)
L4 load balancers route based on IP addresses and TCP/UDP ports. They're fast because they don't inspect the HTTP payload—they just forward raw packets. Think of them as a highway toll booth that routes cars to different lanes without checking what's inside.
• Example: AWS Network Load Balancer (NLB) handles millions of requests per second with single-digit millisecond latency.
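The key property of L4 routing is that the backend is chosen from the connection's network identity alone—source IP and port—never from the payload. A minimal sketch of that selection logic (the backend addresses are hypothetical):

```python
import hashlib

# Hypothetical backend pool; a real L4 balancer would track health, too.
backends = ["10.0.1.10", "10.0.1.11", "10.0.1.12"]

def pick_backend(src_ip: str, src_port: int) -> str:
    """Pick a backend by hashing the connection's IP and port.

    The request body is never read, which is exactly why L4
    balancing is so fast. Hashing also keeps a given connection
    pinned to the same backend for its lifetime.
    """
    key = f"{src_ip}:{src_port}".encode()
    digest = int(hashlib.sha256(key).hexdigest(), 16)
    return backends[digest % len(backends)]
```

Because the choice is a pure function of the 4-tuple, every packet of a connection lands on the same server without the balancer keeping per-request state.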
Layer 7 (Application Layer)
L7 load balancers understand HTTP and can route based on URL paths, headers, cookies, and even request bodies. This is far more powerful but adds processing overhead.
• Example: NGINX can route /api/* requests to your backend servers and /static/* requests to a CDN origin—all from a single entry point. Shopify uses this pattern to separate their storefront traffic from admin dashboard traffic.
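The path-based routing that NGINX performs boils down to prefix matching against an ordered rule table. A minimal sketch—upstream names here are illustrative, not a real NGINX configuration:

```python
# Ordered (prefix, upstream) rules; first match wins,
# mirroring how an L7 balancer evaluates location rules.
ROUTES = [
    ("/api/", "backend-pool"),   # hypothetical upstream names
    ("/static/", "cdn-origin"),
]

def route(path: str) -> str:
    """Return the upstream for the first matching path prefix."""
    for prefix, upstream in ROUTES:
        if path.startswith(prefix):
            return upstream
    return "default-pool"  # fallback when nothing matches
```

Note that making this decision requires parsing the HTTP request far enough to extract the path—the processing overhead that L4 balancers avoid.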
Routing Algorithms
How does the Load Balancer decide which server receives the next request?
- Round Robin
The simplest approach: rotate through…
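Round robin can be sketched in a few lines: cycle through the server list and hand each new request to the next server in order. The addresses below are placeholders.

```python
from itertools import cycle

# Hypothetical backend addresses for illustration.
servers = ["10.0.0.1:8080", "10.0.0.2:8080", "10.0.0.3:8080"]
pool = cycle(servers)  # endless iterator over the list, in order

def next_server() -> str:
    """Return the next backend in rotation."""
    return next(pool)
```

Each server receives an equal share of requests regardless of how loaded it is—which is both round robin's simplicity and its main limitation.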