How does global load balancing work?

Question

How does a highly scaled company serve users on the opposite end of the world?

Take Facebook for example (headquartered in California). If someone hit their endpoints from a place like South Africa, how are they routed to the nearest webservers?

I thought the load balancer would do this, but that doesn't make sense to me: how do we make sure the load balancer itself isn't a 400 ms round trip away as well?

The answer is less about system architecture than about various bits of network technology – such as resolving the same domain name to different IP addresses in different locations, or routing the same IP address to different servers (a kind of [Anycast](https://en.wikipedia.org/wiki/Anycast)). — amon, Sep 03 '17 at 10:35

score 3 · Accepted Answer · answered Sep 03 '17 at 12:14


This is essentially how CDNs work. When a person in South Korea requests a given resource, the DNS server replies with an IP address of a reverse proxy located in Seoul. Another user from California will get a different IP address which may point to a datacenter in Oregon, and a user from Spain may get another IP which leads to a datacenter in Frankfurt.
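As a rough illustration of the geo-aware DNS step described above, here is a minimal sketch that picks the point of presence closest to a client's coordinates. Everything in it is an assumption for illustration: the `POPS` table, the coordinates, the `resolve` function, and the IP addresses (taken from the reserved documentation ranges). Real geo-DNS services typically estimate location from the resolver's or client's IP address and may weigh measured latency and server health, not just distance.

```python
import math

# Hypothetical points of presence: (latitude, longitude, IP address).
# The IPs come from reserved documentation ranges, not real servers.
POPS = {
    "seoul": (37.57, 126.98, "192.0.2.10"),
    "oregon": (45.84, -119.70, "198.51.100.10"),
    "frankfurt": (50.11, 8.68, "203.0.113.10"),
}


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in kilometres."""
    earth_radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * earth_radius_km * math.asin(math.sqrt(a))


def resolve(client_lat, client_lon):
    """Return (pop_name, ip) for the point of presence nearest the client."""
    name = min(
        POPS,
        key=lambda n: haversine_km(client_lat, client_lon, POPS[n][0], POPS[n][1]),
    )
    return name, POPS[name][2]
```

With these assumed coordinates, a client near Busan resolves to the Seoul entry, one in San Francisco to Oregon, and one in Madrid to Frankfurt.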

While CDNs serve static content, things get more complicated with dynamic content, where a user from Korea will want to get content generated by his friend from Spain, while expecting the HTTP request to take only a few milliseconds.

One technique is to replicate dynamic content across multiple datacenters around the world; a change made by a user in California may be stored not only on the servers in Oregon, but also synchronized to Seoul and Frankfurt.
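The replication idea can be sketched roughly as follows. This is a toy in-memory model, not how any particular company does it: the `ReplicatedStore` class and its method names are hypothetical, and the copy to other regions happens synchronously here, whereas a real system would replicate over the network, usually asynchronously, and would have to handle conflicts and failures.

```python
class ReplicatedStore:
    """Toy model: every region holds a full copy of the data."""

    def __init__(self, regions):
        # One independent dictionary per region stands in for a datacenter.
        self.replicas = {region: {} for region in regions}

    def write(self, home_region, key, value):
        # Store the change in the writer's nearest region first...
        self.replicas[home_region][key] = value
        # ...then push it to every other region (synchronous here for simplicity).
        for region, store in self.replicas.items():
            if region != home_region:
                store[key] = value

    def read(self, region, key):
        # Reads are served entirely from the local region's copy.
        return self.replicas[region].get(key)
```

After `store.write("oregon", "post:1", "hello")`, a read from `"seoul"` or `"frankfurt"` is answered locally without contacting Oregon.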

Another approach is to keep the content in a few data centers that are geographically close to each other (say, in North Virginia and in Oregon), but also to proxy the content through data centers which are closer to the users. In other words, the first person from South Korea who requests a given piece of data will have to wait a few hundred milliseconds for the Seoul data center to fetch the data from the US servers, but subsequent requests from other Korean users will simply load the data cached by the Seoul servers.
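A minimal sketch of that pull-through behaviour, assuming a single origin and a single edge location: the `Origin` and `EdgeCache` classes are hypothetical, and the sketch ignores expiry and invalidation, which a real caching proxy would need.

```python
class Origin:
    """Stands in for the far-away servers that own the data."""

    def __init__(self, data):
        self.data = data
        self.fetches = 0  # counts how often the slow long-distance path is used

    def fetch(self, key):
        self.fetches += 1
        return self.data[key]


class EdgeCache:
    """Stands in for a data center close to the users (e.g. Seoul)."""

    def __init__(self, origin):
        self.origin = origin
        self.cache = {}

    def get(self, key):
        if key in self.cache:
            # Later requests are answered locally.
            return self.cache[key], "HIT"
        # First request: pay the long round trip once, then remember the result.
        value = self.origin.fetch(key)
        self.cache[key] = value
        return value, "MISS"
```

The first `get` for a key reports `"MISS"` and contacts the origin; every later `get` for the same key reports `"HIT"` and leaves `origin.fetches` unchanged.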

answered Sep 03 '17 at 12:14

Arseni Mourzenko


Exactly what I was looking for! Thanks! Could you help me understand the tradeoffs between those two techniques you mentioned? – calmthatwombat Sep 04 '17 at 00:32
The two approaches are mostly equivalent. The benefit of the replication approach is that data change propagation occurs in real-time; global servers are sent the changes as they happen. The downside of the replication approach is that it requires the content servers to be able to detect and propagate changes to the CDNs. Configuring a replication approach requires installing/running special software on the content servers, and setup is probably different for every OS. The proxy approach is turnkey: A) Point static content domain at CDN and B) Point CDN at content machine. – Brian Sep 05 '17 at 14:18
Generally, most sites tend to favor the proxy approach, since it's incredibly turnkey. The downsides are relatively mild. No realtime propagation is mostly OK since new content propagates almost in real time (the first request is slower). Longer cache times can be mitigated by providing support to explicitly invalidate the cache, or at the server level via cache-busting query parameters. The latter is not turn-key, but engineers can implement it later (or can invalidate caches manually, or can set cache times to under an hour). – Brian Sep 05 '17 at 14:21
Is this geo-aware DNS service offered by anyone who is not a CDN? For example, I want to host my own content and I don't want to distribute it via a CDN; I just want to have multiple sites across the globe and have the DNS always return the IP closest to the requester. – perrohunter Sep 14 '21 at 03:31
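
The cache-busting query parameters mentioned in the comments above could look roughly like this. It is only a sketch under assumptions: the `cache_busted_url` helper, the `v` parameter name, and the example domain are all made up for illustration, and it relies on the cache treating the query string as part of its cache key, which depends on how the CDN is configured.

```python
import hashlib
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def cache_busted_url(url, content, param="v"):
    """Append a short hash of the content so changed content gets a new URL."""
    digest = hashlib.sha256(content).hexdigest()[:8]
    parts = urlsplit(url)
    # Keep existing query parameters, but drop any stale version parameter.
    query = [
        (k, v)
        for k, v in parse_qsl(parts.query, keep_blank_values=True)
        if k != param
    ]
    query.append((param, digest))
    return urlunsplit(parts._replace(query=urlencode(query)))
```

Because the URL changes whenever the content does, a cache that keys on the full URL treats the new version as a different object and fetches it from the content server instead of serving the stale copy.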
