As elegant as IP addresses may be, human beings do not enjoy having to recall long strings of numbers. One can imagine how unpleasant the Internet would be if you had to remember IP addresses instead of domains. Rather than google.com, you’d have to type 18.104.22.168. If you had to type in 22.214.171.124 to visit Facebook, it is quite likely that social networking would be a less popular pastime.
Even as far back as the days of ARPANET, researchers assigned domain names to IP addresses. In those early days, the number of Internet hosts was small, so a list of a few hundred domain names and IP addresses could be downloaded as needed from the Stanford Research Institute (now SRI International) as a hosts file. Those key-value pairs of domain names and IP addresses allowed people to use the domain name rather than the IP address.
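To make the key-value idea concrete, here is a minimal sketch of a hosts-file lookup. It assumes the "address first, then one or more names" layout used by present-day hosts files, which is not necessarily the format SRI distributed, and every name and address in it is made up for illustration.

```python
# Hypothetical hosts-file contents. The names are invented and the
# addresses come from 192.0.2.0/24, a range reserved for documentation.
HOSTS_TEXT = """\
# address      name(s)
192.0.2.10     host-a.example
192.0.2.20     host-b.example  alias-b.example
"""


def parse_hosts(text):
    """Return a dict mapping each host name to its IP address."""
    table = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue  # skip blank and comment-only lines
        address, *names = line.split()
        for name in names:
            table[name.lower()] = address  # names are case-insensitive
    return table


hosts = parse_hosts(HOSTS_TEXT)
print(hosts["host-b.example"])
```

Every machine needed its own up-to-date copy of this table, which is the scaling problem that the next paragraph describes.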
As the number of computers on the Internet grew, this hosts file had to be replaced with a better, more scalable, and distributed system. This system is called the Domain Name System (DNS).
DNS is one of the core systems that make an easy-to-use Internet possible (DNS is used for email as well). The DNS system has another benefit besides ease of use. By separating the domain name of a server from its IP location, a site can move to a different location without changing its name. This means that sites and email systems can move to larger and more powerful facilities without disrupting service.
Since the entire request-response cycle can take less than a second, it is easy to forget that DNS requests are happening in all your web and email applications. A working understanding of DNS is essential for developing, securing, deploying, troubleshooting, and maintaining web systems.
A domain name can be broken down into several parts. They represent a hierarchy, with the rightmost parts being closest to the root at the “top” of the Internet naming hierarchy. All domain names have at least a top-level domain (TLD) name and a second-level domain (SLD) name. Most websites also maintain a third-level WWW subdomain and perhaps others.
The rightmost portion of the domain name (to the right of the rightmost period) is called the top-level domain. For the top level of a domain, we are limited to two broad categories, plus a third reserved for other use. They are:
Generic top-level domain (gTLD)
Unrestricted. TLDs include .com, .net, .org, and .info.
Sponsored. TLDs include .gov, .mil, .edu, and others. These domains can have requirements for ownership, so anyone registering a new second-level domain must first obtain permission from the sponsor.
New. From January to May of 2012, companies and individuals could submit applications for new TLDs. TLD application results were announced in June 2012, and include a wide range of both contested and single-applicant domains. These include corporate ones like .apple, .google, and .mcdonalds, and contested ones like .buy, .news, and .music.
Country code top-level domain (ccTLD)
TLDs include .us, .ca, .uk, and .au. At the time of writing, there were 252 codes registered. These codes are under the control of the countries that they represent, which is why each is administered differently. In the United Kingdom, for example, commercial entities and businesses must register subdomains of co.uk rather than second-level domains directly. In Canada, .ca domains can be obtained by any person, company, or organization living or doing business in Canada. Other countries have peculiar extensions with commercial viability (such as .tv for Tuvalu) and have begun allowing unrestricted use to generate revenue.
Since some nations use non-Latin characters in their native languages, the concept of the internationalized top-level domain name (IDN) has also been tested with great success in recent years. Some IDNs include Greek, Japanese, and Arabic domains (among others), which have test domains at http://παράδειγμα.δοκιμή, http://例え.テスト, and http://مثال.إختبار, respectively.
The domain .arpa was the first assigned top-level domain. It is still assigned and used for reverse DNS lookups (i.e., finding the domain name of an IP address).
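As a small illustration of how .arpa supports reverse lookups, the sketch below builds the name that a reverse DNS query would ask about. It only constructs the name and sends no query; the helper name reverse_lookup_name and the sample address (from a range reserved for documentation) are chosen here for illustration.

```python
import ipaddress


def reverse_lookup_name(address):
    """Return the .arpa name used for a reverse DNS lookup of an address."""
    return ipaddress.ip_address(address).reverse_pointer


# The octets appear in reverse order under the in-addr.arpa domain.
print(reverse_lookup_name("192.0.2.1"))  # 1.2.0.192.in-addr.arpa
```

IPv6 addresses are handled the same way by the standard library, except that the resulting name ends in ip6.arpa.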
In a domain like testsite.com, the “.com” is the top-level domain and "testsite" is called the second-level domain. Normally it is the second-level domains that one registers.
There are few restrictions on second-level domains aside from those imposed by the registrar (defined in the next section below). Except for internationalized domain names, we are restricted to the characters A-Z, 0-9, and the “-” character. Since domain names are case-insensitive, the characters a-z can be used interchangeably with A-Z.
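The character rule can be checked mechanically. The following is a rough sketch, not a registrar’s actual validation: the letters, digits, and hyphen come from the paragraph above, while the 63-character limit and the ban on leading or trailing hyphens are extra assumptions drawn from common DNS practice.

```python
import re

# Letters, digits, and hyphens only. The length limit and the hyphen
# placement rule are assumptions, not rules stated in the text.
LABEL_RE = re.compile(r"(?!-)[A-Za-z0-9-]{1,63}(?<!-)")


def is_valid_label(label):
    """Return True if label uses only the permitted ASCII characters."""
    return LABEL_RE.fullmatch(label) is not None


for candidate in ["testsite", "TestSite", "test-site", "test_site", "-testsite"]:
    print(candidate, is_valid_label(candidate))
```

Internationalized names are deliberately out of scope here, matching the exception noted above.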
The owner of a second-level domain can elect to have subdomains if they so choose, in which case those subdomains are prepended to the base hostname. For example, we can create testing.testsite.com as a domain name, where "testing" is the subdomain.
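The naming levels described so far can be pulled apart with a few lines of code. This is a deliberately naive sketch: it treats the last label as the TLD and the one before it as the second-level domain, so it would misread names under ccTLD conventions such as co.uk. The function name split_domain is hypothetical.

```python
def split_domain(name):
    """Split a domain name into its TLD, second-level domain, and subdomains."""
    labels = name.rstrip(".").lower().split(".")
    if len(labels) < 2 or not all(labels):
        raise ValueError(f"not a registrable domain name: {name!r}")
    # Read from the right: the rightmost label is closest to the root.
    *subdomains, sld, tld = labels
    return {"tld": tld, "sld": sld, "subdomains": subdomains}


print(split_domain("testing.testsite.com"))
# {'tld': 'com', 'sld': 'testsite', 'subdomains': ['testing']}
```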
As we have seen, domain names provide a human-friendly way to identify computers on the Internet. How then are domain names assigned? Special organizations or companies called domain name registrars manage the registration of domain names. These domain name registrars are given permission to do so by the appropriate generic top-level domain (gTLD) registry and/or a country code top-level domain (ccTLD) registry.
In the 1990s, a single company (Network Solutions Inc.) handled the .com, .net, and .org registries. By 1999, the name registration system changed to a market system in which multiple companies could compete in the domain name registration business. A single organization, the nonprofit Internet Corporation for Assigned Names and Numbers (ICANN), still oversees the management of top-level domains, accredits registrars, and coordinates other aspects of DNS. At the time of writing this chapter, there were almost 1000 different ICANN-accredited registrars worldwide.
While domain names are certainly an easier way for users to reference a website, eventually your browser needs to know the IP address of the website in order to request any resources from it. DNS provides a mechanism for software to discover this numeric IP address. This process is referred to here as address resolution.
When you request a domain name, a computer called a domain name server will return the IP address for that domain. With that IP address, the browser can then make a request for a resource from the web server for that domain.
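From a program’s point of view, all of this is usually hidden behind a single call to the operating system. The sketch below uses Python’s standard socket.getaddrinfo; the wrapper name resolve_name is hypothetical, and "localhost" is used in the demonstration only so that it runs without network access.

```python
import socket


def resolve_name(hostname, port=80):
    """Ask the operating system for the IP addresses of hostname."""
    infos = socket.getaddrinfo(hostname, port, proto=socket.IPPROTO_TCP)
    # Each entry ends with a socket address whose first item is the IP.
    return sorted({info[4][0] for info in infos})


try:
    print(resolve_name("localhost"))
except socket.gaierror as error:
    # Raised when no address can be found for the name.
    print("lookup failed:", error)
```

Passing a real domain name instead of "localhost" would trigger the resolution process examined in the rest of this article.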
DNS is sometimes referred to as a distributed database system of name servers. Each server in this system can answer, or look for the answer to, questions about domains, caching results along the way. From a client’s perspective, this is like a phonebook, mapping a unique name to a number. Let’s examine the address resolution process in more detail.
The resolution process starts at the user’s computer. When the domain www.testsite.com is requested (perhaps by clicking a link or typing in a URL), the browser will begin by seeing if it already has the IP address for the domain in its cache.
If the browser doesn’t know the IP address for the requested site, it will delegate the task to the DNS resolver, a software agent that is part of the operating system. The DNS resolver also keeps a cache of frequently requested domains.
If the resolver’s cache does not hold the domain either, it must ask for outside help, which in this case is a nearby DNS server, a special server that processes DNS requests. This might be a computer at your Internet service provider (ISP) or at your university or corporate IT department. The address of this local DNS server is usually stored in the network settings of your computer’s operating system. This server keeps a more substantial cache of domain name/IP address pairs. If the requested domain is in its cache, the server can return the IP address to the DNS resolver right away, skipping the lookups described next.
If the local DNS server doesn’t have the IP address for the domain in its cache, then it must ask other DNS servers for the answer. Thankfully, the domain system has a great deal of redundancy built into it. This means that in general there are many servers that have the answers for any given DNS request. This redundancy exists not only at the local level but at the global level as well.
If the local DNS server cannot find the answer to the request from an alternate DNS server, it must get it from the appropriate top-level domain (TLD) name server. For testsite.com this is .com. Our local DNS server might already have a list of the addresses of the appropriate TLD name servers in its cache.
If the local DNS server does not already know the address of the requested TLD server (for instance, when the local DNS server is first starting up, it won’t have this information), then it must ask a root name server for that information. The DNS root name servers store the addresses of TLD name servers. IANA (Internet Assigned Numbers Authority) authorizes 13 root servers, so all root requests will go to one of these 13 roots. In practice, these 13 machines are mirrored and distributed around the world (see http://www.root-servers.org/ for an interactive illustration of the current root servers); at the time of writing there are a total of 350 root server machines. With the creation of new commercial top-level domains in 2012, approximately 2000 new TLDs will be coming online; this will create a heavier load on these root name servers.
After receiving the address of the TLD name server for the requested domain, the local DNS server can now ask the TLD name server for the address of the requested domain. As part of the domain registration process, the addresses of the domain’s DNS servers are sent to the TLD name servers, so this is the information that is returned to the local DNS server.
The user’s local DNS server can now ask the DNS server (also called a second-level name server) for the requested domain (www.testsite.com); it should receive the correct IP address of the web server for that domain. This address will be stored in its own cache so that future requests for this domain will be speedier. That IP address can finally be returned to the DNS resolver in the requesting computer.
The browser will eventually receive the correct IP address for the requested domain. Note: If the local DNS server was unable to find the IP address, it would return a failed response, which in turn would cause the browser to display an error message.
Now that it knows the desired IP address, the browser can finally send out the request to the web server, which should result in the web server responding with the requested resource.
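The whole walk-through can be condensed into a toy simulation. This is only a sketch of the lookup order described above, using in-memory dictionaries in place of real servers; every server name and address is invented, and real DNS messages, redundancy, and timeouts are ignored.

```python
# Toy data standing in for real servers (all names and addresses made up).
ROOT = {"com": "tld-server-com"}                           # TLD -> TLD server
TLD_SERVERS = {"tld-server-com": {"testsite.com": "ns-testsite"}}
DOMAIN_SERVERS = {"ns-testsite": {"www.testsite.com": "192.0.2.80"}}


class LocalDnsServer:
    """Stands in for the DNS server at an ISP or IT department."""

    def __init__(self):
        self.cache = {}      # domain name -> IP address
        self.tld_cache = {}  # TLD -> TLD name server
        self.queries = []    # which outside servers were asked, in order

    def lookup(self, name):
        if name in self.cache:
            return self.cache[name]
        labels = name.split(".")
        tld, domain = labels[-1], ".".join(labels[-2:])
        if tld not in self.tld_cache:
            self.queries.append("root")  # ask a root name server
            if tld not in ROOT:
                return None
            self.tld_cache[tld] = ROOT[tld]
        self.queries.append("tld")  # ask the TLD name server
        domain_server = TLD_SERVERS[self.tld_cache[tld]].get(domain)
        if domain_server is None:
            return None
        self.queries.append("domain")  # ask the domain's own DNS server
        address = DOMAIN_SERVERS[domain_server].get(name)
        if address is not None:
            self.cache[name] = address  # speed up future requests
        return address


def resolve_with_caches(name, browser_cache, resolver_cache, server):
    """Check the browser cache, then the OS resolver, then the DNS server."""
    for cache in (browser_cache, resolver_cache):
        if name in cache:
            return cache[name]
    address = server.lookup(name)
    if address is None:
        raise LookupError(f"could not resolve {name}")  # browser shows an error
    resolver_cache[name] = address
    browser_cache[name] = address
    return address


server = LocalDnsServer()
browser_cache, resolver_cache = {}, {}
print(resolve_with_caches("www.testsite.com", browser_cache, resolver_cache, server))
print(server.queries)  # ['root', 'tld', 'domain']
resolve_with_caches("www.testsite.com", browser_cache, resolver_cache, server)
print(server.queries)  # unchanged: the second request was served from cache
```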
This process may seem overly complicated, but in practice it happens very quickly because DNS servers cache results. Once the server resolves testsite.com, subsequent requests for resources on testsite.com will be faster, since we can use the locally stored answer for the IP address rather than have to start over again at the root servers.
To facilitate system-wide caching, all DNS records contain a time to live (TTL) field, recommending how long to cache the result before requerying the name server. Although this mechanism improves the efficiency and response time of the DNS system, it has a consequence of delaying propagation of changes throughout all servers. This is why administrators, after updating a DNS entry, must wait for propagation to all client ISP caches.
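The effect of the TTL on a cache can be sketched in a few lines. The class below is an illustrative assumption, not how any particular DNS server implements its cache; the clock is injectable so that the example can advance time by hand instead of waiting.

```python
import time


class TtlCache:
    """A tiny cache whose entries expire after their time to live."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._entries = {}  # name -> (address, expiry time)

    def put(self, name, address, ttl_seconds):
        self._entries[name] = (address, self._clock() + ttl_seconds)

    def get(self, name):
        entry = self._entries.get(name)
        if entry is None:
            return None
        address, expires_at = entry
        if self._clock() >= expires_at:
            del self._entries[name]  # stale: the name server must be requeried
            return None
        return address


now = [0.0]  # hand-operated clock for the demonstration
cache = TtlCache(clock=lambda: now[0])
cache.put("www.testsite.com", "192.0.2.80", ttl_seconds=300)

now[0] = 299.0
print(cache.get("www.testsite.com"))  # 192.0.2.80 (still cached)
now[0] = 300.0
print(cache.get("www.testsite.com"))  # None (expired)
```

Until an entry expires, the cache keeps serving the old address even if the record has changed at the name server, which is the propagation delay described above.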
I hope that this thorough explanation of the domain name system, domain naming levels, registration, and the address resolution process has answered any questions that you may have about how the domain name system works.
Published on Tue 27 March 2012 by Adi Wagstaff in Networking with tag(s): dns