DNS: THE DOMAIN NAME SYSTEM
Introduction
An Internet Protocol (IP) address is a 32-bit integer. To send a message,
the sender must include the destination address, but people prefer
to assign machines pronounceable, easily remembered names (host names).
The Domain Name System exists for this reason. These logical names also free users from having to know the physical location of a host: a host may be moved to a different network while its users continue to use the same logical name.
The Domain Name System (DNS) is a distributed database used by TCP/IP
applications to map between hostnames and IP addresses, and to provide
electronic mail routing information. Each site (university department,
campus, company, or department within a company, for example) maintains
its own database of information and runs a server program that other
systems across the Internet can query. The DNS provides the protocol which
allows clients and servers to communicate with each other.
The system accesses the DNS through a resolver. The resolver takes a hostname
and returns the IP address, or takes an IP address and looks up the hostname (fig.1).
As fig.1 shows, the resolver must return the IP address before the application
can ask TCP to open a connection or send a datagram using UDP.
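As a rough illustration of what an application sees, the sketch below calls the system resolver through Python's standard socket module. The helper names name_to_address and address_to_name are hypothetical, and the system resolver may consult local host files as well as the DNS.

```python
import socket


def name_to_address(hostname: str) -> str:
    """Forward lookup: ask the resolver for the IPv4 address of a hostname."""
    return socket.gethostbyname(hostname)


def address_to_name(address: str) -> str:
    """Reverse lookup: ask the resolver for the hostname of an IPv4 address."""
    hostname, _aliases, _addresses = socket.gethostbyaddr(address)
    return hostname
```

An application would call name_to_address first and only then open a TCP connection or send a UDP datagram to the address it gets back.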
DNS Organization
The domain name system uses a hierarchical naming scheme known as domain
names, which is similar to the Unix filesystem tree. The root of the DNS tree
is a special node with a null label. The label of every other node may be
up to 63 characters long. The domain name of any node in the tree is
the list of labels, starting at that node and working up to the root, with
a period ("dot") separating the labels. (Individual sections of a name might
represent sites or groups, but the domain system simply calls each section a
label.) The difference from the Unix filesystem is that a path name starts at
the root and works down, whereas a domain name starts at the node and "goes up"
to the root. Writing the labels in this order makes it possible to compress
messages that contain multiple domain names. Thus, the domain name "tau.ac.il"
contains three labels: "tau", "ac", and "il". Any suffix of the label sequence
in a domain name is also called a domain. In the above example the lowest-level
domain is "tau.ac.il" (the domain name for Tel-Aviv University within the
academic organizations of Israel), the second-level domain is "ac.il" (the
domain name for academic organizations of Israel), and the top-level domain
(for this name) is "il" (the domain name for Israel). In the tree, the node
"il" sits directly below the root (fig.2).
Every node in the tree must have a unique domain name, but the same label
can be used at different points in the tree.
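The relationship between a domain name, its labels, and the domains that contain it can be sketched in a few lines of Python. The function names split_labels and enclosing_domains are hypothetical; the 63-character limit is the one stated above.

```python
def split_labels(domain_name: str) -> list[str]:
    """Split a domain name into its labels, lowest level first."""
    labels = domain_name.rstrip(".").split(".")
    for label in labels:
        if not 1 <= len(label) <= 63:
            raise ValueError(f"label {label!r} must be 1 to 63 characters long")
    return labels


def enclosing_domains(domain_name: str) -> list[str]:
    """Return every suffix of the label list, from the full name up to the top level."""
    labels = split_labels(domain_name)
    return [".".join(labels[i:]) for i in range(len(labels))]


print(split_labels("tau.ac.il"))       # ['tau', 'ac', 'il']
print(enclosing_domains("tau.ac.il"))  # ['tau.ac.il', 'ac.il', 'il']
```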
The top-level domains are divided into three areas:
- 1. arpa is a special domain used for address-to-name mapping.
- 2. The seven 3-character domains, called the generic (or organizational) domains.
- 3. The 2-character domains, based on the country codes. These are called the country (or geographical) domains.
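To illustrate the address-to-name mapping, the sketch below builds the domain name that is looked up for an IPv4 address. It relies on a detail not spelled out in this text: IPv4 addresses live under in-addr.arpa with their four octets written in reverse order. The function name reverse_lookup_name is hypothetical, and 192.0.2.7 is a documentation-only address.

```python
def reverse_lookup_name(ipv4_address: str) -> str:
    """Build the in-addr.arpa domain name used to map an address back to a name."""
    octets = ipv4_address.split(".")
    if len(octets) != 4 or not all(o.isdigit() and int(o) <= 255 for o in octets):
        raise ValueError(f"not a dotted-decimal IPv4 address: {ipv4_address!r}")
    # The octets are reversed so that the most general part of the address
    # ends up nearest the root, just as in an ordinary domain name.
    return ".".join(reversed(octets)) + ".in-addr.arpa"


print(reverse_lookup_name("192.0.2.7"))  # 7.2.0.192.in-addr.arpa
```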
The seven generic domains are depicted in fig.3.
The Internet scheme can accommodate a wide variety of organizations,
and allows each group to choose between geographical or organizational
naming hierarchies. Most sites follow the Internet scheme so they can
attach their TCP/IP installations to the connected Internet without changing
names.
A zone is a subtree of the DNS that is administered separately.
A common zone is a second-level domain, "ac.il" for example. Many
second-level domains in turn divide their zone into smaller zones.
Whenever a new system is installed in a zone, the DNS administrator for the
zone allocates a name and an IP address for the new system and enters these
into the name server's database. A name server is said to have authority
for one zone or multiple zones. Often, the server software executes on a
dedicated processor, and the machine itself is called the name server.
The person responsible for a zone must provide a primary name server for
that zone and one or more secondary name servers. The main difference
between a primary and a secondary is that the primary loads all the
information for the zone from disk files, while the secondaries obtain
all the information from the primary. When a secondary obtains the
information from its primary it is called a zone transfer.
When a new host is added to the zone, the administrator adds the
appropriate information (name and IP address) to a disk file on the system
running the primary. The primary name server is then notified to reread
its configuration files. The secondaries query the primary on a regular
basis (normally every 3 hours), and if the primary contains newer data,
the secondary obtains the new data using a zone transfer.
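The polling behaviour of a secondary can be sketched as follows. This is a toy model, not a real server: the Primary and Secondary classes are hypothetical, and "newer data" is represented by a simple version counter, which is an assumption of this sketch.

```python
from dataclasses import dataclass, field


@dataclass
class Primary:
    version: int = 1                      # assumed counter, bumped on every change
    zone_data: dict = field(default_factory=dict)

    def add_host(self, name: str, address: str) -> None:
        """Model the administrator editing the zone file and reloading the server."""
        self.zone_data[name] = address
        self.version += 1


@dataclass
class Secondary:
    version: int = 0
    zone_data: dict = field(default_factory=dict)

    def refresh(self, primary: Primary) -> bool:
        """Periodic check: copy the whole zone only if the primary has newer data."""
        if primary.version > self.version:
            self.zone_data = dict(primary.zone_data)   # the "zone transfer"
            self.version = primary.version
            return True
        return False
```

In a real deployment refresh would run on a timer (every 3 hours in the example above); here it is called explicitly.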
If a name server does not contain the information requested, it must
contact another name server. Not every server, however, knows how to contact
every other server. Instead, every name server must know how to contact the
root name servers. The root servers in turn know the name and location
(i.e. IP address) of each authoritative name server for all the
second-level domains. There are only a handful of root servers in the world
(thirteen named root servers today), and every primary name server has to
know the address of at least one root server.
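The idea of starting at a root server and following referrals can be sketched with a toy, in-memory model. Everything in the SERVERS table is hypothetical (server names, the host host1.tau.ac.il, and the documentation-range address); no network traffic is involved.

```python
# Each toy server knows some final answers ("records") and which other
# server to refer the client to for a given domain ("delegations").
SERVERS = {
    "root": {"records": {}, "delegations": {"ac.il": "ns.ac.il"}},
    "ns.ac.il": {"records": {}, "delegations": {"tau.ac.il": "ns.tau.ac.il"}},
    "ns.tau.ac.il": {"records": {"host1.tau.ac.il": "192.0.2.1"}, "delegations": {}},
}


def resolve(name: str, start: str = "root", max_hops: int = 10):
    """Follow referrals from `start` until a server holds an answer for `name`."""
    server = start
    path = [server]
    for _ in range(max_hops):
        info = SERVERS[server]
        if name in info["records"]:
            return info["records"][name], path
        # Pick the most specific delegation whose domain is a suffix of the name.
        matches = [d for d in info["delegations"]
                   if name == d or name.endswith("." + d)]
        if not matches:
            raise LookupError(f"{server} has no referral for {name}")
        server = info["delegations"][max(matches, key=len)]
        path.append(server)
    raise LookupError("too many referrals")


print(resolve("host1.tau.ac.il"))
# ('192.0.2.1', ['root', 'ns.ac.il', 'ns.tau.ac.il'])
```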
Corresponding to the name tree of fig.2, we can depict the tree of servers (fig.4).
In practice, an organization often collects information from all of
its sub-zones into a single server. Thus we can depict fig.5, which is
more realistic than fig.4.
Note that the tree in fig.5 shows only how a given server can contact
other servers; it does not indicate physical network connections.
Servers may be located at arbitrary locations on the network. The tree of
servers is therefore a logical connection between servers, which use the
Internet for communication.
DNS Caching
A fundamental property of the DNS is caching. That is, when a name server
receives information about a mapping, it caches that information.
Thus a later query for the same mapping can use the cached result, and not
result in additional queries to other servers. The DNS uses caching to
reduce the cost of lookups.
How does it work?
Every server has a cache of recently used names, together with a record of
where the mapping information for each name was obtained. When a client asks
the server to resolve a certain name, the server does the following:
- It checks whether it has authority for the name. If so, the server answers from its own database and does not need cached information.
- If not, the server checks its cache to see whether the name has been resolved recently. If so, the server reports the cached information to its client.
Consider what happens if a server caches information once and never
refreshes it. Since information about a particular name can change, the server
may end up with incorrect information in its cache.
The Time to Live (TTL) value is used to decide when to age out cached information.
Whenever an authority responds to a request, it includes a TTL value in the
response which specifies how long it guarantees the binding to remain valid.
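The lookup order and TTL-based aging described in this section can be sketched as a small class. This is a minimal model under stated assumptions: the class name CachingServer, the ask_other_server callback (returning an address and a TTL in seconds), and the injectable clock are all hypothetical.

```python
import time


class CachingServer:
    """Toy name server: authoritative data first, then cache, then ask elsewhere."""

    def __init__(self, authoritative, ask_other_server, clock=time.monotonic):
        self.authoritative = authoritative        # name -> address for our own zone
        self.ask_other_server = ask_other_server  # callable: name -> (address, ttl_seconds)
        self.clock = clock
        self.cache = {}                           # name -> (address, expiry time)

    def resolve(self, name):
        # Step 1: names we have authority for never need the cache.
        if name in self.authoritative:
            return self.authoritative[name], "authoritative"
        # Step 2: use a cached binding only while its TTL has not run out.
        entry = self.cache.get(name)
        if entry is not None:
            address, expires_at = entry
            if self.clock() < expires_at:
                return address, "cache"
            del self.cache[name]                  # aged out
        # Step 3: ask another server and remember the answer for ttl seconds.
        address, ttl = self.ask_other_server(name)
        self.cache[name] = (address, self.clock() + ttl)
        return address, "queried"
```

Passing the clock in as a parameter is a design choice of this sketch: it lets the aging behaviour be exercised without actually waiting for a TTL to expire.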
DNS MESSAGE FORMAT
When a user wants to send a message, the user invokes an application program and
supplies the name of the machine with which the application must communicate.
The application program must find the machine's IP address. It passes the
domain name to a local resolver (L.R.) and requests an IP address. The local
resolver checks its cache and:
- If the L.R. has an answer, it returns the answer.
- If the L.R. does not have one, it sends a message to the server.
The server then returns a similar message that contains the answers to the
questions for which the server has bindings. If the server cannot answer, it
responds with information about other servers that the client can
contact.
Fig.6 shows the DNS message format.
Explanation of Fig.6:
- The IDENTIFICATION is set by the client and returned by the server.
- The 16-bit PARAMETER consists of:
  - Bit 0 - QR: 0 means the message is a query, 1 means it is a response.
  - Bits 1-4 - OPCODE:
    - 0 - a standard query (the normal value).
    - 1 - an inverse query.
    - 2 - a server status request.
  - Bit 5 - Authoritative Answer. Set when the name server is authoritative for the domain in the question section.
  - Bit 6 - Truncated. Set if the message was truncated. With UDP this means that the total size of the reply exceeded 512 bytes, and only the first 512 bytes of the reply were returned.
  - Bit 7 - Recursion Desired. This bit can be set in a query and is then returned in the response.
  - Bit 8 - Recursion Available.
  - Bits 9-11 - must be 0.
  - Bits 12-15 - Return Code: 0 - no error, 3 - name error.
- Each of the fields labeled NUMBER OF ... gives a count of the entries in the corresponding section of the message.
- The QUESTION SECTION contains queries for which answers are desired. The client fills in only the question section; the server returns the question and the answers in its response. Each question has a Query Domain Name followed by Query Type and Query Class fields (as depicted in fig.7).
- The ANSWER, AUTHORITY, and ADDITIONAL INFORMATION sections each consist of a set of resource records that describe domain names and mappings. Each resource record describes one name (as depicted in fig.8).
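The header layout described above can be made concrete with a short Python sketch that builds a query message and decodes a PARAMETER word. Several details are assumptions taken from RFC 1035 rather than from this text: bit 0 is the most significant bit, a domain name is sent as length-prefixed labels ending in a zero byte, and the value 1 is used for both the address query type and the Internet class. The function names are hypothetical.

```python
import struct


def encode_name(domain_name: str) -> bytes:
    """Encode a domain name as length-prefixed labels ending with the null root label."""
    encoded = b""
    for label in domain_name.rstrip(".").split("."):
        raw = label.encode("ascii")
        if not 1 <= len(raw) <= 63:
            raise ValueError(f"label {label!r} must be 1 to 63 characters long")
        encoded += bytes([len(raw)]) + raw
    return encoded + b"\x00"


def build_query(identification: int, domain_name: str, recursion_desired: bool = True) -> bytes:
    """Build a standard query: 12-byte header followed by one question."""
    parameter = 0x0100 if recursion_desired else 0x0000   # bit 7 = Recursion Desired
    # IDENTIFICATION, PARAMETER, then the four NUMBER OF ... counts
    # (questions, answers, authority, additional).
    header = struct.pack("!6H", identification, parameter, 1, 0, 0, 0)
    question = encode_name(domain_name) + struct.pack("!2H", 1, 1)  # type, class
    return header + question


def decode_parameter(parameter: int) -> dict:
    """Split the 16-bit PARAMETER word into its bit fields (bit 0 = most significant)."""
    return {
        "qr": (parameter >> 15) & 0x1,
        "opcode": (parameter >> 11) & 0xF,
        "authoritative": (parameter >> 10) & 0x1,
        "truncated": (parameter >> 9) & 0x1,
        "recursion_desired": (parameter >> 8) & 0x1,
        "recursion_available": (parameter >> 7) & 0x1,
        "zero": (parameter >> 4) & 0x7,
        "return_code": parameter & 0xF,
    }
```

For example, decode_parameter(0x8183) describes a response to a standard query with recursion desired and available and return code 3 (name error).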