DNS: THE DOMAIN NAME SYSTEM

Introduction

An Internet Protocol (IP) address is a 32-bit integer. To send a message, the sender must include the destination address, but people prefer to assign machines pronounceable, easily remembered names (host names). The Domain Name System exists for this reason. These logical names also make users independent of the physical location of a host: a host may be moved to a different network while its users continue to use the same logical name.

The Domain Name System (DNS) is a distributed database used by TCP/IP applications to map between host names and IP addresses and to provide electronic mail routing information. Each site (a university department, a campus, a company, or a department within a company, for example) maintains its own database of information and runs a server program that other systems across the Internet can query. The DNS provides the protocol that allows clients and servers to communicate with each other.

A system accesses the DNS through a resolver. The resolver takes a host name and returns the corresponding IP address, or takes an IP address and looks up the corresponding host name (Fig. 1). As Fig. 1 shows, the resolver must return the IP address before the application can ask TCP to open a connection or send a datagram using UDP.
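The order of operations described above (resolve first, then connect) can be sketched in a few lines of Python. This is only an illustration: the function name `resolve_then_connect` and its injectable `lookup` and `connect` parameters are assumptions made for this example, while `socket.gethostbyname` and `socket.create_connection` are the standard library calls used by default.

```python
import socket

def resolve_then_connect(hostname, port, lookup=socket.gethostbyname,
                         connect=socket.create_connection):
    """Resolve a host name to an IP address, then open a TCP connection to it.

    `lookup` and `connect` are injectable (an assumption of this sketch) so
    that the ordering can be exercised without any network access.
    """
    ip_address = lookup(hostname)          # step 1: ask the resolver
    return connect((ip_address, port))     # step 2: hand the address to TCP
```

The reverse direction, from an IP address to a host name, is available in the same module as `socket.gethostbyaddr`.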

DNS Organization

The Domain Name System uses a hierarchical naming scheme known as domain names, which is similar to the Unix filesystem tree. The root of the DNS tree is a special node with a null label. The label of every other node may be up to 63 characters long. The domain name of any node in the tree is the list of labels, starting at that node and working up to the root, with a period ("dot") separating the labels. (Individual sections of a name might represent sites or groups, but the domain system simply calls each section a label.) The difference from the Unix filesystem is the direction of writing: a Unix pathname starts at the root and works down, whereas a domain name starts at the node and "goes up" to the root. Writing names in this order makes it possible to compress messages that contain multiple domain names.

Thus, the domain name "tau.ac.il" contains three labels: "tau", "ac", and "il". Any suffix of a domain name is also called a domain. In this example the lowest-level domain is "tau.ac.il" (the domain name for Tel Aviv University, an academic organization in Israel), the second-level domain is "ac.il" (the domain name for academic organizations in Israel), and the top-level domain is "il" (the domain name for Israel). Counting the root as the first level, the node "il" is a second-level node of the tree (Fig. 2).
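As a small illustration of labels and suffixes, the sketch below splits a domain name into its labels and lists every domain that contains it. The function names `split_labels` and `enclosing_domains` are made up for this example; the 63-character limit is the one stated above.

```python
MAX_LABEL_LENGTH = 63  # limit on a single label, as stated in the text

def split_labels(domain_name):
    """Return the labels of a domain name, most specific label first."""
    labels = domain_name.rstrip(".").split(".")
    for label in labels:
        if not 1 <= len(label) <= MAX_LABEL_LENGTH:
            raise ValueError(f"invalid label length: {label!r}")
    return labels

def enclosing_domains(domain_name):
    """List every suffix of the name, from the full name up to the top level."""
    labels = split_labels(domain_name)
    return [".".join(labels[i:]) for i in range(len(labels))]

print(split_labels("tau.ac.il"))        # ['tau', 'ac', 'il']
print(enclosing_domains("tau.ac.il"))   # ['tau.ac.il', 'ac.il', 'il']
```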

Every node in the tree must have a unique domain name, but the same label can be used at different points in the tree.

The top-level domains are divided into three areas:

1. "arpa" is a special domain used for address-to-name mapping.
2. The seven three-character domains, called the generic (or organizational) domains.
3. The two-character domains, which are based on the country codes. These are called the country (or geographical) domains.
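To make the address-to-name mapping in item 1 more concrete, the following sketch builds the domain name under which an IPv4 address is looked up. It assumes the conventional "in-addr.arpa" subtree, which the text above does not name explicitly; the function name `reverse_lookup_name` and the sample address are made up for this example.

```python
import ipaddress

def reverse_lookup_name(ipv4_text):
    """Build the in-addr.arpa domain name used to find a host name by address."""
    address = ipaddress.IPv4Address(ipv4_text)   # validates the dotted-quad text
    octets = str(address).split(".")
    # The octets are reversed so that the most specific label comes first,
    # matching the way ordinary domain names are written.
    return ".".join(reversed(octets)) + ".in-addr.arpa"

print(reverse_lookup_name("192.0.2.25"))   # 25.2.0.192.in-addr.arpa
```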

The seven generic domains are listed in Fig. 3:

 Domain Name    Meaning
    COM         Commercial organizations
    EDU         Educational institutions
    GOV         Government institutions
    MIL         Military groups
    NET         Major network support centers
    ORG         Organizations other than those above
    INT         International organizations

Fig. 3. The seven three-character generic domains

The Internet scheme can accommodate a wide variety of organizations and allows each group to choose between geographical and organizational naming hierarchies. Most sites follow the Internet scheme so that they can attach their TCP/IP installations to the connected Internet without changing names.

A zone is a subtree of the DNS that is administered separately. A common zone is a second-level domain, "ac.il" for example. Many second-level domains in turn divide their zone into smaller zones.

Whenever a new system is installed in a zone, the DNS administrator for the zone allocates a name and an IP address for the new system and enters these into the name server's database. A name server is said to have authority for one zone or for multiple zones. Often the server software runs on a dedicated machine, and that machine itself is called the name server.

The person responsible for a zone must provide a primary name server for that zone and one or more secondary name servers. The main difference between a primary and a secondary is that the primary loads all the information for the zone from disk files, while the secondaries obtain all the information from the primary. The process by which a secondary obtains the information from its primary is called a zone transfer.

When a new host is added to the zone, the administrator adds the appropriate information (name and IP address) to a disk file on the system running the primary. The primary name server is then notified to reread its configuration files. The secondaries query the primary on a regular basis (normally every 3 hours), and if the primary contains newer data, the secondary obtains the new data using a zone transfer.
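The polling behaviour can be sketched as follows. The text only says that the secondary checks for "newer data"; this sketch assumes that newness is tracked with a version counter (`serial`) that the primary increments on every change. The class names, the counter, and the sample host are all assumptions made for illustration.

```python
from dataclasses import dataclass, field

REFRESH_SECONDS = 3 * 60 * 60   # "normally every 3 hours"; a real secondary
                                # would call refresh() on this schedule

@dataclass
class PrimaryServer:
    serial: int = 1                                # assumed version counter
    records: dict = field(default_factory=dict)    # host name -> IP address

    def add_host(self, name, ip_address):
        """The administrator adds a host; the zone's version is bumped."""
        self.records[name] = ip_address
        self.serial += 1

@dataclass
class SecondaryServer:
    serial: int = 0
    records: dict = field(default_factory=dict)

    def refresh(self, primary):
        """Poll the primary; copy the whole zone only if it holds newer data."""
        if primary.serial <= self.serial:
            return False                           # already up to date
        self.records = dict(primary.records)       # the zone transfer
        self.serial = primary.serial
        return True
```

A secondary that has just refreshed gets `False` from its next `refresh` call until the administrator adds another host on the primary.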

If a name server does not contain the requested information, it must contact another name server. Not every server, however, knows how to contact every other server. Instead, every name server must know how to contact the root name servers. The root servers in turn know the name and location (i.e. the IP address) of each authoritative name server for all the second-level domains. Only a small number of root servers exist (today there are 13 named root servers), and every primary name server has to know the address of at least one of them.
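The idea that a query can always start at the root and be referred downward can be sketched with a toy server tree. Everything in the table below (server names, referral layout, the address) is hypothetical and exists only to make the walk visible; real servers exchange DNS messages rather than share a dictionary.

```python
# Hypothetical server tree: each server either answers from its own zone
# or refers the query to the server responsible for a longer suffix.
SERVERS = {
    "root":         {"referrals": {"il": "il-server"}, "records": {}},
    "il-server":    {"referrals": {"ac.il": "ac-il-server"}, "records": {}},
    "ac-il-server": {"referrals": {}, "records": {"tau.ac.il": "192.0.2.1"}},
}

def find_referral(server, name):
    """Return the server for the longest referral suffix that matches name."""
    best = None
    for suffix, target in server["referrals"].items():
        if name == suffix or name.endswith("." + suffix):
            if best is None or len(suffix) > len(best[0]):
                best = (suffix, target)
    return best[1] if best else None

def resolve_from_root(name, servers=SERVERS):
    """Follow referrals down from the root until a server has the answer."""
    current, path = "root", []
    while current is not None:
        path.append(current)
        server = servers[current]
        if name in server["records"]:
            return server["records"][name], path
        current = find_referral(server, name)
    return None, path        # no server along the path knew the name

print(resolve_from_root("tau.ac.il"))
# ('192.0.2.1', ['root', 'il-server', 'ac-il-server'])
```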

From the name tree in Fig. 2 we can derive the corresponding tree of servers, shown in Fig. 4.

In practice, an organization often collects information from all of its sub-zones into a single server. This gives the tree in Fig. 5, which is more realistic than the one in Fig. 4.

Note that the tree in Fig. 5 shows only how a given server can contact other servers; it does not indicate physical network connections. Servers may be located anywhere on the network. The tree of servers is therefore a logical connection between servers, which use the Internet for communication.

DNS Caching

A fundamental property of the DNS is caching. That is, when a name server receives information about a mapping, it caches that information. A later query for the same mapping can then use the cached result and need not trigger additional queries to other servers. The DNS uses caching to reduce the cost of lookups.

How does it work?

Every server keeps a cache of recently used names, along with a record of where the mapping information for each name was obtained. When a client asks the server to resolve a name, the server proceeds as follows:

1. It checks whether it has authority for the name. If so, it answers from its own database and does not need cached information.
2. If not, it checks its cache to see whether the name has been resolved recently. If so, it reports the cached information to the client.
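The order of checks above can be sketched as a single function. This is a simplified illustration: "having authority" is reduced to the name being present in the server's own `zone_records`, and the final fallback (contacting another server, as described in the previous section) is passed in as a callable. All names here are assumptions made for the example.

```python
def answer_query(name, zone_records, cache, ask_other_server):
    """Return (ip_address, source), following the order of checks in the text."""
    if name in zone_records:                       # 1. the server has authority
        return zone_records[name], "authoritative"
    if name in cache:                              # 2. resolved recently
        ip_address, learned_from = cache[name]
        return ip_address, f"cached (from {learned_from})"
    # Otherwise another server must be contacted; the cache remembers both
    # the mapping and where it was obtained.
    ip_address, learned_from = ask_other_server(name)
    cache[name] = (ip_address, learned_from)
    return ip_address, f"fetched from {learned_from}"
```

Calling the function twice for the same non-local name contacts the other server only once; the second answer comes from the cache.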

Consider a server that caches a piece of information once and never updates it. Because the information about a particular name can change, the server may end up with incorrect information in its cache.

The Time to Live (TTL) value is used to decide when cached information has aged and must be discarded. Whenever an authority responds to a request, it includes a TTL value in the response, which specifies how long it guarantees the binding to remain valid.
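A minimal sketch of TTL-based aging is shown below. The class name `TtlCache` and the injectable `clock` parameter are assumptions made for this example; the clock is injectable only so that expiry can be demonstrated without waiting.

```python
import time

class TtlCache:
    """Cache of name-to-address bindings that expire after their TTL."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock      # returns the current time in seconds
        self._entries = {}       # name -> (ip_address, expiry time)

    def store(self, name, ip_address, ttl_seconds):
        """Remember a binding for as long as the authority guaranteed it."""
        self._entries[name] = (ip_address, self._clock() + ttl_seconds)

    def lookup(self, name):
        """Return the cached address, or None if it is absent or has aged out."""
        entry = self._entries.get(name)
        if entry is None:
            return None
        ip_address, expires_at = entry
        if self._clock() >= expires_at:
            del self._entries[name]      # the binding is no longer guaranteed
            return None
        return ip_address
```

A binding stored with a TTL of 60 seconds is returned by `lookup` for the next 60 seconds and is discarded on the first lookup after that.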

DNS Message Format

When a user wants to send a message, the user invokes an application program and supplies the name of the machine with which the application must communicate. The application program must find the machine's IP address. It passes the domain name to a local resolver (L.R.) and requests an IP address. The local resolver checks its cache and:

- If the L.R. has an answer, it returns the answer.
- If the L.R. does not have one, it sends a message to the server. The server then returns a similar message that contains the answers to the questions for which the server has bindings. If the server cannot answer, it responds with information about other servers that the client can contact.

Fig. 6 shows the DNS message format. Explanation of Fig. 6:
- The IDENTIFICATION field is set by the client and returned by the server.
- The 16-bit PARAMETER field consists of the following bits, numbered from the most significant bit (bit 0):
  - Bit 0 (QR): 0 means the message is a query, 1 means it is a response.
  - Bits 1-4 (OPCODE):
    - 0 - standard query (the normal value).
    - 1 - inverse query.
    - 2 - server status request.
  - Bit 5 (Authoritative Answer): set when the name server is authoritative for the domain in the question section.
  - Bit 6 (Truncated): set if the message was truncated. With UDP this means that the total size of the reply exceeded 512 bytes and only the first 512 bytes of the reply were returned.
  - Bit 7 (Recursion Desired): can be set in a query and is then returned in the response.
  - Bit 8 (Recursion Available).
  - Bits 9-11: must be 0.
  - Bits 12-15 (Return Code): 0 - no error, 3 - name error.
- The fields labeled NUMBER OF ... each give a count of the entries in the corresponding section of the message.
- The QUESTION SECTION contains the queries for which answers are desired. The client fills in only the question section; the server returns the question along with the answers in its response. Each question consists of a Query Domain Name followed by Query Type and Query Class fields (as depicted in Fig. 7).
- The ANSWER, AUTHORITY, and ADDITIONAL INFORMATION sections each consist of a set of resource records that describe domain names and mappings. Each resource record describes one name (as depicted in Fig. 8).

The RESOURCE DOMAIN NAME field contains the domain name to which the resource record refers and can be of arbitrary length. The TYPE field specifies the type of the data in the record, and the CLASS field specifies its class. The TIME TO LIVE field contains an integer that specifies the number of seconds for which the information in this resource record can be cached; it is used by clients that have requested a name binding and may want to cache the results. The RESOURCE DATA LENGTH field specifies the count of octets in the RESOURCE DATA field.
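The header and question layout described above can be sketched with Python's `struct` module. The function names are made up for this example. The sketch assumes the usual wire encoding of a domain name (each label preceded by a one-byte length, ending with the root's null label) and the usual values 1 for an address-record query type and 1 for the Internet class, none of which are spelled out in the text above.

```python
import struct

def encode_name(domain_name):
    """Encode a domain name as length-prefixed labels ending with a zero byte."""
    encoded = b""
    for label in domain_name.rstrip(".").split("."):
        if not 1 <= len(label) <= 63:
            raise ValueError(f"invalid label: {label!r}")
        encoded += bytes([len(label)]) + label.encode("ascii")
    return encoded + b"\x00"             # the root's null label

def build_query(identification, domain_name, qtype=1, qclass=1,
                recursion_desired=True):
    """Build a standard query message with a single question."""
    parameter = 0x0100 if recursion_desired else 0    # only bit 7 (RD) is set
    # IDENTIFICATION, PARAMETER, then the four NUMBER OF ... counts.
    header = struct.pack("!HHHHHH", identification, parameter, 1, 0, 0, 0)
    question = encode_name(domain_name) + struct.pack("!HH", qtype, qclass)
    return header + question

def parse_parameter(parameter):
    """Split the 16-bit PARAMETER field into the sub-fields listed above."""
    return {
        "qr":     (parameter >> 15) & 0x1,   # bit 0
        "opcode": (parameter >> 11) & 0xF,   # bits 1-4
        "aa":     (parameter >> 10) & 0x1,   # bit 5
        "tc":     (parameter >> 9) & 0x1,    # bit 6
        "rd":     (parameter >> 8) & 0x1,    # bit 7
        "ra":     (parameter >> 7) & 0x1,    # bit 8
        "zero":   (parameter >> 4) & 0x7,    # bits 9-11
        "rcode":  parameter & 0xF,           # bits 12-15
    }
```

For example, `parse_parameter(0x8583)` describes an authoritative response with recursion desired and available and return code 3 (name error).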
