|
The Metrics
In order to measure distances we use a combined form of several
known metrics. These are divided to static and dynamic metrics. The
static metrics are helpful since their computation time is short, and
require no network connection or other kind of assistance. However,
they give limited information, and usually only determine when the
other node is very close. These methods are:
- IP Address Difference
- We take the length of the longest prefix shared between the two nodes
IP addresses. The longer the shared prefix, the closer the nodes.
Unfortunately, values under 8 or even 16
are usually not helpful, as the distribution of IP addresses is random
around the globe. However, when the values are above 16, it means
that the two nodes are in the same B class or C class sub-net. This
indicates that the nodes are pretty close, usually in the same ISP
or AS.
- DNS Name Difference
- he DNS names of the nodes are parsed to
tokens. These token lists are compared to get the largest common suffix.
for example, the node www.cs.huji.ac.il is parsed to {www, cs, huji,
ac, il}. It has a common suffix of 2 with www.cs.tau.ac.il (the
ac.il suffix) and a common suffix of 0 with www.cnn.com. The rationale
behind this method is that close nodes have similar DNS names.
Also, due to the country suffix used in all the countries (besides
the United States) it is easy to determine if the nodes are close
geographically.
Global domain names, such as .com, .net or .org
pose a problem in this case, since they are popular and wanted
around the world. Unlike country specific suffix, these suffixes give
no indication whether the nodes are close when the common suffix is 1
(for example www.washingtonpost.com and www.nytimes.com
has no relation, however www.w3.org and validator.w3.org
located very close (their IP addresses are 128.30.52.25 and
128.30.52.13 respectively).
This characteristic demands special treatment by the measuring code.
Our solution was to convert a distance of 1 between two names that
ends with .com, .net or .org to 0, as if there was
no relation at all.
The one exception here is the .edu suffix, unique to universities in the
United States, so that the distance between www.stanford.edu and
www.mit.edu is 1.
Using this metrics can be used to determine whether
two nodes are in the same organization (which is generally
very good, communication wise) or located in the same country which is
good from our locality aware perspective.
The dynamic metrics are divided to two types. The first
exploits network resources alone, and needs no cooperation from
the other node. This group includes the following metrics:
- Ping / Latency
- Ping is a small program used to check the
availability of another node in the network. It returns the time in
milliseconds of the round trip time (RTT) of an ICMP message sent to
the other node. A smaller value indicates better connection to the
other host.
We have found that ping has one major drawback when
experimenting in the Planet-Lab environment. Ping is used to perform
distributed denial of service (DDoS) attacks on Internet hosts, and
therefore it is blocked by some firewalls. The naive user might think
that the other host is unavailable, as the ping fails, but normal
TCP connections usually work fine. This has forced us to consider
alternative ways for latency measurements, as described below.
- Number of Hops
- Using the traceroute program, we are able
to determine the number of hops a TCP or UDP packet has to pass until
it reaches the other node. This can give us an indication where a
connection might be faster, since it travels through a smaller number
of routers.
- Number of Autonomous Systems
- The Internet is divided to
autonomous systems (AS), each is a collection of several IP networks
under control of a single entity, typically an Internet Service
Provider (ISP) or alike.
Connection inside an autonomous system are usually faster than
connections that span over several autonomous systems, so measuring
only number of hops can be sometimes misleading.
To overcome this, we also measure the number of autonomous
systems separating between the hosts. We take the output of the
number of hops measure, and associate each hop with its AS. Doing so
gives us a list of AS instead of IP addresses. We then check how
many times there are changes on the list and this gives us the number
of AS the packets go through.
The second type of dynamic metrics
requires some daemon on the other node, in order to perform the
measurements. This group includes the following metrics:
- TCP Ping
- As mentioned above, using the regular ping poses
a problem due to the tendency of system administrators to block ping
queries as described above. The blocking is usually done at the
receiving side (the computers behind the firewall does not return the
``pong''), but we have found several computers that where blocked from
sending the ping (or receiving the ``pong''). To overcome this obstacle,
we use a ''ping over TCP'' mechanism. Instead of measuring the RTT of
the ICMP packet, we measure the time it takes to send one byte and
receive one byte from the other node. Due to the characteristics of
TCP, this version of ping should be slower than the original version,
but measures have shown that over large distances the difference is
small. Also, it still gives the ability to compare different host by
their response times.
- Connection Creation Time
- This metrics measure how long it
takes to open a connection to the other host, send a byte and receive
a byte back. This helps us to determine how quickly a connection
between two hosts can be established.
- Upload/Download Bandwidth
- The topology of the Internet
does not match physical distance with good connection. The most
common way to determine if two hosts in the network are well connected
is to measure the bandwidth of the data traffic between them. We
measure the bandwidth from two different viewpoints: the upload
bandwidth and the download bandwidth.
In regular clients, the metrics measures the regular network
traffic (such as routing table update) without
interfering with it. In the network distance gauge tool, a 256KB
chunk of data was transferred to measure the bandwidth. This size was
selected since it is a common 'block' size in file sharing networks.
These metrics can be combined and manipulated in order to give a
general metrics distance. The idea is to take several metrics and
combine them to a single distance metric. In order to achieve that
goal we have used several known design patterns to help us in
the task:
- Decorator
- The first problem using the above metrics is
the heterogeneous units and scales. Comparing number of hops to
bandwidth is like comparing apples to oranges. Another problem is
that a distance metric obeys the ``lower value means closer'' rule.
However, some of the metrics (such as bandwidth and DNS name
difference) have the opposite characteristics: the lower the value,
the larger the distance. The different units pose a problem
of their own. How to compare the values of DNS name difference
(usually between 0 and 5) and ping (whose range is between 0 and more
than 1000 milliseconds)? The answer to that was to use decorators.
Each decorator is viewed as a metric to other parts while wrapping
another metric and manipulates its returned value. For example a
decorator on the number of AS metric can expand the measured value
(usually between 0 and 12) to be in the range of 0 and 100.
As for bandwidth, a Value of 1000Kbps which is considered very
close and will be transformed to a value of 0.1, while a value of 1Kbps
is be considered far and will be transformed to a value of 100.
The function between them can be either linear, or take the form
of distancex=100/x.
- Composite
- Our goal is to give the metrics user a single
metric that gives a single number representing the distance. This
metric might be composed out of several other metrics with different
weights, e.g. use 50% ping, 30% number of hops and 20% connection
creation. The user of the metrics should be able to switch them and
compose them freely. To achieve that, we propose a composite
metrics, implemented in the class MultiMetrics. MultiMetrics contains
a weighted
vector of other metrics, and implements the Metrics interface itself.
A simple use of MultiMetrics will give us a weighted average of other
metrics, but a more complex metrics hierarchy can be easily built.
|