Hebrew University
Computer Science
DANSS Lab

Home

Network Gauge

Download

Usage

Metrics

Screenshots

 

 

 

 

 

 

The Metrics

In order to measure distances we use a combined form of several known metrics. These are divided to static and dynamic metrics. The static metrics are helpful since their computation time is short, and require no network connection or other kind of assistance. However, they give limited information, and usually only determine when the other node is very close. These methods are:

IP Address Difference
We take the length of the longest prefix shared between the two nodes IP addresses. The longer the shared prefix, the closer the nodes. Unfortunately, values under 8 or even 16 are usually not helpful, as the distribution of IP addresses is random around the globe. However, when the values are above 16, it means that the two nodes are in the same B class or C class sub-net. This indicates that the nodes are pretty close, usually in the same ISP or AS.
DNS Name Difference
he DNS names of the nodes are parsed to tokens. These token lists are compared to get the largest common suffix. for example, the node www.cs.huji.ac.il is parsed to {www, cs, huji, ac, il}. It has a common suffix of 2 with www.cs.tau.ac.il (the ac.il suffix) and a common suffix of 0 with www.cnn.com. The rationale behind this method is that close nodes have similar DNS names. Also, due to the country suffix used in all the countries (besides the United States) it is easy to determine if the nodes are close geographically.
Global domain names, such as .com, .net or .org pose a problem in this case, since they are popular and wanted around the world. Unlike country specific suffix, these suffixes give no indication whether the nodes are close when the common suffix is 1 (for example www.washingtonpost.com and www.nytimes.com has no relation, however www.w3.org and validator.w3.org located very close (their IP addresses are 128.30.52.25 and 128.30.52.13 respectively). This characteristic demands special treatment by the measuring code. Our solution was to convert a distance of 1 between two names that ends with .com, .net or .org to 0, as if there was no relation at all. The one exception here is the .edu suffix, unique to universities in the United States, so that the distance between www.stanford.edu and www.mit.edu is 1.

Using this metrics can be used to determine whether two nodes are in the same organization (which is generally very good, communication wise) or located in the same country which is good from our locality aware perspective.

The dynamic metrics are divided to two types. The first exploits network resources alone, and needs no cooperation from the other node. This group includes the following metrics:

Ping / Latency
Ping is a small program used to check the availability of another node in the network. It returns the time in milliseconds of the round trip time (RTT) of an ICMP message sent to the other node. A smaller value indicates better connection to the other host.
We have found that ping has one major drawback when experimenting in the Planet-Lab environment. Ping is used to perform distributed denial of service (DDoS) attacks on Internet hosts, and therefore it is blocked by some firewalls. The naive user might think that the other host is unavailable, as the ping fails, but normal TCP connections usually work fine. This has forced us to consider alternative ways for latency measurements, as described below.
Number of Hops
Using the traceroute program, we are able to determine the number of hops a TCP or UDP packet has to pass until it reaches the other node. This can give us an indication where a connection might be faster, since it travels through a smaller number of routers.
Number of Autonomous Systems
The Internet is divided to autonomous systems (AS), each is a collection of several IP networks under control of a single entity, typically an Internet Service Provider (ISP) or alike. Connection inside an autonomous system are usually faster than connections that span over several autonomous systems, so measuring only number of hops can be sometimes misleading. To overcome this, we also measure the number of autonomous systems separating between the hosts. We take the output of the number of hops measure, and associate each hop with its AS. Doing so gives us a list of AS instead of IP addresses. We then check how many times there are changes on the list and this gives us the number of AS the packets go through.

The second type of dynamic metrics requires some daemon on the other node, in order to perform the measurements. This group includes the following metrics:

TCP Ping
As mentioned above, using the regular ping poses a problem due to the tendency of system administrators to block ping queries as described above. The blocking is usually done at the receiving side (the computers behind the firewall does not return the ``pong''), but we have found several computers that where blocked from sending the ping (or receiving the ``pong''). To overcome this obstacle, we use a ''ping over TCP'' mechanism. Instead of measuring the RTT of the ICMP packet, we measure the time it takes to send one byte and receive one byte from the other node. Due to the characteristics of TCP, this version of ping should be slower than the original version, but measures have shown that over large distances the difference is small. Also, it still gives the ability to compare different host by their response times.
Connection Creation Time
This metrics measure how long it takes to open a connection to the other host, send a byte and receive a byte back. This helps us to determine how quickly a connection between two hosts can be established.
Upload/Download Bandwidth
The topology of the Internet does not match physical distance with good connection. The most common way to determine if two hosts in the network are well connected is to measure the bandwidth of the data traffic between them. We measure the bandwidth from two different viewpoints: the upload bandwidth and the download bandwidth. In regular clients, the metrics measures the regular network traffic (such as routing table update) without interfering with it. In the network distance gauge tool, a 256KB chunk of data was transferred to measure the bandwidth. This size was selected since it is a common 'block' size in file sharing networks.

These metrics can be combined and manipulated in order to give a general metrics distance. The idea is to take several metrics and combine them to a single distance metric. In order to achieve that goal we have used several known design patterns to help us in the task:

Decorator
The first problem using the above metrics is the heterogeneous units and scales. Comparing number of hops to bandwidth is like comparing apples to oranges. Another problem is that a distance metric obeys the ``lower value means closer'' rule. However, some of the metrics (such as bandwidth and DNS name difference) have the opposite characteristics: the lower the value, the larger the distance. The different units pose a problem of their own. How to compare the values of DNS name difference (usually between 0 and 5) and ping (whose range is between 0 and more than 1000 milliseconds)? The answer to that was to use decorators. Each decorator is viewed as a metric to other parts while wrapping another metric and manipulates its returned value. For example a decorator on the number of AS metric can expand the measured value (usually between 0 and 12) to be in the range of 0 and 100.
As for bandwidth, a Value of 1000Kbps which is considered very close and will be transformed to a value of 0.1, while a value of 1Kbps is be considered far and will be transformed to a value of 100. The function between them can be either linear, or take the form of distancex=100/x.
Composite
Our goal is to give the metrics user a single metric that gives a single number representing the distance. This metric might be composed out of several other metrics with different weights, e.g. use 50% ping, 30% number of hops and 20% connection creation. The user of the metrics should be able to switch them and compose them freely. To achieve that, we propose a composite metrics, implemented in the class MultiMetrics. MultiMetrics contains a weighted vector of other metrics, and implements the Metrics interface itself. A simple use of MultiMetrics will give us a weighted average of other metrics, but a more complex metrics hierarchy can be easily built.

 

Original site design by Pegasus Web Design Resources, modified by David Rabinowitz