Protocols and Algorithms


Minor Details
1) No central authority required as in the case of all other p2p tools. Essentially all nodes in the system are equivalent.
2) A node joining the system just needs information about one single existing node of the system.
3) Network can appear different to different nodes.
4) Each node maintains a vicinity list (closest square root n nodes ) and a color list containing all nodes of color same as itself.(Color essentially is a hash value based on ip and port number)
5) Distance here means a weighted sum based on ping, traceroute, DNS, IP difference and bandwidth.
6) Each node maintains an estimate of the network (node maintains the value of log(n))

Joining protocol
Node (A) joins the network.

(A) contacts the known node (B) and starts gathering information about the network. First step is the calculation of rough or initial vicinity.The initial estimate of vicinity is a good approximation practically, but we need to have the exact vicinity and this is done by maintaining soft states and over period of time a node gossips with other nodes in the network and improves.

    Initial approximation of vicinity is done considering the fact that two very close nodes in the system will have sizeable part of their vicinities overlapping.

Initial Approximation of Vicinity:-

Locate the closest node(D) of some color other than color of 'A'
Find the node(C) closest to 'A' from the vicinity list of 'D' and copy
vicinity list of 'C' into 'A'.
(Different variations of the strategy are possible.)


Improvement in Vicinity :-

Mechanism - I
Choose a random node in vicinity and merge its vicinity with your own vicnity.


Mechanism - II
Choose a Node(X) of random color(c) from vicinity. Calculate the distance of the closest node of same color from the vicinity table and the distance of the farthest node in the viccinity (let this be 'a' and 'b' units respectively)
Get all nodes of color (c) from (X) which are at a distance less then (a+b) units from (X).

Mechanism - III
Contact a random node in your color list and ask if there is any new node in the network of your color. Update the color list.

If we assume the number of messages to reflect the complexity of the algorithm then it can be proven that it is sub-linear in nature. All three mechanisms need to run simultaneously to gurantee accurate vicinity.


Estimate of network size

Method - I
Exchange size estimate (log n) with some random node. (Ways to choose random node)

Method - II
A node's estimate is entirely based on its view of the network. Node changes the size estimate based on the number of nodes in its color list. We increase the size estimate if the number of nodes in color list is greater then sqrt(n). Node decreases its estimate by one if the number goes down below sqrt(n)/sqrt(2).


Contact Maintenance
    The number of contacts maintained is dependent on a node's estimate of the network (log n). (We maintain (log n)*sqrt(n) contacts in vicinity list. This essentially means we maintain roughly (log n) links of each color.). Hence, the maximum number of contacts is fixed for a particular value of (log n), yet the gossip mechanism supplies potential contacts continuously.


Multi-hop Query routing
    Theoretically a node lookup should result in at max 2 hops or a route with a maximum stretch of 3 units. We try to measure this practically and find that there can be discrepancies with the theoretical result. A node does not always have information about all the existing colors of the system. In such cases the query can be processed in several ways :
a) request and execute the same query on (closest/farthest/random) node(s) in
color list.
b) contacting (closest/farthest/random) node(s) in vicinity list and executing
the same query.

Points to ponder :-
Experimental results
Fault tolerance
Load balancing
Files (The basic idea of lookup of node is valid for file also. We can treat the file as a hash value for practical results as of now. We need not replicate file information as in the case of other DHTs)