Protocols and
Algorithms
Minor Details
1) No central authority required as in the case of all other p2p tools.
Essentially all nodes in the system are equivalent.
2) A node joining the system just needs information about one single
existing node of the system.
3) Network can appear different to different nodes.
4) Each node maintains a vicinity list (closest square root n nodes )
and a color list containing all nodes of color same as itself.(Color essentially is a hash value based on ip and port number)
5) Distance here means a weighted sum based on ping, traceroute, DNS,
IP difference and bandwidth.
6) Each node maintains an estimate of the network (node maintains the
value of log(n))
Node (A) joins the network.
(A) contacts the known node (B) and starts gathering information about
the network. First step is the calculation of rough or initial
vicinity.The initial estimate of vicinity is a good approximation
practically, but we need to have the exact vicinity and this is done by
maintaining soft states and over period of time a node gossips with
other nodes in the network and improves.
Initial approximation of vicinity is done
considering the fact that two very close nodes in the system will have
sizeable part of their vicinities overlapping.
Initial
Approximation of Vicinity:-
Locate the closest node(D) of some color other than color of 'A'
Find the node(C) closest to 'A' from the vicinity list of 'D' and copy
vicinity list of 'C' into 'A'.
(Different variations of the strategy are possible.)
Improvement
in Vicinity :-
Mechanism - I
Choose a random node in vicinity and merge its vicinity with your own
vicnity.
Mechanism - II
Choose a Node(X) of random color(c) from vicinity. Calculate the
distance of the closest node of same color from the vicinity table and
the distance of the farthest node in the viccinity (let this be 'a' and
'b' units respectively)
Get all nodes of color (c) from (X) which are at a distance less then
(a+b) units from (X).
Mechanism - III
Contact a random node in your color list and ask if there is any new
node in the network of your color. Update the color list.
If we assume the number of messages to reflect the complexity of the
algorithm then it can be proven that it is sub-linear in nature. All
three mechanisms need to run simultaneously to gurantee accurate
vicinity.
Estimate
of network size
Method - I
Exchange size estimate (log n) with some random node. (Ways to choose
random node)
Method - II
A node's estimate is entirely based on its view of the network. Node
changes the size estimate based on the number of nodes in its color
list. We increase the size estimate if the number of nodes in color
list is greater then sqrt(n). Node decreases its estimate by one if the
number goes down below sqrt(n)/sqrt(2).
Contact
Maintenance
The number of contacts maintained is dependent on a
node's estimate of the network (log n). (We maintain (log n)*sqrt(n)
contacts in vicinity list. This essentially means we maintain roughly
(log n) links of each color.). Hence, the maximum number of contacts is
fixed for a particular value of (log n), yet the gossip mechanism
supplies potential contacts continuously.
Multi-hop
Query routing
Theoretically a node lookup should result in at max
2 hops or a route with a maximum stretch of 3 units. We try to measure
this practically and find that there can be discrepancies with the
theoretical result. A node does not always have information about all
the existing colors of the system. In such cases the query can be
processed in several ways :
a) request and execute the same query on (closest/farthest/random)
node(s) in
color list.
b) contacting (closest/farthest/random) node(s) in vicinity list and
executing
the same query.
Points to
ponder :-
Experimental results
Fault tolerance
Load balancing
Files (The basic idea of lookup of node is valid for file also. We can
treat the file as a hash value for practical results as of now. We need
not replicate file information as in the case of other DHTs)