Large Scale Content Distribution Project

The following links illustrate the download process of a 130MB file that is distributed to many (50-250) clients concurrently. One node acts as the source, and every other node that wishes to download the file contacts the source node. The source node informs each client node of all the other nodes participating in the download. The file is divided into pieces called chunks, and the source node decides which chunk to give to each requesting node. Each chunk is 250KB, which gives 518 chunks.
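The chunk arithmetic can be sketched as follows. This is only an illustration: the exact file size is not stated above, and the sketch assumes decimal units (1KB = 1000 bytes), under which a file of about 129.5MB yields the 518 chunks mentioned. The names `chunk_count` and `chunk_bounds` are hypothetical and not taken from the project code.

```python
CHUNK_SIZE = 250_000  # assumption: 250KB in decimal units (1KB = 1000 bytes)


def chunk_count(file_size, chunk_size=CHUNK_SIZE):
    """Number of chunks needed to cover file_size bytes (last chunk may be short)."""
    return -(-file_size // chunk_size)  # integer ceiling division


def chunk_bounds(index, file_size, chunk_size=CHUNK_SIZE):
    """Byte range [start, end) covered by the chunk with the given index."""
    start = index * chunk_size
    end = min(start + chunk_size, file_size)
    return start, end


if __name__ == "__main__":
    # A file of 129,500,000 bytes (roughly 130MB) splits into exactly 518 chunks.
    print(chunk_count(129_500_000))
    # The final chunk of a slightly smaller file is shorter than 250KB.
    print(chunk_bounds(517, 129_400_000))
```

Under these assumptions, any file size from 129,250,001 to 129,500,000 bytes produces 518 chunks, which is consistent with a file described as roughly 130MB.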

The node selection strategy and the chunk selection strategy determine which node a client connects to next and which chunk it exchanges with that node. Each node keeps a record of the download start and finish time of every chunk. Based on these records, we created a 'movie' of the download process. The movie is composed of one picture per second in the LAN and one picture per 10 seconds in the WAN.

In each picture, the y-axis represents the machines and the x-axis represents the chunks. Red (dark grey on B/W prints) marks a missing chunk, and green (light grey) marks a chunk that has already been obtained. At the beginning, every machine is missing every chunk, so the initial picture is entirely red. As the download progresses, more and more chunks turn green.

The order in which the chunks are exchanged is determined by the selection strategies above, and each chunk selection strategy has a distinctive fingerprint. In the Round Robin chunk selection strategy, the source sends chunks to clients in order, from first to last. The movie for this strategy looks like a wave that expands from the first chunk to the last. In the Random strategy, the source sends chunks to clients at random. In the movie for this strategy, vertical green lines gradually appear at random positions; these represent chunks that have been obtained by all nodes.

The ideal situation is for the nodes to hold sets of chunks that differ as much as possible, so that when they connect to each other they can exchange as many chunks as possible. In the opposite case, all nodes hold the same chunks, so they cannot make progress in the download without contacting the source again.
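The two chunk orders and a single movie picture can be sketched roughly as below. This is a minimal illustration under stated assumptions, not the project's implementation: the function names, the per-node record format (a dict mapping chunk index to finish time), and the use of the letters `G` and `R` in place of green and red pixels are all hypothetical.

```python
import random


def round_robin_order(num_chunks):
    """Round Robin: chunks are handed out in order, from first to last."""
    return list(range(num_chunks))


def random_order(num_chunks, seed=None):
    """Random: chunks are handed out in a shuffled order."""
    rng = random.Random(seed)
    order = list(range(num_chunks))
    rng.shuffle(order)
    return order


def frame_at(finish_times, num_chunks, t):
    """One picture of the movie at time t.

    finish_times holds one dict per node (assumed format: chunk index ->
    finish time). Each returned row is one machine; each column is one chunk,
    'G' if the chunk was obtained by time t and 'R' if it is still missing.
    """
    return [
        "".join(
            "G" if chunk in node and node[chunk] <= t else "R"
            for chunk in range(num_chunks)
        )
        for node in finish_times
    ]


def exchangeable(chunks_a, chunks_b):
    """Chunks two nodes could trade: those held by exactly one of them."""
    return len(set(chunks_a) ^ set(chunks_b))


if __name__ == "__main__":
    records = [{0: 1.0, 1: 5.0}, {2: 1.5}]
    for row in frame_at(records, num_chunks=4, t=2.0):
        print(row)
```

With these definitions, two nodes holding identical chunk sets give `exchangeable(...) == 0`, which corresponds to the case where neither can progress without returning to the source.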

BitTorrent Protocol

LAN (September 3)

Julia Protocol

WAN (August 25th)

WAN (August 23rd)

LAN (August 23rd)

WAN (August 1st)

WAN (July 31st)

LAN (July 22nd)
