Go to Google Home
Go
A data-code-compute resource for research and education in information visualization
InfoVis Home Learning Modules Software Databases Compute Resources References

Learning Modules > Search Performance of P2P Networks

Description | Usage Hints | Learning Task | Discussion | References | Acknowledgments


line
Description

Peer-to-Peer (P2P) networks have gained much popularity with the success of MP3 file-sharing systems such as Gnutella. P2P networks differ from traditional networks in that each node has 'equivalent capabilities and responsibilities'. This is advantageous for systems in which nodes (dis)connect or fail at random as the entire system will still remain functional (see the learning module on Error and Attack Tolerance). However, it poses challenges for the design of efficient 'decentralized' search algorithms.

This learning module introduces basic network modeling algorithms for structured and unstructured P2P networks. Next you will examine different search algorithms that use only local knowledge about immediate neighbors to find a certain node in the network. You will learn that there is a strong relationship between network topology, search algorithms, and search efficiency.


line
Usage Hints

The InfoVis code repository provides different algorithms to model and analyze different structured and unstructured P2P networks. Check out the below links to learn more about them.

Structured Networks

Unstructured Networks Search Algorithm

line
Learning Task

Model a P2P Network
P2P systems can be classified into two categories by the manner in which decentralization is realized: structured systems and unstructured systems.

In structured P2P systems, placement of system resources at nodes is strictly controlled in a centralized fashion. Centralized control limits network robustness and node autonomy. Structured P2P models are advantageous for building systems where controlled resource placement is a high priority, such as distributed file storage. It is not recommended for systems with highly dynamic membership. The main advantage of structured systems is that the added constraints result in sub-linear search mechanisms.

Unstructured systems are characterized by a complete lack of constraints on resource distribution and network growth policies. These systems focus on growing a network with the desirable low diameter of small world systems using only local information. They strive for complete decentralization of network growth and are topologically more robust.

Build a CAN (structured network) and PRU (unstructured network) model of the same size, say 32 nodes.

CAN modeling algorithm takes the network size as an input parameter. So input '32' as the number of nodes in the network. PRU modeling algorithm reads the network size, inCache nodes, minimum and maximum degree of nodes in the network as input parameters. The inCache nodes refer to the number of nodes that are to be kept in the cache and thus are the highly connected nodes. The minimum and maximum degree, as the names suggest, indicate the minimum and maximum number of edges allowed for a node in the network. Recommended InCache Nodes = M/4, Minimum Degree = log M, Maximum Degree = 3log M + 3 where M is the network size.

Visualize the networks you generated using Visualization > Kamada-Kawai Layout.

Analyze a P2P Network

One major feature of a P2P network is the efficiency in which it can be searched. Here search is defined as finding a path from a start node to a destination node. The cost of a search is the number of edges traversed in locating the destination node (i.e. the number of "messages" sent in the network during the search process). To analyze search efficiency, one simply runs a large number of searches over the network and averages the results.

All structured P2P networks have their own search strategies that exploit the network topology. For search on unstructured networks, generic search algorithms are used typically.

For the CAN model, perform a search on the network using CAN Search algorithm and p-rand Breadth-First Search algorithm. For the PRU model, perform search using p-rand Breadth-First Search algorithm. Compare the performance of the search algorithms on these two topologies.


line
Discussion

In this learning module you used network modeling algorithms and network search algorithms to compare search performance on different network topologies. Answer the subsequent questions for yourself:

  • How fast is the search on structured networks as compared to that on unstructured networks?
  • What search algorithm gives the best performance across all the network topologies, whether structured or unstructured?
  • On the structured networks, how fast is the search using native search mechanism compared to the generic ones?

line
References

See references for the CAN, Chord, PRU, Hypergrid, k-Random Walk and p-rand BFS.


line
Acknowledgments

This learning module was compiled by Hardik Sheth and Katy Börner.

line
Information Visualization Cyberinfrastructure @ SLIS, Indiana University
Last Modified Aug 8, 2004