A data-code-compute resource for research and education in information visualization

Pajek Tutorial

Download Information | File Data Format | Pajek Execution | Set Threshold Value | Zoom-in Feature | Layout of Extremely Large Networks

Download Information

Pajek is a program for analyzing and drawing large networks, and arguably one of the best network drawing programs available. It is freeware and can be downloaded from http://vlado.fmf.uni-lj.si/pub/networks/pajek/.

On the web page, click on the first link available on the first line saying "Pajek 0.91". This provides an option to download "pajek91.exe" to your local hard drive. Click on the executable file and follow the installation instructions. After the successful installation of Pajek, a shortcut to run Pajek and a folder named "Pajek" containing all the relevant libraries for the program are available.

The sections below discuss Pajek's input file format, parsers that help you generate that format, and how to use Pajek for simple data layout.

File Data Format

The file format accepted by Pajek provides information on vertices, arcs (directed edges), and undirected edges. A short example showing the file format is given below:
*Vertices 3
1 "Doc1" 0.0 0.0 0.0 ic Green bc Brown
2 "Doc2" 0.0 0.0 0.0 ic Green bc Brown
3 "Doc3" 0.0 0.0 0.0 ic Green bc Brown
*Arcs
1 2 3 c Green
2 3 5 c Black
*Edges
1 3 4 c Green
In the example there are 3 vertices Doc1, Doc2 and Doc3 denoted by numbers 1, 2 and 3. The (fill) color of these nodes is Green and the border color is Brown. The initial layout location of the nodes is (0,0,0). Note that the (x,y,z) values can be changed interactively after drawing.

There are two arcs (directed edges). The first goes from node 1 (Doc1) to node 2 (Doc2) with a weight of 3 and in color Green. The second goes from node 2 (Doc2) to node 3 (Doc3) with a weight of 5 and in color Black.

There is one undirected edge. It connects node 1 (Doc1) and node 3 (Doc3), has a weight of 4, and is colored Green.
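To make the structure of the format concrete, here is a minimal Python sketch that writes the three-document example as Pajek input text. The function names (vertex_line, format_pajek_network) are hypothetical and chosen only for illustration; the layout of each line follows the example shown in this section.

```python
def vertex_line(number, label, x=0.0, y=0.0, z=0.0, ic="Green", bc="Brown"):
    """Format one vertex: number, quoted label, (x, y, z), fill and border color."""
    return f'{number} "{label}" {x} {y} {z} ic {ic} bc {bc}'


def format_pajek_network(labels, arcs=(), edges=()):
    """Return Pajek input text.

    labels: vertex labels; vertices are numbered 1..n in the given order.
    arcs, edges: iterables of (source, target, weight, color) tuples.
    """
    lines = [f"*Vertices {len(labels)}"]
    for number, label in enumerate(labels, start=1):
        lines.append(vertex_line(number, label))
    lines.append("*Arcs")  # directed lines
    for source, target, weight, color in arcs:
        lines.append(f"{source} {target} {weight} c {color}")
    lines.append("*Edges")  # undirected lines
    for source, target, weight, color in edges:
        lines.append(f"{source} {target} {weight} c {color}")
    return "\n".join(lines) + "\n"


network_text = format_pajek_network(
    ["Doc1", "Doc2", "Doc3"],
    arcs=[(1, 2, 3, "Green"), (2, 3, 5, "Black")],
    edges=[(1, 3, 4, "Green")],
)
print(network_text)
```

Saving network_text to a file gives an input file that can be loaded as a 'Network' in Pajek.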

Imagine you want to lay out a set of nodes according to a given similarity matrix. Given the similarity matrix, e.g., the sample file generated using Latent Semantic Analysis, you can use the Perl parser pajekConv.pl to generate the Pajek input file. To execute the Perl script on 'ella' or 'iuni', simply type at the command prompt:

perl pajekConv.pl inputFileName outputFileName 

Make sure to replace inputFileName with the name of the similarity matrix file. The generated Pajek input file will be named outputFileName.
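The kind of conversion such a parser performs can be sketched in Python as follows. This is not the code of pajekConv.pl, and the input format is an assumption made for illustration: a square, symmetric matrix with whitespace-separated values, one row per line. The function names and the default labels Doc1, Doc2, ... are hypothetical as well.

```python
def parse_matrix(text):
    """Parse whitespace-separated rows of numbers into a square matrix."""
    rows = [[float(v) for v in line.split()]
            for line in text.splitlines() if line.strip()]
    if any(len(row) != len(rows) for row in rows):
        raise ValueError("similarity matrix must be square")
    return rows


def matrix_to_pajek(matrix, labels=None):
    """Turn a symmetric similarity matrix into Pajek input text.

    Each nonzero value above the diagonal becomes one undirected edge
    whose weight is the similarity value.
    """
    n = len(matrix)
    if labels is None:
        labels = [f"Doc{i}" for i in range(1, n + 1)]  # assumed default labels
    lines = [f"*Vertices {n}"]
    lines += [f'{i} "{label}"' for i, label in enumerate(labels, start=1)]
    lines.append("*Edges")
    for i in range(n):
        for j in range(i + 1, n):
            if matrix[i][j] != 0:
                lines.append(f"{i + 1} {j + 1} {matrix[i][j]}")
    return "\n".join(lines) + "\n"


sample = """\
1.0 0.8 0.0
0.8 1.0 0.3
0.0 0.3 1.0
"""
pajek_text = matrix_to_pajek(parse_matrix(sample))
print(pajek_text)
```

Only the upper triangle is read because a symmetric matrix lists every pair twice; writing both would duplicate each edge.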

Pajek Execution

Start the Pajek Program by clicking on the "Pajek" shortcut icon. The interface shown in Figure 1 will pop up.

Figure 1: Initial interface for Pajek to perform file reading operation

Specify the Pajek input 'Network' file by clicking on the yellow folder icon under 'Network' (see area highlighted in red on the left in Figure 1). Reading of the Network will be confirmed as shown in Figure 2.

Figure 2: Initial file read feedback interface

To display the network, go to the Draw (Ctrl + G) menu option (highlighted on the top right in Figure 1). The resulting initial layout is shown in Figure 3.

Figure 3: Initial layout of nodes

To lay out the nodes according to their similarity, as given in the similarity matrix discussed above, you can apply the different layout algorithms available via the "Layout" menu option. A feature to adjust the starting point of the algorithm is also provided (see Figure 4).

Figure 4: Algorithms and starting positions available for the data

In addition, you can choose to show/hide the node information, edges etc. using the "Options" menu shown in Figure 5.

Figure 5: Options menu

Explore the options for lines, vertices, color etc.

Set Threshold Value

In situations where a dense mesh of lines connects the nodes, the drawing can be made more readable by running the pajek_setThreshold.pl script on the similarity matrix generated previously. The script sets every similarity value up to and including the threshold value to zero, which removes the edge between the corresponding two nodes. After this step, run pajekConv.pl again on the thresholded matrix to regenerate the Pajek input file.
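The thresholding step can be sketched in Python as below. This is an illustration of the idea, not the code of pajek_setThreshold.pl; the function name apply_threshold and the in-memory list-of-lists matrix are assumptions.

```python
def apply_threshold(matrix, threshold):
    """Return a copy of the matrix with values <= threshold set to zero."""
    return [[0.0 if value <= threshold else value for value in row]
            for row in matrix]


similarities = [
    [1.0, 0.8, 0.2],
    [0.8, 1.0, 0.3],
    [0.2, 0.3, 1.0],
]
thresholded = apply_threshold(similarities, 0.3)
for row in thresholded:
    print(row)
```

With a threshold of 0.3, only the pair with similarity 0.8 keeps a nonzero value, so only that edge would remain after conversion.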

Useful Tip:
In the case of a very large data file, processing time and layout readability can be improved by hiding the vertex labels and the edges among vertices.

If you want to make the graph a bit more colorful, you can specify the vertex colors in the input file. The term "ic" denotes the internal (fill) color and "bc" the border color. All color options available in Pajek are shown in Color.pdf. The input file would then have the following format in the vertices section:

1 "MOLECULAR SEQUENCE DATA" 0.1 0.1 0.1 ic Orange bc Mahogany

In addition, if you want to increase the size of the vertices, you also have to define the values in the vertices section using two variables: "x_fact" and "y_fact". The example below illustrates the declaration method:

1 "MOLECULAR SEQUENCE DATA" 0.1 0.1 0.1 x_fact 7 y_fact 7 ic Orange bc Mahogany

Note: These changes may not always be visible in the Pajek GUI. To activate different sizes in the Draw window, you must select Options/Mark vertices using -> Real sizes On from the GUI menu.
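When many vertices need individual sizes and colors, the vertex lines can be generated rather than typed by hand. The Python sketch below builds a line in the same layout as the example above; the function name styled_vertex_line and its default values are hypothetical.

```python
def styled_vertex_line(number, label, size=1, ic="Orange", bc="Mahogany",
                       x=0.1, y=0.1, z=0.1):
    """Format a vertex line with size factors and fill/border colors.

    The same size is used for x_fact and y_fact so the vertex
    is scaled equally in both directions.
    """
    return (f'{number} "{label}" {x} {y} {z} '
            f"x_fact {size} y_fact {size} ic {ic} bc {bc}")


line = styled_vertex_line(1, "MOLECULAR SEQUENCE DATA", size=7)
print(line)
```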

The snapshot shows the resulting visualization with variable node sizes, different edge widths, and a combination of colors. The visualization can be generated from the following Pajek input file.

Zoom-in Feature

The zoom-in feature is available through the Pajek GUI. Press the right mouse button and drag the pointer over the area of interest to zoom in on it. To zoom out, press the "Redraw" button on the menu bar.

Additional information on the variables that can be used to change the visualization can be found in the following documentation.

If you wish to explore some of the additional features that Pajek has to offer, you might consider going through the extensive manual.

Layout of Extremely Large Networks

See http://vlado.fmf.uni-lj.si/pub/networks/pajek/howto/extreme.htm



This documentation was compiled by Ketan Mane and Sidharth Thakur.

Information Visualization Cyberinfrastructure @ SLIS, Indiana University
Last Modified Sept 7, 2004