Identifying research student collaboration

This post summarises the method and results from a poster and paper on my research into how research students collaborate.

Paper: Rolf H. (2019). Identifying the Collaboration Styles of Research Students. Proceedings of the Association for Information Science and Technology. In press. Download the draft paper (PDF) or the poster (PDF, 3MB).

The poster presents a co-author network created from a sample of 1216 publications authored by research students at the University of Tasmania between 2007 and 2015. The publication metadata was retrieved from Elsevier’s Scopus abstract and citation database. Visualisation and statistical analysis were conducted using the Gephi software package (version 0.9.2) and the R software environment for statistical computing and graphics (version 3.6.0).

The co-author network

The co-author network contains 3024 co-authors (n=779 research students) with 26414 links between them. The network’s giant component (the largest connected group of co-authors) contains 83% of all co-authors (Figure 1).
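
As a rough illustration of how such a network and its giant component can be derived, the sketch below links every pair of authors who share a publication and then measures the share of co-authors in the largest connected group. It is a minimal Python sketch, not the Gephi/R workflow used for the poster; the function names and the toy author lists are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def build_coauthor_network(publications):
    """Return an adjacency map linking every pair of authors who share a paper."""
    adjacency = defaultdict(set)
    for authors in publications:
        unique_authors = sorted(set(authors))
        for author in unique_authors:
            adjacency[author]  # registers sole authors as isolated nodes
        for a, b in combinations(unique_authors, 2):
            adjacency[a].add(b)
            adjacency[b].add(a)
    return dict(adjacency)


def connected_components(adjacency):
    """Return the connected components as a list of sets, largest first."""
    seen = set()
    components = []
    for start in adjacency:
        if start in seen:
            continue
        component = {start}
        stack = [start]
        while stack:
            node = stack.pop()
            for neighbour in adjacency[node]:
                if neighbour not in component:
                    component.add(neighbour)
                    stack.append(neighbour)
        seen |= component
        components.append(component)
    return sorted(components, key=len, reverse=True)


def giant_component_share(adjacency):
    """Fraction of all co-authors that sit in the largest component."""
    components = connected_components(adjacency)
    return len(components[0]) / len(adjacency)


# Hypothetical toy data: each inner list is one publication's author list.
publications = [
    ["Ana", "Ben", "Cy"],
    ["Cy", "Dee"],
    ["Eve", "Fay"],
]
network = build_coauthor_network(publications)
link_count = sum(len(neighbours) for neighbours in network.values()) // 2
share = giant_component_share(network)
```

In the toy data, four of the six co-authors are connected through Cy, so they form the giant component, while Eve and Fay form a separate pair.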

Clustering of co-authors

To summarise the network’s topology, an information-theoretic clustering algorithm was used to decompose the co-author network into 277 clusters (162 in the giant component) of closely interconnected co-authors (Rosvall & Bergstrom, 2007). The algorithm was chosen because the clusters it generates have been found to correspond to functional research teams (Velden, Haque, & Lagoze, 2010).

Defining co-author roles

Co-authors were categorised into seven universal roles using two measures: a within-cluster degree z-score (z), which compares the number of links between a co-author and other co-authors in the same cluster with that of the cluster’s other members, and a participation coefficient (P), which describes how a co-author’s links are distributed across clusters (Figure 2) (Guimera, Sales-Pardo, & Amaral, 2007).
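
The two measures can be sketched in a few lines of Python. The version below assumes the standard definitions: z is a co-author's within-cluster degree minus the cluster mean, divided by the cluster's standard deviation, and P is one minus the sum of squared fractions of a co-author's links that fall in each cluster. The function name, the choice of population standard deviation, and the handling of clusters with no variation (z set to 0) are assumptions made for illustration; the role thresholds themselves are not reproduced here.

```python
from collections import Counter
from statistics import mean, pstdev


def role_coordinates(adjacency, cluster_of):
    """Return {node: (z, P)} for an undirected network and a cluster assignment."""
    # Count the links each node sends into each cluster.
    links_to_cluster = {
        node: Counter(cluster_of[nb] for nb in neighbours)
        for node, neighbours in adjacency.items()
    }
    # Within-cluster degree: links to members of the node's own cluster.
    within = {node: links_to_cluster[node][cluster_of[node]] for node in adjacency}
    by_cluster = {}
    for node, k in within.items():
        by_cluster.setdefault(cluster_of[node], []).append(k)
    stats = {c: (mean(ks), pstdev(ks)) for c, ks in by_cluster.items()}

    coordinates = {}
    for node, neighbours in adjacency.items():
        mu, sigma = stats[cluster_of[node]]
        # Assumption: z is 0 when every member has the same within-cluster degree.
        z = (within[node] - mu) / sigma if sigma > 0 else 0.0
        degree = len(neighbours)
        if degree == 0:
            p = 0.0
        else:
            p = 1.0 - sum((k / degree) ** 2 for k in links_to_cluster[node].values())
        coordinates[node] = (z, p)
    return coordinates


# Hypothetical toy network: a star around "a" in cluster A, a pair in cluster B.
toy_adjacency = {
    "a": {"b", "c", "d", "x"},
    "b": {"a"},
    "c": {"a"},
    "d": {"a"},
    "x": {"a", "y"},
    "y": {"x"},
}
toy_clusters = {"a": "A", "b": "A", "c": "A", "d": "A", "x": "B", "y": "B"}
coords = role_coordinates(toy_adjacency, toy_clusters)
```

Here "a" has the highest z in its cluster because it holds most of the cluster's internal links, and "x" has the highest P because its two links are split evenly between two clusters.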

Figure 2: Scatter plot of 3024 co-authors located according to their z-score (y) and participation coefficient P (x). A colour is associated with each role R1 to R7. See Guimera et al. (2007) for thresholds.

Identifying collaboration styles

Clusters were categorised into 6 groups according to the proportion of total links attributed to each role in a cluster (Figure 3). Each group is characterised by a role profile, which is defined by a principal role and the density of co-author links (Figure 4). The presence or absence of roles in a cluster and the variation of degree among roles reveal different styles of collaboration on the network.
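
A role profile of this kind could be computed along the following lines. This is only a sketch under a stated assumption: a link is "attributed" to a role by summing the degree of every co-author in the cluster who holds that role and dividing by the cluster's total degree. The function names and toy values are hypothetical, and the PCA and hierarchical clustering steps are not shown.

```python
from collections import defaultdict

ROLES = ["R1", "R2", "R3", "R4", "R5", "R6", "R7"]


def role_profiles(degree, cluster_of, role_of):
    """Return {cluster: [share of the cluster's degree held by R1..R7]}."""
    totals = defaultdict(lambda: defaultdict(int))
    for node, k in degree.items():
        totals[cluster_of[node]][role_of[node]] += k
    profiles = {}
    for cluster, by_role in totals.items():
        cluster_total = sum(by_role.values())
        profiles[cluster] = [
            by_role.get(role, 0) / cluster_total if cluster_total else 0.0
            for role in ROLES
        ]
    return profiles


def principal_role(profile):
    """Return the role that holds the largest share of a cluster's links."""
    return ROLES[max(range(len(ROLES)), key=lambda i: profile[i])]


# Hypothetical toy values: degree, cluster and role for five co-authors.
degree = {"a": 6, "b": 2, "c": 2, "d": 5, "e": 5}
cluster_of = {"a": 1, "b": 1, "c": 1, "d": 2, "e": 2}
role_of = {"a": "R5", "b": "R1", "c": "R1", "d": "R2", "e": "R2"}
profiles = role_profiles(degree, cluster_of, role_of)
```

Each profile is a seven-value vector, one value per role, which is the kind of input a PCA or a Ward hierarchical clustering would take to group clusters with similar profiles.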

Figure 3: Principal Component Analysis (PCA) bi-plot of 162 clusters placed according to the total proportion (%) of links attributed to each role in the cluster (the greater the % of links a role has, the more influence it has on where a cluster is placed). Clusters were categorised into 6 groups using hierarchical clustering (Ward method). Each group is characterised by a principal role and coloured according to Figure 2, with the exception of group 3, where clusters have a mixed profile.

Figure 4: Combined density and box plot of clusters 4, 7, 8, 70, 101 and 243. Each cluster is an example of a group profile in Figure 3. The density plot shows the distribution of co-authors according to their degree and the box plot shows the variation in co-author degree. Degree is the total number of links from a co-author to other co-authors on the network.

Results of analysis

(1) The low graph density and high clustering coefficient of the co-author network in Figure 1 reveal a dispersed network of many locally interconnected clusters.

(2) On the network, research students occupy the bonding (homophilic) non-hub roles (R1-R3), which act as local organisers, while non-students tend to occupy the hub roles (R4-R7), which act as the bridges, brokers and gatekeepers to clusters across the network.

(3) The network’s role-to-role link profile for the co-authors in Figure 1 is indicative of a stringy-periphery class of network (Guimera et al., 2007). R1 co-authors are highly connected to one another (69.85% of all links). Many R5 and R6 hub co-authors are directly connected to one another, forming cores of hubs, and are less connected to R1 co-authors than would be expected by chance (2.48% of all links).
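
For readers unfamiliar with the two measures in result (1), the sketch below shows one common way to compute them in Python. It assumes an undirected network in which each of the 26414 links joins a distinct pair of the 3024 co-authors, and it uses the mean of the local clustering coefficients, which may differ from the variant reported by Gephi. The function names and the toy network are hypothetical.

```python
def graph_density(node_count, link_count):
    """Density of an undirected simple graph: actual links / possible links."""
    possible = node_count * (node_count - 1) / 2
    return link_count / possible if possible else 0.0


def average_clustering(adjacency):
    """Mean local clustering coefficient (nodes with fewer than 2 links count as 0)."""
    total = 0.0
    for node, neighbours in adjacency.items():
        k = len(neighbours)
        if k < 2:
            continue
        nbs = list(neighbours)
        # Count the pairs of this node's neighbours that are themselves linked.
        closed = sum(
            1
            for i in range(k)
            for j in range(i + 1, k)
            if nbs[j] in adjacency[nbs[i]]
        )
        total += 2 * closed / (k * (k - 1))
    return total / len(adjacency) if adjacency else 0.0


# Density implied by the node and link counts reported for the co-author network.
density = graph_density(3024, 26414)

# Hypothetical toy network: a triangle (a, b, c) with one pendant co-author (d).
toy = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c"},
}
toy_clustering = average_clustering(toy)
```

Under those assumptions the density works out to roughly 0.0058, meaning that fewer than 1% of all possible co-author pairs are directly linked.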


[1] Guimera, R., Sales-Pardo, M., & Amaral, L. A. N. (2007). Classes of complex networks defined by role-to-role connectivity profiles. Nature Physics, 3(1), 63–69.

[2] Rosvall, M., & Bergstrom, C. T. (2007). An information-theoretic framework for resolving community structure in complex networks. Proceedings of the National Academy of Sciences, 104(18), 7327–7331.

[3] Velden, T., Haque, A., & Lagoze, C. (2010). A New Approach to Analyzing Patterns of Collaboration in Co-authorship Networks – Mesoscopic Analysis and Interpretation. Scientometrics, 85(1), 219–242.