KONECT: The Koblenz Network Collection
Networks emerge from many different areas in surprising ways: a set of persons connected by friendship links forms a social network; a set of web pages connected by hyperlinks forms a reference network; a community of people sending electronic messages to each other forms a communication network; a set of users giving ratings to movies forms a rating network. The list of possible networks is practically unbounded. Therefore, the Institute for Web Science and Technologies of the University of Koblenz-Landau has compiled the KONECT collection of over a hundred network datasets, with the goal of using them in ROBUST research activities. The collected datasets serve researchers in the ROBUST project in several ways:

ROBUST will study how generally common results about large networks hold. For instance, a common result in network science is the observation that diameters are small ("six degrees of separation") and edges are clustered ("the friend of my friend is my friend"). KONECT allows these results to be verified on an unprecedented scale.
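
To make the two properties concrete, the following is a minimal sketch in Python of how the diameter and the average clustering coefficient of a small undirected network could be computed. The function names and the toy edge list are illustrative assumptions, not part of KONECT or ROBUST, and the all-pairs breadth-first search used here is only practical for small graphs.

```python
from collections import deque

def build_adjacency(edges):
    """Build an undirected adjacency map {node: set of neighbours}."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj

def eccentricity(adj, source):
    """Largest shortest-path distance from source, found by breadth-first search."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return max(dist.values())

def diameter(adj):
    """Longest shortest path in the graph (assumes the graph is connected)."""
    return max(eccentricity(adj, u) for u in adj)

def average_clustering(adj):
    """Mean over all nodes of the fraction of neighbour pairs that are linked."""
    total = 0.0
    for u, nbrs in adj.items():
        k = len(nbrs)
        if k < 2:
            continue  # nodes with fewer than two neighbours contribute 0
        links = sum(1 for v in nbrs for w in nbrs if v < w and w in adj[v])
        total += 2.0 * links / (k * (k - 1))
    return total / len(adj)

# Toy network (made up for illustration): a triangle with a two-edge tail.
edges = [(1, 2), (1, 3), (2, 3), (3, 4), (4, 5)]
adj = build_adjacency(edges)
print(diameter(adj), average_clustering(adj))
```

On a real dataset, the edge list would be read from a file instead of being written inline, and a sampled or approximate diameter would likely be needed for large networks.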

Typically, the business communities studied in ROBUST are very large. In fact, they are so large that many conventional network mining algorithms cannot be applied to them. Therefore, ROBUST studies the scalability of network mining algorithms. By applying the network mining algorithms developed in the ROBUST project to real-world networks of varying sizes, the ROBUST team is able to estimate their scalability in practice.
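
One simple way to turn runtimes measured on networks of varying sizes into a scalability estimate is to fit a power law, runtime ≈ c · n^a, by least squares on a log-log scale. The sketch below illustrates that idea only; the text does not say which method ROBUST uses, the function name is hypothetical, and the measurements are synthetic values generated from a known exponent rather than real results.

```python
import math

def fit_scaling_exponent(sizes, runtimes):
    """Least-squares slope of log(runtime) against log(size).

    A slope near 1 suggests roughly linear scaling, near 2 quadratic, and so on.
    """
    xs = [math.log(n) for n in sizes]
    ys = [math.log(t) for t in runtimes]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    return cov / var

# Synthetic measurements (NOT real data): runtimes generated as 2e-6 * n^1.5.
sizes = [1_000, 10_000, 100_000, 1_000_000]
runtimes = [2e-6 * n ** 1.5 for n in sizes]

exponent = fit_scaling_exponent(sizes, runtimes)
print(f"estimated scaling exponent: {exponent:.2f}")
```

In practice the runtimes would come from timing an algorithm on datasets of different sizes, and the fit is only as trustworthy as the range of sizes covered.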

Given the large size of many networks, the KONECT collection will also serve to analyze graph coarsening approaches in ROBUST. The task here is to reduce a given network to a manageable size while maintaining certain characteristics.
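
As an illustration of what reducing a network can mean, the sketch below contracts predefined groups of nodes into single supernodes and records how many original edges run between each pair of groups. This is one generic coarsening step, not necessarily an approach studied in ROBUST; the function name, the grouping, and the toy edge list are assumptions made for the example.

```python
from collections import Counter

def coarsen(edges, group_of):
    """Contract each group of nodes into one supernode.

    Returns a Counter mapping (group_a, group_b), with group_a < group_b,
    to the number of original edges between the two groups.
    Edges whose endpoints fall into the same group are dropped.
    """
    weights = Counter()
    for u, v in edges:
        gu, gv = group_of[u], group_of[v]
        if gu == gv:
            continue  # internal edge disappears in the coarse graph
        key = (gu, gv) if gu < gv else (gv, gu)
        weights[key] += 1
    return weights

# Toy network (made up): two triangles joined by two edges.
edges = [(1, 2), (1, 3), (2, 3), (3, 4), (2, 5), (4, 5), (4, 6), (5, 6)]
# Hypothetical grouping; a real method would derive it from the graph itself.
group_of = {1: "A", 2: "A", 3: "A", 4: "B", 5: "B", 6: "B"}

coarse = coarsen(edges, group_of)
print(dict(coarse))
```

Here six nodes and eight edges collapse to two supernodes joined by one weighted edge. Which characteristics survive depends on how the groups are chosen, which is the part a coarsening method has to get right.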

Finally, the collection of networks will serve as a testbed for new graph mining algorithms developed in ROBUST, to ensure they are suitable for a large diversity of networks of different categories and types.
The complete list of datasets, as well as detailed statistics for each of them, is available online at http://konect.uni-koblenz.de. In addition, we provide comparative statistics over all networks, as well as the source code used to download the datasets from the Web. All code and data on the website are available under free licenses (GPLv3 and CC BY-SA).