This published dataset consists of four Google+ snapshots and is a subset of the dataset studied in our IMC'12 paper. Each snapshot includes both the directed social structure and node attributes, which together can be represented as a Social-Attribute Network (SAN). Snapshots 3 and 4 were crawled after Google+ was opened to the public.
|Snapshot||#Social nodes||#Social links||#Attri nodes||#Attri links||Crawled time||TimeID|
|snapshot 1||4,693,129||47,130,325||991,545||3,644,103||Jul., 2011||0|
|snapshot 2||17,091,929||271,915,755||3,108,141||14,693,125||Aug., 2011||1|
|snapshot 3||26,244,659||410,445,770||4,147,389||19,344,382||Sep., 2011||2|
|snapshot 4||28,942,911||462,994,069||4,443,631||20,592,962||Oct., 2011||3|
Directed social structure
UserIDFrom UserIDTo TimeID
Each line corresponds to a directed link from UserIDFrom to UserIDTo. UserIDs are anonymized to integers starting from 0. TimeID is 0, 1, 2 or 3, indicating the snapshot in which this directed link first appears.
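The line format above can be read with a few lines of Python. This is a minimal sketch, not part of the dataset release: it assumes the three fields are separated by whitespace (the delimiter is not stated here), and the names `SocialLink` and `parse_social_links` are hypothetical.

```python
from typing import Iterable, Iterator, NamedTuple


class SocialLink(NamedTuple):
    """One directed social link (hypothetical helper type)."""
    src: int      # UserIDFrom
    dst: int      # UserIDTo
    time_id: int  # snapshot in which the link first appears (0..3)


def parse_social_links(lines: Iterable[str]) -> Iterator[SocialLink]:
    """Yield one SocialLink per non-empty line.

    Assumption: fields are whitespace-separated integers.
    """
    for line in lines:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        src, dst, time_id = (int(f) for f in fields)
        if time_id not in (0, 1, 2, 3):
            raise ValueError(f"unexpected TimeID {time_id!r} in line {line!r}")
        yield SocialLink(src, dst, time_id)
```

Because the function accepts any iterable of strings, an open file handle can be passed directly, which avoids loading a file with hundreds of millions of links into memory at once.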
UserID AttriID TimeID
Each line corresponds to an undirected attribute link between a user and an attribute. AttriIDs are anonymized to negative integers starting from -1. Again, TimeID is 0, 1, 2 or 3, indicating the snapshot in which this link first appears.
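Since attribute links are undirected, it can be convenient to index them in both directions. The sketch below does so for the attribute-link lines described above; it assumes whitespace-separated fields, and `build_attribute_index` is a hypothetical name. Note that UserIDs are non-negative and AttriIDs are negative, so the two kinds of nodes never collide in a shared ID space.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple


def build_attribute_index(
    lines: Iterable[str],
) -> Tuple[Dict[int, Set[int]], Dict[int, Set[int]]]:
    """Index undirected attribute links in both directions.

    Assumption: each line is "UserID AttriID TimeID", whitespace-separated.
    Returns (user_to_attrs, attr_to_users).
    """
    user_to_attrs: Dict[int, Set[int]] = defaultdict(set)
    attr_to_users: Dict[int, Set[int]] = defaultdict(set)
    for line in lines:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        user_id, attri_id, _time_id = (int(f) for f in fields)
        # UserIDs start from 0; AttriIDs are negative, starting from -1.
        if user_id < 0 or attri_id >= 0:
            raise ValueError(f"unexpected IDs in line {line!r}")
        user_to_attrs[user_id].add(attri_id)
        attr_to_users[attri_id].add(user_id)
    return user_to_attrs, attr_to_users
```

This version ignores TimeID; to index a single snapshot, the lines would first need to be filtered by TimeID.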
Each line corresponds to an attribute. AttriType is one of employer, school, major or places_lived.
Reconstructing the tth Snapshot
To obtain the t-th snapshot, where t = 1, 2, 3 or 4, keep all edges whose TimeID is less than t. For example, snapshot 2 consists of the edges with TimeID 0 or 1.
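The reconstruction rule amounts to a simple filter on TimeID. The following is a minimal sketch, assuming edges have already been parsed into `(id_a, id_b, time_id)` tuples; the function name `snapshot_edges` is hypothetical. The same filter applies to both social links and attribute links.

```python
from typing import Iterable, List, Tuple

Edge = Tuple[int, int, int]  # (id_a, id_b, time_id)


def snapshot_edges(edges: Iterable[Edge], t: int) -> List[Edge]:
    """Return the edges belonging to the t-th snapshot (t = 1, 2, 3 or 4).

    An edge is kept when its TimeID is strictly less than t.
    """
    if t not in (1, 2, 3, 4):
        raise ValueError(f"t must be 1, 2, 3 or 4, got {t!r}")
    return [edge for edge in edges if edge[2] < t]


edges = [(0, 1, 0), (1, 2, 1), (2, 0, 3)]
first = snapshot_edges(edges, 1)   # only the edge with TimeID 0
second = snapshot_edges(edges, 2)  # edges with TimeID 0 or 1
fourth = snapshot_edges(edges, 4)  # all edges
```

Because each edge carries only the TimeID of its first appearance, snapshots rebuilt this way are cumulative: every edge in snapshot t also appears in snapshot t+1.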
- Neil Zhenqiang Gong and Wenchang Xu. "Reciprocal versus Parasocial Relationships in Online Social Networks". Springer Social Network Analysis and Mining (SNAM), 4(1), 2014.
- Neil Zhenqiang Gong, Wenchang Xu, Ling Huang, Prateek Mittal, Emil Stefanov, Vyas Sekar, and Dawn Song. "Evolution of Social-Attribute Networks: Measurements, Modeling, and Implications using Google+". ACM/USENIX Internet Measurement Conference (IMC), 2012. Acceptance rate: 45/183=24.6%.
- Neil Zhenqiang Gong, Ameet Talwalkar, Lester Mackey, Ling Huang, Richard Shin, Emil Stefanov, Elaine Shi, and Dawn Song. "Jointly Predicting Links and Inferring Attributes using a Social-Attribute Network (SAN)". In ACM Workshop on Social Network Mining and Analysis (SNA-KDD), co-located with KDD, 2012.
Downloading the Dataset
Click here to download.
Note: the raw dataset was originally crawled by Emil Stefanov, Richard Shin, and Elaine Shi, and then processed by Neil Gong.