Syllabus for COMP 596: Network Science

Instructor: Reihaneh Rabbany

Class times: Tuesday & Thursday, 10-11:30

Class location: ENGMC 103

Contact: rrabba@cs.mcgill.ca {add COMP596 in the title}

Office hours: Thursday, 12-1pm

Office: McConnell Engineering Building (MC) 232

Overview

Networks model the relationships in complex systems, from hyperlinks between web pages, and co-authorships between research scholars to biological interactions between proteins and genes, and synaptic links between neurons. Network Science is an interdisciplinary research area involving researchers from Physics, Computer Science, Sociology, Math and Statistics, with applications in a wide range of domains including Biology, Medicine, Political Science, Marketing, Ecology, Criminology, etc. In this course, we will cover the basic concepts and techniques used in Network Science, discuss the state of the art techniques, and will prepare you to do research in this area.

Relevant Textbooks

Slides will cover chapters from the following books but you are not required to buy any of them.

[NI]  Networks: An Introduction by M.E.J. Newman, ebook at library 

[NS] Network Science by Albert-Barabasi, available online

[NC] Networks, Crowds and Markets by D. Easley and J. Kleinberg, available online

[MD] Mining of Massive Datasets by Jure Leskovec, Anand Rajaraman, Jeff Ullman, available online

The required readings per each topic will be accessible online, mostly in the form of surveys, and conference papers.

Schedule

topic

deadlines

readings

1

Tue., Sep. 3

Introduction

NI chapters 1 to 5; NS chapters 0 & 1; slides (pdf)

2

Thu., Sep. 5

Background

NI chapter 6, NS chapter 2; slides (pdf)

3

Tue., Sep. 10

Patterns in Networks

NI chapter 10, NS chapter 4; slides (pdf)

4

Thu., Sep. 12

Network Models

First assignment out

Revised version with tips

NI chapters 11-13, NS chapter 3 & 5; slides (pdf) 

5

Tue., Sep. 17

Measures: Ranking, Centrality, Correlations

NI chapters 7-9, NS chapter 7; slides (pdf)

6

Thu., Sep. 19

Advanced topics on Patterns in Networks (5 papers)

[Andy] Broido AD, Clauset A. Scale-free networks are rare. Nature communications. 2019 Mar 4;10(1):1017.

[Charlotte] Cantwell GT, Newman ME. Mixing patterns and individual differences in networks. Physical Review E. 2019 Apr 16;99(4):042306.

[Aayushi] Leskovec J, Kleinberg J, Faloutsos C. Graph evolution: Densification and shrinking diameters. ACM Transactions on Knowledge Discovery from Data (TKDD). 2007 Mar 1;1(1):2.

[Aarash] Peel L, Delvenne JC, Lambiotte R. Multiscale mixing patterns in networks. Proceedings of the National Academy of Sciences. 2018 Apr 17;115(16):4057-62.

[Owen] Willinger W, Alderson D, Doyle JC. Mathematics and the internet: A source of enormous confusion and great potential. Notices of the American Mathematical Society. 2009;56(5):586-99.

7

Tue., Sep. 24

Clustering and Community Detection basics

First assignment due

Second assignment out

NI chapter 14, NS chapter 9; slides (pdf)

8

Thu., Sep. 26

Advanced topics on Network Models (5 papers)

[Raika] Leskovec J, Chakrabarti D, Kleinberg J, Faloutsos C, Ghahramani Z. Kronecker graphs: An approach to modeling networks. Journal of Machine Learning Research. 2010;11(Feb):985-1042.

[Fozail] Kumar R, Raghavan P, Rajagopalan S, Sivakumar D, Tomkins A, Upfal E. Stochastic models for the web graph. InProceedings 41st Annual Symposium on Foundations of Computer Science 2000 Nov 12 (pp. 57-65). IEEE.

[Deeksha] Pfeiffer III JJ, Moreno S, La Fond T, Neville J, Gallagher B. Attributed graph models: Modeling network structure with correlated attributes. InProceedings of the 23rd international conference on World wide web 2014 Apr 7 (pp. 831-842). ACM.

[Dylan] Robles P, Moreno S, Neville J. Sampling of attributed networks from hierarchical generative models. InProceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining 2016 Aug 13 (pp. 1155-1164). ACM.

[Abby] Chakrabarti D, Zhan Y, Faloutsos C. R-MAT: A recursive model for graph mining. InProceedings of the 2004 SIAM International Conference on Data Mining 2004 Apr 22 (pp. 442-446). Society for Industrial and Applied Mathematics.

9

Tue., Oct. 1

Anomaly Detection

Akoglu L, Tong H, Koutra D. Graph based anomaly detection and description: a survey. Data mining and knowledge discovery. 2015 May 1;29(3):626-88.

Ranshous S, Shen S, Koutra D, Harenberg S, Faloutsos C, Samatova NF. Anomaly detection in dynamic networks: a survey. Wiley Interdisciplinary Reviews: Computational Statistics. 2015 May;7(3):223-47.

Slides (pdf)

10

Thu., Oct. 3

Advanced topics on Measures (5 papers)

[Junlin] Milo R, Shen-Orr S, Itzkovitz S, Kashtan N, Chklovskii D, Alon U. Network motifs: simple building blocks of complex networks. Science. 2002 Oct 25;298(5594):824-7.

[Hao] McPherson M, Smith-Lovin L, Cook JM. Birds of a feather: Homophily in social networks. Annual review of sociology. 2001 Aug;27(1):415-44.

[Liheng] Jung J, Shin K, Sael L, Kang U. Random walk with restart on large graphs using block elimination. ACM Transactions on Database Systems (TODS). 2016 Jun 30;41(2):12.

[David] Tong H, Faloutsos C, Pan JY. Fast random walk with restart and its applications. InSixth International Conference on Data Mining (ICDM'06) 2006 Dec 18 (pp. 613-622). IEEE. ↓

Sun, Oct. 6

Second assignment due

11

Tue., Oct. 8

Link & attribute Prediction

Third assignment out

Slides (pdf)

12

Thu., Oct. 10

Advanced topics on Community Detection (5 papers)

[David] Tong H, Faloutsos C, Pan JY. Fast random walk with restart and its applications. InSixth International Conference on Data Mining (ICDM'06) 2006 Dec 18 (pp. 613-622). IEEE. ↑

[Yuecai] L. Peel, D. B. Larremore, & A. Clauset, "The ground truth about metadata and community detection in networks." Science Advances 3(5), e1602548 (2017).

[Deeksha] Ahn YY, Bagrow JP, Lehmann S. Link communities reveal multiscale complexity in networks. nature. 2010 Aug;466(7307):761.

[Samy] Leskovec J, Lang KJ, Dasgupta A, Mahoney MW. Statistical properties of community structure in large social and information networks. InProceedings of the 17th international conference on World Wide Web 2008 Apr 21 (pp. 695-704). ACM.

[Paul] Ugander J, Backstrom L. Balanced label propagation for partitioning massive graphs. InProceedings of the sixth ACM international conference on Web search and data mining 2013 Feb 4 (pp. 507-516). ACM.

13

Tue., Oct. 15

Class cancelled

14

Thu., Oct. 17

Advanced topics on Anomaly Detection (5 papers)

[Andy ]Hou S, Ye Y, Song Y, Abdulhayoglu M. Hindroid: An intelligent android malware detection system based on structured heterogeneous information network. InProceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining 2017 Aug 13 (pp. 1507-1515). ACM.

[Akshal] Perozzi B, Akoglu L, Iglesias Sánchez P, Müller E. Focused clustering and outlier detection in large attributed graphs. InProceedings of the 20th ACM SIGKDD international conference on Knowledge discovery and data mining 2014 Aug 24 (pp. 1346-1355). ACM.

[Samy] Hooi B, Song HA, Beutel A, Shah N, Shin K, Faloutsos C. Fraudar: Bounding graph fraud in the face of camouflage. InProceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining 2016 Aug 13 (pp. 895-904). ACM.

[Matthew] Sun J, Faloutsos C, Faloutsos C, Papadimitriou S, Yu PS. Graphscope: parameter-free mining of large time-evolving graphs. InProceedings of the 13th ACM SIGKDD international conference on Knowledge discovery and data mining 2007 Aug 12 (pp. 687-696). ACM.

Oct 18

Third assignment due

15

Tue., Oct. 22

Advanced topics on Link Prediction (4 papers)

[Fozail] Leskovec J, Huttenlocher D, Kleinberg J. Predicting positive and negative links in online social networks. InProceedings of the 19th international conference on World wide web 2010 Apr 26 (pp. 641-650). ACM.

[Charlotte] Zhang M, Chen Y. Weisfeiler-Lehman neural machine for link prediction. InProceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining 2017 Aug 13 (pp. 575-583). ACM.

[Hao] Cao Z, Wang L, De Melo G. Link prediction via subgraph embedding-based convex matrix completion. In Thirty-Second AAAI Conference on Artificial Intelligence 2018 Apr 29.

[Yasmeen] Benson A, Kleinberg J. Link Prediction in Networks with Core-Fringe Data. In The World Wide Web Conference 2019 May 13 (pp. 94-104). ACM.

16

Thu., Oct. 24

Advanced topics on Attribute Prediction (4 papers)

[Abby] Zhou D, Hofmann T, Schölkopf B. Semi-supervised learning on directed graphs. InAdvances in neural information processing systems 2005 (pp. 1633-1640).

[Aarash] Koutra D, Ke TY, Kang U, Chau DH, Pao HK, Faloutsos C. Unifying guilt-by-association approaches: Theorems and fast algorithms. InJoint European Conference on Machine Learning and Knowledge Discovery in Databases 2011 Sep 4 (pp. 245-260). Springer, Berlin, Heidelberg.

[Albert] Gatterbauer W, Günnemann S, Koutra D, Faloutsos C. Linearized and single-pass belief propagation. Proceedings of the VLDB Endowment. 2015 Jan 1;8(5):581-92.

[Liheng] Kipf TN, Welling M. Semi-supervised classification with graph convolutional networks. arXiv preprint arXiv:1609.02907. 2016 Sep 9.

Oct 28

Proposal slides due

17

Tue., Oct. 29

Project proposal presentations (12*5 minutes each)

Last names: A-K

18

Thu., Oct. 31

Project proposal presentations (12*5 minutes each)

Last names: L-Z

Nov. 1

Proposal due

19

Tue., Nov. 5

Dynamics, Evolution Patterns & Diffusion Models

NI chapter 16-17, NS chapter 6, 8 & 10 Slides (pdf)

20

Thu., Nov. 7

Advanced topics on Diffusion (4 papers)      

[Yasmeen] Romero DM, Meeder B, Kleinberg J. Differences in the mechanics of information diffusion across topics: idioms, political hashtags, and complex contagion on twitter. InProceedings of the 20th international conference on World wide web 2011 Mar 28 (pp. 695-704). ACM.

[Akshal] Brockmann D, Helbing D. The hidden geometry of complex, network-driven contagion phenomena. science. 013 Dec 13;342(6164):1337-42.

[Dylan] Dodds PS. Slightly generalized contagion: Unifying simple models of biological and social spreading. InComplex Spreading Phenomena in Social Systems 2018 (pp. 67-80). Springer, Cham.

[Safa] Cheng J, Adamic L, Dow PA, Kleinberg JM, Leskovec J. Can cascades be predicted?. InProceedings of the 23rd international conference on World wide web 2014 Apr 7 (pp. 925-936). ACM.

21

Tue., Nov. 12

Heterogenous and Rich Networks

Slides (pdf)

22

Thu., Nov. 14

Advanced topics o  Dynamics (4 papers)      

[Julien] Kempe D, Kleinberg J, Tardos É. Maximizing the spread of influence through a social network. InProceedings of the ninth ACM SIGKDD international conference on Knowledge discovery and data mining 2003 Aug 24 (pp. 137-146). ACM.

[Matthew] Safavi T, Davoodi M, Koutra D. Career Transitions and Trajectories: A Case Study in Computing. InProceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining 2018 Jul 19 (pp. 675-684). ACM.

[David] Eswaran D, Faloutsos C, Guha S, Mishra N. Spotlight: Detecting anomalies in streaming graphs. InProceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining 2018. ACM.

[Paul] Chang S, Zhang Y, Tang J, Yin D, Chang Y, Hasegawa-Johnson MA, Huang TS. Streaming recommender systems. InProceedings of the 26th International Conference on World Wide Web 2017 Apr 3 (pp. 381-389).

Evaluation

Nov 18

Progress report due

23

Tue., Nov. 19

Project report presentations (12*5 minutes each)

Last names: L-Z

24

Thu., Nov. 21

Project report presentations (12*5 minutes each)

Last names: A-K

25

Tue., Nov. 26

Advanced topics on Rich Networks (4 papers)

Reviews due (first round)

[Safa] M. E. J. Newman and A. Clauset, "Structure and inference in annotated networks." Nature Communications 7, 11863 (2016).

[Komal] Dong Y, Chawla NV, Swami A. metapath2vec: Scalable representation learning for heterogeneous networks. InProceedings of the 23rd ACM SIGKDD international conference on knowledge discovery and data mining 2017 Aug 13 (pp. 135-144). ACM.

[Yuecai] Sun Y, Han J, Yan X, Yu PS, Wu T. Pathsim: Meta path-based top-k similarity search in heterogeneous information networks. Proceedings of the VLDB Endowment. 2011 Aug 1;4(11):992-1003.

[Julien] Eswaran D, Günnemann S, Faloutsos C, Makhija D, Kumar M. Zoobp: Belief propagation for heterogeneous networks. Proceedings of the VLDB Endowment. 2017 Jan 1;10(5):625-36.

Evaluation

27

Tue., Nov. 28

Summarization, Visualization and Layouts

Slides (pdf)

Liu Y, Safavi T, Dighe A, Koutra D. Graph summarization methods and applications: A survey. ACM Computing Surveys (CSUR). 2018 Jul 16;51(3):62.

26

Thu., Dec. 3

Advanced topics on Summarization (4 papers)

[Julin]  Koutra D, Jin D, Ning Y, Faloutsos C. Perseus: an interactive large-scale graph mining and visualization tool. Proceedings of the VLDB Endowment. 2015 Aug 1;8(12):1924-7.

[Raika] Di Jin, Ryan A. Rossi, Eunyee Koh, Sungchul Kim, Anup Rao, Danai Koutra. Latent Network Summarization: Bridging Network Embedding and Summarization. KDD, 2019

[Albert] Navlakha S. Learning the structural vocabulary of a network. Neural computation. 2017 Feb;29(2):287-312.

[Komal] Feldman D, Ozer S, Rus D. Coresets for vector summarization with applications to network graphs. InProceedings of the 34th International Conference on Machine Learning-Volume 70 2017 Aug 6 (pp. 1117-1125). JMLR. org.

[Aayushi] Amiri SE, Chen L, Prakash BA. Efficiently summarizing attributed diffusion networks. Data Mining and Knowledge Discovery. 2018 Sep 1;32(5):1251-74.

Evaluation 

Dec 6 Dec 8

Project submission due

Dec 11 Dec 12

Reviews due (second round)

Dec 18 Dec 22

Final project submission deadline

Please keep checking this syllabus as it will be updated.

>> Lectures are not going to be recorded but slides will be posted here.

Guidelines for Project

What should be in the writeups? use Standard ACM Conference Proceedings Template (For LaTeX users: unzip acmart.zip, make, and use sample-sigconf.tex as a template). Have the following headings/components in each:

Guidelines for Presentations:

Assignments:

List of projects:

Course Structure

Each week we discuss a topic in network science. I will cover the fundamental and classic concepts related to the topic of the week and provide a list of five papers for further reading on the most recent works related to the discussed topic. In a subsequent session, five students present those readings in 15-minute time slots, 10-minute presentations, 5-minute discussion (similar to conference presentations). When discussing these state of the art papers, we brainstorm over the open problems related to the topic, and possible directions to follow up on these recent works. This could become a topic that a student could choose to delve deeper into for his/her project (we will also have a list of topics from the start of the course to choose from). Projects are led by one student, and collaborations are encouraged, but you will be only graded based on the project you lead (bonus points are the exception and will be given to all collaborators, see grading details). Early in the course, we will also have 3 sets of hands-on exercise problems (10% each) to implement basic algorithms and get accustomed to working with networked data. For the project, you will deliver a 5-minute pitch on your project topic, submit a 2 pages project proposal mid-term, 4-5 pages project progress report a month later and a final project report at the end of the term. The proposal covers the main related works, problem definition, datasets, and experiment setup. You will present the proposals in 5 minutes of presentation (2-4 slides) to get quick feedback. The project progress report is 4-5 pages and adds the preliminary results. The final report is expected as an 8 page write up and has the final results, which will be reviewed by two/three of your peers and me. Based on the reviews, you will have one week to write a 500 word rebuttal and resubmit an improved final version. The final version will be graded by me. All submissions for the project are in a conference proceeding format (you are strongly encouraged to use latex). This setup closely mimics a research project lifecycle, from forming the idea to publish the results, and is a good exercise to introduce you to research.

Grading

  • 20% presentations of assigned readings (averaged over two/three presentations, the number determined by the number of students in the class, the expectation for each delivery is that you are able to answer questions asked about the paper as if it is your own work)
  • 30% assignments (3x10%)
  • 50% project (10% proposal, 10% progress report, 30% final report)
  • bonus points:
  • 5% for the best class presentation
  • 5% for the best project proposal
  • 5% for the three best reviewers in each round
  • 10% bonus point for the three best projects
  • 2% for an interesting point you share with the class from the related readings which was not covered in the class, could be from previous sessions

Bonus points tracking

Topic

Who & what was shared

1

Tue., Sep. 3

Introduction

  • Andy for sharing his readings on preferential attachment paper

2

Thu., Sep. 5

Background

  • Charlotte, finding an error in the slides (now fixed, double check slide 47)

3

Tue., Sep. 10

Patterns in Networks

  • Charlotte for sharing the relation between average degree & exponent of power laws
  • Andy for sharing metcalfe's law (value of the network) & how it is not observed in sparse graphs

4

Thu., Sep. 12

Network Models

  • Dylan for sharing how to confirm preferential attachment in real world networks
  • Charlotte for discussing how the deg. dist. of AB model is independent of average degree
  • Akshal for discussing how phase transition in Physics is linked to the critical point of ER graphs
  • Aarash for sharing the Linearized Chord Diagram for BA models
  • Andy for discussing the ultra small world property and how average shortest paths changes with exponent of the AB model

5

Tue., Sep. 17

Measures: Ranking, Centrality, Correlations

  • Akshal for explaining the Price model which results in scale free networks
  • Raika for discussing how we can efficiently generate scale free networks by using the Price model
  • Deeksha for discussing the friendship paradox
  • Aarash for discussing how to improve the assortativity by edge reviewing and  Xalvi-Brunet & Sokolov
  • David for explaining that  Xalvi-Brunet & Sokolov algorithm has a problem with large networks
  • [remind me your name] vertex cover is np-complete and minimum vertex cover is np-hard

6

Tue., Sep. 24

Clustering and Community Detection basics

  • Aarash for discussing the infomap algorithm
  • Charlotte for discussing the hierarchical network model to generate scale free networks with nested communities
  • Raika for discussing the random walk with restart and its relation to pagerank

7

Tue., Oct. 1

Anomaly Detection

  • Andy for discussing the louvain method
  • Samy for talking about the LFR approach
  • Akshal for talking about iterative community and outlier detection

8

Tue., Oct. 8

Link & attribute Prediction

  • Paul for talking about the GraphRNN model and how it is able to generate realistic graphs
  • Akshal for discussing anomaly detection through compression techniques
  • Arash for discussing the curse of dimensionality and the need to reduce dimensions e.g. by subsetting the features

9

Tue., Nov. 5

Dynamics, Evolution Patterns & Diffusion Models

  • Albert for talking about the obesity and friendship
  • Aarash for giving nice summary of percolation theory and robustness of scale free network
  • Liheng for sharing Relational Representation Learning for Dynamic (Knowledge) Graphs: A Survey
  • Paul for discussing DyRep: Learning Representations over Dynamic Graphs

10

Tue., Nov. 12

Heterogenous and Rich Networks

  • Junlin, for discussing robustness in scale-free networks, attack tolerance  

11

Thu., Dec. 3

Summarization, Visualization and Layouts

→ This will be updated after each class. Let me know if your name is missing, nothing is supposed to be filtered out.

Late submissions

All due dates are 11:59 pm in Montreal. Oct 28th, Nov 18th and Dec 6th deadlines are firm and no extension is possible. For all other submissions, 2^k% of the grade will be deducted per k days of delay.  For presentations, if you are going to miss your assigned class presentation, you need to arrange for another student to switch with you. If you can not find someone to cover for you, you need to contact me so that we can find other arrangements. Other than special situations, you most likely will lose the grade.

Plagiarism

Except for the related work section of your project reports, all other parts should be 100% your own deductions, implementation, findings, and results. For the related work section, you may rephrase and summarize the findings of relevant prior work, and cite the corresponding paper. It is not acceptable to copy anything or reuse any wordings, other than technical terms, with appropriate citations. Plagiarism, if detected, not only will result in zero in your grade, but also a report to the university.

Resources:

Relevant Courses:

Datasets: