Algorithms for Large-Scale Graph Processing, Seminar (ETH Zurich, Fall 2019)



Instructor: Mohsen Ghaffari
Time and Place: Wednesdays 15 - 17 at CAB 15.2
TA in Charge & Contact Point: Julian Portmann, pjulian@ethz.ch

Course Description: This is a theory seminar, where we present and discuss recent algorithmic developments for processing large-scale graphs. In particular, we focus on Massively Parallel Computation (MPC) algorithms. MPC is a clean and general theoretical framework that captures the essential aspects of computational problems in large-scale processing settings such as MapReduce, Hadoop, Spark, Dryad, etc. The model was introduced in 2010 and has been receiving increasing attention from the algorithms community. During this seminar, we will present and discuss some of the fundamental papers in this growing research area. As a side remark, we note that the topics discussed in the seminar can also serve as a good starting point for master's or bachelor's thesis projects.


Prerequisites: The seminar has no formal prerequisites and, in particular, does not assume familiarity with parallel computation. However, we expect all students to be comfortable with the basics of algorithm design and analysis, as well as probability theory. For instance, having passed the course Algorithms, Probability, and Computing (APC) is highly recommended, though not formally required. If you are unsure whether you are ready for this class, please consult the instructor.


Course Catalogue Entry: see here for further information.


Tasks

Your presentation: Presentations will be given in teams of two, and each team will present one of the papers in the list. Each presentation is 40 minutes long, plus 5 minutes for questions and discussion. Also, see these guides on how to give a good technical presentation: an article by Ian Parberry, an article by Jones, Hughes, and Launchbury, and some slides by Püschel.


Questions about your paper: Prior to your presentation, you are expected to design a few questions about your paper. These questions should address high-level concepts in the paper. By working out the answers, one should be able to understand the problem studied in the paper and, in particular, its main challenge.


Participation: For each of your peers' presentations, you are expected to take a quick glance at the paper being presented and answer the few questions that the presenting team designed. The main purpose of these questions is to ensure that you have looked at the paper and are familiar with its context and basics; this will help you follow the presentation. We will send you the questions a few days before the presentation (tentatively via a Doodle link), and you should submit your (short) answers before the presentation.



List of Selected Papers

Each team should select at least three papers from the following list and indicate them in the following Doodle poll: Papers.
Please indicate a preference ordering in the comment section. You will be assigned one of these papers.
    1. Goodrich et al.: Sorting, Searching, and Simulation in the MapReduce Framework ISAAC'11
    2. Andoni et al.: Parallel Algorithms for Geometric Graph Problems STOC'14
    3. Im et al.: Efficient massively parallel methods for dynamic programming STOC'17
    4. Andoni et al.: Parallel Graph Connectivity in Log Diameter Rounds FOCS'18
    5. Liu and Vondrak: Submodular Optimization in the MapReduce Model SOSA'19
    6. Behnezhad et al.: Massively Parallel Computation of Matching and MIS in Sparse Graphs PODC'19 (based on these two papers: paper 1 and paper 2)
    7. Chang et al.: The Complexity of (Δ+1) Coloring in Congested Clique, Massively Parallel Computation, and Centralized Local Computation PODC'19
    8. Ghaffari et al.: Conditional Hardness Results for Massively Parallel Computation from Distributed Lower Bounds FOCS'19
    9. Ghaffari et al.: Faster Algorithms for Edge Connectivity via Random 2-Out Contractions SODA'20

Other Papers: See this graduate course on Massively Parallel Algorithms for a list of other papers in this area.