| 
 home publications teaching short CV personal the pub | Algorithms and Computation in Signal Processing orHow to Write Fast Code18-799B (CMU, ECE)A new version of this course was taught in spring 2008 (website) Basic Information
											Course number: 18-799B
											
											
											
											
											
											
											
											
											Spring 2005, TR: 1:30--3:00pm, DH 1209
											
											
											
											
											
											
											
											
											Instructor: Markus Püschel (PH B16, pueschel at ece, 8-4259), TA: Srinivas Chellappa (PH B10, schellap at andrew, 8-7104), Admin: Carol Patterson (PH B15, carol at ece, 8-7286)
											
											
											
											
											
											
											
											
											12 units
											
											
											
											
											
											
											
											
											Office hours: Tuesdays 4-5pm, Thursdays after class til 4pm.
											
											
											
											
											
											
											
											
											Requirements: solid C programming skills, senior undergraduate or graduate student
										
										
										
										
										
										
										
										
										 Course Description
The fast evolution and increasing complexity of computing platforms pose a major challenge for developers of high quality software for signal processing and other applications: it becomes increasingly harder to harness the available computing power; conversely, straightforward implementations may loose as much as one or two orders of magnitude in performance. Creating optimal implementations requires the developer to have an understanding of algorithms, capabilities and limitations of compilers, and the target platform's microarchitecture. For these reasons, a recent trend in numerical computing is towards "self-adaptable" software to achieve optimal performance and portability at reduced coding effort. This interdisciplinary course introduces the student to the foundations and state-of-the-art techniques in high performance software development for signal processing and other numerical functionality including transforms, filters, and basic linear algebra algorithms. Topics include: 1) fundamental tools in algorithm theory and analysis; 2) fast signal processing and numerical algorithms; 3) how to write software that overcomes compiler limitations; 4) the role of the memory hierarchy and other microarchitectural features in software development; 5) how to use special instruction sets, such as SSE/MMX on Pentium; 6) an introduction to the concepts of self-adaptable software and software generators. The course is targeted to signal processing students and CE/CS students who want to acquire a better understanding of algorithms and advanced software implementation techniques. The students will be required to complete several implementation projects in C throughout the semester. Topics Covered
											Foundations of algorithm analysis
												
													cost and complexity
													
													
													
													
													
													
													
													
													O-calculus
													
													
													
													
													
													
													
													
													cost analysis and recurrences
												
												
												
												
												
												
												
												
												Computer architecture
												
													architecture and microarchitecture
													
													
													
													
													
													
													
													
													memory hierarchy including cache structure
													
													
													
													
													
													
													
													
													execution units
													
													
													
													
													
													
													
													
													special instruction sets (especially short vector instructions)
												
												
												
												
												
												
												
												
												Compilers (strengths, limitations, how to use)
											
											
											
											
											
											
											
											
											In detail: algorithms, complexity, and cutting-edge software or code generation for various numerical kernels
												
													Discrete Fourier transform, filters, other signal transforms (FFTW and SPIRAL)
													
													
													
													
													
													
													
													
													Linear algebra kernels (ATLAS)
													
													
													
													
													
													
													
													
													Higher linear algebra functionality (LAPACK)
													
													
													
													
													
													
													
													
													Sparse linear algebra (BeBOP)
													
													
													
													
													
													
													
													
													Other kernels
												
												
												
												
												
												
												
												
												 Goals of this Course
											Understand the connection between algorithms, implementation, and computer architecture
											
											
											
											
											
											
											
											
											Learn a guideline how to write fast numerical code and apply it in your research project
											Learn some fundamental numerical algorithms
											
											
											
											
											
											
											
											
											Learn how to analyze numerical algorithms
										
										
										
										
										
										
										
										
										 TextbookThere is no textbook for this class. The part that is foundation (algorithms, computer architecture etc.) will be compiled from several standard books. The core part, which analyzes cutting edge implementations for numerical problems is compiled from research papers and the instructor's own experience. Grading
											50% research project
												
													Topic: Very fast, ideally adaptive implementation of (or code generation for) a numerical problem
													
													
													
													
													
													
													
													
													Team up in pairs (preferably)
													
													
													
													
													
													
													
													
													End of January/early February: Suggest to me a problem or I give you one
													
													
													
													
													
													
													
													
													Show "milestones" during semester
													
													
													
													
													
													
													
													
													Write 4 page standard conference paper (template will be provided)
													
													
													
													
													
													
													
													
													Give short presentation last week of April
												
												
												
												
												
												
												
												
												15% midterm
												
													Mostly about algorithm analysis
													
													
													
													
													
													
													
													
													Some multiple choice
												
												
												
												
												
												
												
												
												25% homework
												
													Exercises on algorithms analysis
													
													
													
													
													
													
													
													
													Implementation exercises. Purpose: study the effect of program optimizations, compilers, special instructions, etc. Tasks: writing and submitting C code & creating runtime/performance plots
													
													
													
													
													
													
													
													
													Some templates will be provided
												
												
												
												
												
												
												
												
												10% class participation
												
													It is important to attend (many things I teach cannot be found in books)
													
													
													
													
													
													
													
													
													I encourage you to ask questions
													
													
													
													
													
													
													
													
													I will provide some anonymous feedback mechanism
												
												
												
												
												
												
												
												
												 Final ExamHomework
Midterm
Thursday, 03. March, 1:30pm. Research Project
											Template for 4 page paper:
												
													Everybody reads this: conference.pdf
													For latex use: conference-latex.tgz
														
															Creating bibliography: latex conference; bibtex conference; latex conference
															
															
															
															
															
															
															
															
															Creating a pdf: dvips -t letter -o conference.ps -Ppdf -G0 conference.dvi; ps2pdf conference.ps
														
														
														
														
														
														
														
														
														For Word (discouraged) use this: conference-word.doc
												Presentation
											
											
											
											
											
											
											
											
											
												
											Timeline:
												
													first version of paper due on April 20th (contains everything except some, but not all, experimental results and optimizations)
													
													
													
													
													
													
													
													
													presentations last week of April (exact time will be decided April 12th)
													
													
													
													
													
													
													
													
													final version of paper due: one week after your presentation
												Projects:
												
													Biometrics registration and identification (Woon Ho Jung), final paper
													Software radio (Bryan Chen and Vijay Chandrasekhar), final paper
													LU factorization (John Cole), final paper
													ATLAS for embedded VLIW (Roland Wunderlich), final paper
													Shortest path problem (Sungchul Han and Sukchan Kang), final paper
													Feature set computation for biomedical imaging (Tad Merryman and Eizan Miyamoto), final paper
													Power optimization for signal transforms (Peter Milder and Marek Telgarsky), final paper
													Vector ATLAS (Joohoon Lee and Dongkeun Lee), final paper
												 Lectures (including pdfs, paper links may need CMU IP)
											1. Lecture (11. Jan.): Technicalities, overview and motivation (slides)
											
											
											
											
											
											
											
											
											2. Lecture (13. Jan.): Problem, algorithm, complexity, asymptotic runtime analysis of divide-and-conquer algorithms (slides, notes)
											
											
											
											
											
											
											
											
											3. Lecture (18. Jan.): Cost analysis, solving recurrences (slides, mynotes)
											
											
											
											
											
											
											
											
											4. Lecture (20. Jan.): Overview architecture and microarchitecture (slides, mynotes)
											
											
											
											
											
											
											
											
											5. Lecture (25. Jan.): Guide to benchmarking, overview LAPACK and BLAS, Matrix-matrix multiplication (MMM), complexity, algorithms (slides, mynotes)
											
											
											
											
											
											
											
											
											6. Lecture (27. Jan.): The ATLAS code generator: MMM (slides, mynotes, paper)
											
											
											
											
											
											
											
											
											7. Lecture (01. Feb.): Recitation (assignment 1)
											
											
											
											
											
											
											
											
											8. Lecture (03. Feb.): Model-based ATLAS (slides, mynotes, paper)
											9. Lecture (08. Feb.): Dense and sparse matrix-vector multiplication, Bebop/Sparsity (slides including link to relevant paper, mynotes)
											
											
											
											
											
											
											
											
											10. Lecture (10. Feb.): Signal transforms and algorithms: Overview (slides)
											
											
											
											
											
											
											
											
											11. Lecture (15. Feb.): Discrete Fourier transform (DFT), interpretations of the DFT, structured matrices, Kronecker product (mynotes)
											
											
											
											
											
											
											
											
											12. Lecture (17. Feb.): Structured matrices: various interpretations, Cooley-Tukey FFT (mynotes)
											
											
											
											
											
											
											
											
											13. Lecture (22. Feb.): Good-Thomas, Rader, Bluestein FFT, complexity of the DFT, complex arithmetic (mynotes)
											
											
											
											
											
											
											
											
											14. Lecture (24. Feb.): Feedback 2nd assignment, adaptive FFT library FFTW (slides, mynotes)
											
											
											
											
											
											
											
											
											15. Lecture (01. Mar.): Cancelled. One-on-one meetings.
											
											
											
											
											
											
											
											
											16. Lecture (03. Mar.): Midterm.
											17. Lecture (15. Mar.): More on FFTW, the problem with strided data access (A tensor I) (slides, mynotes)
											
											
											
											
											
											
											
											
											18. Lecture (17. Mar.): Comparison ATLAS, Sparsity, FFTW, short guide to writing fast code, convolution (filtering), correlation (slides, mynotes, thesis-filter-wavelet)
											
											
											
											
											
											
											
											
											19. Lecture (22. Mar., Srinivas Chellappa): Feedback midterm
											
											
											
											
											
											
											
											
											20. Lecture (24. Mar.): Feedback 3rd assignment (slides)
											
											
											
											
											
											
											
											
											21./22. Lecture (29./31. Mar., Franz Franchetti): SIMD vector instructions: state of the art and history; comparison to original SIMD, vector computers, VLIW and superscalar processors; programming interfaces, SSE with Intel C++ compiler; adapting algorithms to SIMD: MMM, WHT, DFT; BlueGene/L supercomputer; guide to writing fast vector code (slides1, slides2, notes)
											
											
											
											
											
											
											
											23. Lecture (05. Apr.): Cancelled. One-on-one meetings.
										
											
											
											
											
											
											
											
											
											24./25. Lecture (07./12 Apr.): Spiral: Code generation for DSP transforms (slides, mynotes)
											
											
											
											
											
											
											
											
											26. Lecture (14. Apr.): Cancelled. Spring carnival.
											
											
											
											
											
											
											
											27. Lecture (19. Apr.): Spiral demo, Gauss elimination and LU factorization (slides, mynotes)
											
											
											
											
											
											
											
											28. Lecture (21. Apr.): LU factorization (cont'd), matrix inversion, determinant, small guide to giving presentations (slides)
											
											
											
											
											
											
											
											29./31. Lecture (26./28. Apr.): Research project presentations (final project papers above)
										
										
										
										
										
										
										
										 |