Exercises and examples of Chapter 3 in P. Arbenz and W. Petersen,
Introduction to Parallel Computing, Oxford Univ. Press, 2004.

EXERCISES:

 Exercise 3.1: variants of matrix-matrix multiply.
 Exercise 3.2: SSUM and SAXPY implemented using Intel SSE intrinsics. See the Level-1 routines in BLAS.
 Exercise 3.1 solution: variants of the matrix-matrix multiply problem.
 Exercise 3.2 solution: SSUM and SAXPY implemented using Intel SSE intrinsics. See the Level-1 routines in BLAS.
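
 As a rough illustration of what "variants of matrix-matrix multiply" can mean, the sketch below computes the same product with two loop orderings. It is not taken from the exercise files: the function names matmul_ijk and matmul_ikj, the row-major storage, and the restriction to square matrices are all assumptions made for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper names; row-major n-by-n storage is an assumption. */

/* ijk order: the inner loop is a dot product of row i of A and column j of B.
   B is read with stride n. */
void matmul_ijk(size_t n, const float *a, const float *b, float *c)
{
    assert(a != NULL && b != NULL && c != NULL);
    for (size_t i = 0; i < n; i++) {
        for (size_t j = 0; j < n; j++) {
            float sum = 0.0f;
            for (size_t k = 0; k < n; k++)
                sum += a[i * n + k] * b[k * n + j];
            c[i * n + j] = sum;
        }
    }
}

/* ikj order: the inner loop is a saxpy-like update of row i of C.
   Both B and C are read with unit stride. */
void matmul_ikj(size_t n, const float *a, const float *b, float *c)
{
    assert(a != NULL && b != NULL && c != NULL);
    for (size_t i = 0; i < n * n; i++)
        c[i] = 0.0f;
    for (size_t i = 0; i < n; i++) {
        for (size_t k = 0; k < n; k++) {
            float aik = a[i * n + k];
            for (size_t j = 0; j < n; j++)
                c[i * n + j] += aik * b[k * n + j];
        }
    }
}
```

 Both functions produce the same result up to floating-point rounding; only the memory access pattern of the inner loop differs, which is the property a timing comparison of such variants would expose.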

TEST PROGRAMS AND ROUTINES:

 Altivec FFT in-line: binary radix FFT using a workspace and Apple Altivec intrinsics. This version expands step in-line; otherwise it is similar to "genericfft.c" below (Section 3.6).
 Altivec dot product: unit stride sdot for Apple Altivec (Section 3.5.5). See the Level-1 routines in BLAS.
 Altivec isamax: unit stride isamax0 for Apple Altivec (Section 3.5.7). See the Level-1 routines in BLAS.
 Altivec FFT: binary radix FFT using a workspace and Apple Altivec intrinsics. It is similar to "genericfft.c" below (Section 3.6).
 Generic FFT: generic binary radix FFT using a workspace but no SSE or Altivec intrinsics (Section 3.6).
 Multiple tridiagonal: sub-procedure for solving tridiagonal systems with multiple right-hand sides via the simple one-step recurrence formula, after Forsythe and Moler (Section 3.5.2).
 SGEFA: tests variants of the simple parallel version of sgefa in Section 3.4.2. A README file in this gzipped tar file describes the variants.
 Rpoly: recursive doubling version of polynomial evaluation (Section 3.5).
 SSE FFT in-line: workspace version of the binary radix FFT with step in-lined. Same as genericfft.c above but using Intel SSE intrinsics (Section 3.6).
 SSE isamax: SSE example of isamax0, a unit stride isamax (Section 3.5.7). See the Level-1 routines in BLAS.
 SSE FFT: SSE version of the workspace version of the binary radix FFT. Same as genericfft.c above but using Intel SSE intrinsics (Section 3.6).
 Tridiagonal system tests: tests of tridiagonal system solvers, including timing comparisons for the simple recurrence method from Forsythe and Moler's book (Section 3.5.2).
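
 For orientation, a plain scalar reference for a unit stride isamax might look like the sketch below. It is not the Altivec or SSE code from the archives and uses no intrinsics. The name isamax0_ref is hypothetical, and two behaviors are assumptions of this example: the returned index is zero-based, and ties resolve to the first occurrence.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical scalar reference: index of the element of x[0..n-1] with the
   largest absolute value. Zero-based result and first-occurrence tie-breaking
   are assumptions of this sketch. */
size_t isamax0_ref(size_t n, const float *x)
{
    assert(n > 0 && x != NULL);
    size_t imax = 0;
    float big = x[0] < 0.0f ? -x[0] : x[0];
    for (size_t i = 1; i < n; i++) {
        float v = x[i] < 0.0f ? -x[i] : x[i];
        if (v > big) {          /* strict comparison keeps the first maximum */
            big = v;
            imax = i;
        }
    }
    return imax;
}
```

 A scalar routine of this kind is useful mainly as a correctness check: a vectorized version can be run on the same input and its result compared against it.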