Parallel System Architecture and Programming

Language of Instruction: Czech
Completion: examination (written)
Guarantor: Dvořák Václav, prof. Ing., DrSc., DCSY
Lecturer: Dvořák Václav, prof. Ing., DrSc., DCSY
Instructor: Dvořák Václav, prof. Ing., DrSc., DCSY
Kašpárek Tomáš, Ing., CC
Ohlídal Miloš, Ing., DCSY
Faculty: Faculty of Information Technology BUT
Department: Department of Computer Systems FIT BUT
Substitute for:
Advanced Computer Architecture (ARP), DCSY
Practical Parallel Programming (PPP), DCSY
Learning objectives:
  To become familiar with the parallel systems available on the market, to be able to assess the communication and computing capabilities of a particular architecture, and to predict the performance of parallel applications. To get acquainted with the most important parallel programming tools (MPI, OpenMP), to learn their practical use, and to solve problems in parallel.
  The course covers the architecture and programming of parallel systems with functional and data parallelism. It first deals with the most widespread symmetric multiprocessors with a shared bus (SMP) and their programming in the OpenMP environment. It then treats interconnection networks, the basic structure of popular networks of workstations and other message-passing systems. Their programming with the standardized MPI interface is illustrated by case studies of parallel applications. Data-parallel systems and their programming form the last group of topics.
Knowledge and skills required for the course:
  Von Neumann computer architecture, computer memory hierarchy, cache memories and their organization, programming in assembly language and in C/C++.
Subject specific learning outcomes and competences:
  Overview of principles of parallel system design and of interconnection networks, communication techniques and algorithms. Survey of parallelization techniques of fundamental scientific problems, knowledge of parallel programming in MPI and OpenMP.
Generic learning outcomes and competences:
  Knowledge of the capabilities and limitations of parallel processing, ability to estimate the performance of parallel applications. Language constructs for process/thread communication and synchronization. Competence in hardware-software platforms for high-performance computing and simulations.
Syllabus of lectures:
  • Function- and data-parallelism, performance measures, overhead, speedup-limiting laws.
  • Bus-based shared memory multiprocessors. Bus saturation, memory organization.
  • Cache coherence, MSI and MESI protocols.
  • OpenMP, loop parallelization.
  • Synchronization in OpenMP, locks and barriers.
  • Performance of parallel applications.
  • Interconnection and switching networks, routing algorithms.
  • Flow control, router architecture.
  • Messaging, group communications and communication performance.
  • Message-passing programming (MPI).
  • Cluster computing with point-to-point and group communications.
  • DSM multiprocessors. 
  • Data-parallel systems and programming, HPF.
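
  As an illustration of the speedup-limiting laws named in the first lecture topic, the following minimal sketch evaluates Amdahl's law (fixed problem size) and Gustafson's law (problem size scaled with the processor count). The function names and parameter names are assumptions chosen for this example and are not taken from the course materials.

```python
def amdahl_speedup(serial_fraction: float, processors: int) -> float:
    """Speedup for a fixed problem size: 1 / (s + (1 - s) / p)."""
    if not 0.0 <= serial_fraction <= 1.0:
        raise ValueError("serial_fraction must lie in [0, 1]")
    if processors < 1:
        raise ValueError("processors must be at least 1")
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / processors)


def gustafson_speedup(serial_fraction: float, processors: int) -> float:
    """Scaled speedup: s + (1 - s) * p, with s measured on the parallel run."""
    if not 0.0 <= serial_fraction <= 1.0:
        raise ValueError("serial_fraction must lie in [0, 1]")
    if processors < 1:
        raise ValueError("processors must be at least 1")
    return serial_fraction + (1.0 - serial_fraction) * processors


if __name__ == "__main__":
    # Hypothetical workload with a 10 % serial part.
    for p in (2, 8, 64):
        print(p, amdahl_speedup(0.1, p), gustafson_speedup(0.1, p))
```

  Under Amdahl's law the speedup can never exceed 1/s however many processors are added, whereas the scaled speedup of Gustafson's law keeps growing linearly with the processor count.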
Syllabus of numerical exercises:
  • Model of SMP with coherent caches, data prefetch, false sharing, FFT parallelization.
  • Parallel sort on SMP.
  • Mid-term examination.
  • Group communications, numerical methods in MPI.
  • Parallelization of image processing tasks.
Syllabus - others, projects and individual work of students:
  • Parallel Fast Fourier Transform (FFT) on a symmetric multiprocessor in OpenMP.
  • Parallel bitonic sort - performance prediction for a given topology.
  • Parallel solution of a large system of linear equations or Matrix Multiply on a cluster of workstations or PCs, MPI.
  • Discrete optimization or Data mining - cluster of workstations.
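
  The bitonic sort named in the second project can be sketched sequentially as follows. This is only an illustrative outline under stated assumptions: the input length must be a power of two, the function name is hypothetical, and the code does not model any interconnection topology or communication cost. Each pass of the innermost loop corresponds to one stage of the sorting network, in which all compare-exchange operations are independent and could run in parallel.

```python
def bitonic_sort(values):
    """Return a sorted copy of `values` using the bitonic sorting network.

    Assumption: len(values) is a power of two (or zero).
    """
    a = list(values)
    n = len(a)
    if n & (n - 1):
        raise ValueError("length must be a power of two")

    k = 2  # size of the bitonic sequences being merged
    while k <= n:
        j = k // 2  # distance between compared elements
        while j >= 1:
            # One network stage: every pair (i, i ^ j) is independent.
            for i in range(n):
                partner = i ^ j
                if partner > i:
                    ascending = (i & k) == 0
                    out_of_order = (
                        a[i] > a[partner] if ascending else a[i] < a[partner]
                    )
                    if out_of_order:
                        a[i], a[partner] = a[partner], a[i]
            j //= 2
        k *= 2
    return a


if __name__ == "__main__":
    print(bitonic_sort([3, 7, 4, 8, 6, 2, 1, 5]))
```

  For n = 2^m elements the network has m(m + 1)/2 stages, which is the quantity a performance prediction would combine with the per-stage communication cost of the chosen topology.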
Fundamental literature:
  • Culler, D.E.: Parallel Computer Architecture - A Hardware / Software Approach. Morgan Kaufmann Publ., 1999, 1025 p., ISBN 1-55860-343-3.
  • Quinn, M.J.: Parallel Programming in C with MPI and OpenMP. McGraw Hill, 2004, 529 p., ISBN 0072822562.
  • Dally, W.J., Towles, B.: Principles and Practices of Interconnection Networks. Morgan Kaufmann Publ., 2004, 550 p., ISBN 0-12-200751-4.
Study literature:
  • Hennessy, J.L., Patterson, D.A.: Computer Architecture - A Quantitative Approach. 3rd edition, Morgan Kaufmann Publishers, Inc., 2003, 1136 p., ISBN 1-55860-596-7.
Progress assessment:
  Four small projects, 2 hours each; midterm examination.
Exam prerequisites:
  Obtaining at least 20 of the maximum 40 points during the term.