Parallel System Architecture and Programming

Language of Instruction: Czech
Completion: credit + exam (written)
Type of
Guarantor: Dvořák Václav, prof. Ing., DrSc. (DCSY)
Lecturer: Dvořák Václav, prof. Ing., DrSc. (DCSY)
Instructor: Dvořák Václav, prof. Ing., DrSc. (DCSY)
Jaroš Jiří, doc. Ing., Ph.D. (DCSY)
Kašpárek Tomáš, Ing. (CC)
Faculty: Faculty of Information Technology BUT
Department: Department of Computer Systems FIT BUT
Substitute for:
Advanced Computer Architecture (ARP), DCSY
Practical Parallel Programming (PPP), DCSY
Learning objectives:
  To gain an overview of the parallel systems on the market, to be able to assess the communication and computing capabilities of a particular architecture, and to predict the performance of parallel applications. To become acquainted with the most important parallel programming tools (MPI, OpenMP), to learn their practical use, and to solve problems in parallel.
  The course covers the architecture and programming of parallel systems with functional and data parallelism. First, parallel system theory and program parallelization are discussed. A description of the most widespread multi-core and multiprocessor symmetric multiprocessors (SMP) follows, together with their programming in the OpenMP environment. The course continues with interconnection networks, the basic structure underlying popular networks of workstations and other message-passing systems. Programming of these systems through the standardized interfaces MPI and PVM is illustrated with case studies of parallel applications. Finally, advanced DSM NUMA systems are described.
Knowledge and skills required for the course:
  Von Neumann computer architecture, computer memory hierarchy, cache memories and their organization, programming in assembly language and in C/C++.
Subject specific learning outcomes and competencies:
  Overview of the principles of parallel system design and of interconnection networks, communication techniques and algorithms. Survey of parallelization techniques for fundamental scientific problems; knowledge of parallel programming in MPI and OpenMP.
Generic learning outcomes and competencies:
  Knowledge of capabilities and limitations of parallel processing, ability to estimate performance of parallel applications. Language means for process/thread communication and synchronization. Competence in hardware-software platforms for high-performance computing and simulations.
Syllabus of lectures:
  • Function- and data-parallelism, performance measures, overhead, speedup-limiting laws.
  • Program parallelization, decomposition, task scheduling.
  • Shared memory multiprocessors. Bus saturation, crossbar, arbiters, memory organization.
  • Cache coherence, MSI and MESI protocols. Memory consistency models. 
  • OpenMP, loop parallelization.
  • Synchronization in OpenMP, locks and barriers.
  • Performance oriented parallel programming.
  • Interconnection and switching networks, routing algorithms.
  • Flow control, router architecture.
  • Messaging, collective communications and communication performance. 
  • Message-passing programming (MPI). 
  • Cluster computing with pairwise and group communication.  
  • Distributed (shared) memory NUMA architectures. 
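
The "speedup-limiting laws" in the first lecture topic can be illustrated with Amdahl's law. The following is only a minimal sketch in C, not course material: the function names `amdahl_speedup` and `efficiency` are hypothetical, and the model assumes that a fraction p of the work parallelizes perfectly over n processors with no communication overhead.

```c
/* Amdahl's law: upper bound on speedup when a fraction p of the work
 * (0 <= p <= 1) runs perfectly in parallel on n processors and the
 * remaining (1 - p) stays serial. Overhead is ignored (assumption). */
double amdahl_speedup(double p, int n)
{
    return 1.0 / ((1.0 - p) + p / (double)n);
}

/* Parallel efficiency: achieved speedup divided by processor count. */
double efficiency(double p, int n)
{
    return amdahl_speedup(p, n) / (double)n;
}
```

For example, with p = 0.9 and n = 10 the bound is 1 / 0.19, roughly 5.26, and no number of processors can push it past 1 / (1 - p) = 10.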
Syllabus of numerical exercises:
 Tutorials are not scheduled for this course.
Syllabus - others, projects and individual work of students:
  • N-body (particle) problem, performance prediction on a cluster.
  • Evolution of a thermal field by means of Jacobi iterations on an SMP in OpenMP.
  • Evolutionary algorithms on a blade center.
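
The thermal-field project can be pictured with a short Jacobi sweep in C with OpenMP. This is a sketch under stated assumptions, not the assignment's reference solution: the names `jacobi_sweep` and `jacobi_solve`, the square row-major grid, the fixed boundary cells and the four-point averaging stencil are all illustrative choices. The `reduction(max:...)` clause needs OpenMP 3.1 or later; without OpenMP enabled the pragma is ignored and the code runs serially.

```c
/* One Jacobi sweep on an n x n grid stored row-major.
 * Interior cells of `next` become the average of the four neighbours
 * in `cur`; boundary cells are not touched (assumed fixed).
 * Returns the largest absolute change of any interior cell. */
double jacobi_sweep(const double *cur, double *next, int n)
{
    double max_diff = 0.0;

    /* Rows are independent, so the outer loop is shared among threads. */
    #pragma omp parallel for reduction(max:max_diff)
    for (int i = 1; i < n - 1; i++) {
        for (int j = 1; j < n - 1; j++) {
            double v = 0.25 * (cur[(i - 1) * n + j] + cur[(i + 1) * n + j]
                             + cur[i * n + j - 1] + cur[i * n + j + 1]);
            double d = v > cur[i * n + j] ? v - cur[i * n + j]
                                          : cur[i * n + j] - v;
            if (d > max_diff)
                max_diff = d;
            next[i * n + j] = v;
        }
    }
    return max_diff;
}

/* Repeats sweeps until the largest change drops below tol or max_iters
 * sweeps have run. Both buffers must hold the same boundary values on
 * entry. Returns the buffer that holds the latest iterate. */
double *jacobi_solve(double *a, double *b, int n, double tol, int max_iters)
{
    double *cur = a, *next = b;
    for (int it = 0; it < max_iters; it++) {
        double diff = jacobi_sweep(cur, next, n);
        double *tmp = cur;   /* swap roles of the two buffers */
        cur = next;
        next = tmp;
        if (diff < tol)
            break;
    }
    return cur;
}
```

Two buffers are used because a Jacobi update reads only values from the previous iteration, which is what makes the loop safe to parallelize without locks.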
Fundamental literature:
  • Culler, D.E.: Parallel Computer Architecture - A Hardware / Software Approach. Morgan Kaufmann Publ., 1999, 1025 p., ISBN 1-55860-343-3.
  • Quinn, M.J.: Parallel Programming in C with MPI and OpenMP. McGraw Hill, 2004, 529 p., ISBN 0072822562.
  • Dally, W.J., Towles, B.: Principles and Practices of Interconnection Networks. Morgan Kaufmann Publ., 2004, 550 p., ISBN 0-12-200751-4.
Study literature:
  • Hennessy, J.L., Patterson, D.A.: Computer Architecture - A Quantitative Approach. 4th edition, Morgan Kaufmann Publishers, Inc., 2007, 1136 p., ISBN 1-55860-596-7.
Progress assessment:
  Three small projects lasting 5, 4 and 4 hours; midterm examination.
Exam prerequisites:
  To complete the semester work successfully and be admitted to the examination, a student must earn at least 20 points out of a maximum of 40.
