Title: Parallel Computations on GPU

Code: PCG
Ac.Year: 2019/2020
Sem: Winter
Curriculums:
  Programme  Field/Specialization  Duty
  MITAI      NADE                  Elective
  MITAI      NBIO                  Elective
  MITAI      NCPS                  Elective
  MITAI      NEMB                  Elective
  MITAI      NGRI                  Elective
  MITAI      NHPC                  Compulsory
  MITAI      NIDE                  Elective
  MITAI      NISD                  Elective
  MITAI      NISY                  Elective
  MITAI      NMAL                  Elective
  MITAI      NMAT                  Elective
  MITAI      NNET                  Elective
  MITAI      NSEC                  Elective
  MITAI      NSEN                  Elective
  MITAI      NSPE                  Elective
  MITAI      NVER                  Elective
  MITAI      NVIZ                  Elective
Language of Instruction: Czech
Credits: 5
Completion: examination (written)
Type of instruction:
  Hours/sem: Lectures 26, Seminar Exercises 0, Laboratory Exercises 0, Computer Exercises 12, Other 14
  Points:    Exams 60, Tests 15, Exercises 0, Laboratories 0, Other 25
Guarantor:Jaroš Jiří, doc. Ing., Ph.D. (DCSY)
Deputy guarantor:Vašíček Zdeněk, doc. Ing., Ph.D. (DCSY)
Lecturer:Jaroš Jiří, doc. Ing., Ph.D. (DCSY)
Instructor:Kadlubiak Kristián, Ing. (DCSY)
Faculty:Faculty of Information Technology BUT
Department:Department of Computer Systems FIT BUT
Schedule:
  Day  Lesson     Week      Room  Start  End    Lect.Gr.    Groups
  Mon  lecture    lectures  G202  16:00  17:50  1MIT 2MIT   xx
  Wed  comp. lab  lectures  O204  08:00  09:50  1MIT 2MIT   xx
  Wed  comp. lab  lectures  O204  10:00  11:50  1MIT 2MIT   xx
  Wed  comp. lab  lectures  O204  12:00  13:50  1MIT 2MIT   xx
 
Learning objectives:
  To familiarize yourself with the architecture and programming of graphics processing units for general-purpose computing using the NVIDIA libraries and the OpenACC standard. To learn how to design and implement accelerated programs exploiting the potential of GPUs. To gain knowledge of the available libraries for programming GPUs.
Description:
  The course covers the architecture and programming of graphics processing units from NVIDIA and, partially, AMD. First, the architecture of GPUs is studied in detail. Then, the program execution model based on hierarchical thread organisation and the SIMT model is discussed. Next, the memory hierarchy and synchronization techniques are described. After that, the course explains the novel techniques of dynamic parallelism and data-flow processing, concluded by the practical usage of multi-GPU systems in environments with shared (NVLink) and distributed (MPI) memory. The second part of the course is devoted to high-level programming techniques and libraries based on the OpenACC technology.
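  As a purely illustrative sketch of the execution model described above (not part of the official course materials; all names are example choices), the following CUDA C++ program launches a kernel over a grid of thread blocks and performs the explicit host-device memory transfers:

    #include <cstdio>
    #include <cuda_runtime.h>

    // Each thread computes one element; blockIdx/blockDim/threadIdx realise
    // the hierarchical thread organisation described above.
    __global__ void vectorAdd(const float *a, const float *b, float *c, int n)
    {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i < n)                                 // guard the last, partially filled block
            c[i] = a[i] + b[i];
    }

    int main()
    {
        const int n = 1 << 20;
        const size_t bytes = n * sizeof(float);

        float *hA = new float[n], *hB = new float[n], *hC = new float[n];
        for (int i = 0; i < n; i++) { hA[i] = 1.0f; hB[i] = 2.0f; }

        float *dA, *dB, *dC;                       // device buffers
        cudaMalloc(&dA, bytes); cudaMalloc(&dB, bytes); cudaMalloc(&dC, bytes);
        cudaMemcpy(dA, hA, bytes, cudaMemcpyHostToDevice);
        cudaMemcpy(dB, hB, bytes, cudaMemcpyHostToDevice);

        const int block = 256;                     // threads per block
        const int grid  = (n + block - 1) / block; // blocks per grid
        vectorAdd<<<grid, block>>>(dA, dB, dC, n);

        cudaMemcpy(hC, dC, bytes, cudaMemcpyDeviceToHost);
        printf("c[0] = %f\n", hC[0]);              // expected 3.0

        cudaFree(dA); cudaFree(dB); cudaFree(dC);
        delete[] hA; delete[] hB; delete[] hC;
        return 0;
    }

  Compiled, for example, with nvcc -o vector_add vector_add.cu.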
Knowledge and skills required for the course:
  Knowledge gained in the AVS course and, partially, in the PRL and PPP courses.
Subject specific learning outcomes and competencies:
  Knowledge of parallel programming on GPUs for general-purpose computing; orientation in the area of accelerated systems, libraries and tools.
Generic learning outcomes and competencies:
  Understanding of the hardware limitations that impact the efficiency of software solutions.
Why is the course taught:
  The future of computing systems, ranging from ordinary PCs up to top supercomputers, is seen in heterogeneous systems where the sequential parts and the control logic are processed by CPUs while the computationally intensive parts are offloaded to accelerators, in this case GPUs. This course will teach you the architecture of graphics processing units and the software libraries for programming them in the area of general-purpose computations.
Syllabus of lectures:
 
  1. Architecture of graphics processing units.
  2. CUDA programming model, thread execution.
  3. CUDA memory hierarchy.
  4. Synchronization and reduction (a brief kernel sketch follows this list).
  5. Dynamic parallelism and unified memory.
  6. Design and optimization of GPU algorithms.
  7. Stream processing, computation-communication overlapping.
  8. Multi-GPU systems.
  9. Nvidia Thrust library.
  10. OpenACC basics.
  11. OpenACC memory management.
  12. Code optimization with OpenACC.
  13. Libraries and tools for GPU programming.
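  A minimal sketch of the kind of kernel treated in lectures 3 and 4 (illustrative only; all names are example choices): a block-wide sum reduction in shared memory with barrier synchronisation, assuming the block size is a power of two.

    // Each block reduces blockDim.x input elements into one partial sum.
    __global__ void blockSum(const float *in, float *partial, int n)
    {
        extern __shared__ float sdata[];           // dynamically sized shared memory
        int tid = threadIdx.x;
        int i   = blockIdx.x * blockDim.x + tid;

        sdata[tid] = (i < n) ? in[i] : 0.0f;       // stage the input in shared memory
        __syncthreads();                           // all loads must complete first

        // Tree reduction: halve the number of active threads in each step.
        for (int s = blockDim.x / 2; s > 0; s >>= 1) {
            if (tid < s)
                sdata[tid] += sdata[tid + s];
            __syncthreads();                       // barrier before the next step
        }

        if (tid == 0)                              // thread 0 holds the block's result
            partial[blockIdx.x] = sdata[0];
    }

    // Example launch with 256 threads per block and matching shared memory size:
    // blockSum<<<grid, 256, 256 * sizeof(float)>>>(dIn, dPartial, n);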
Syllabus of computer exercises:
 
  1. CUDA: Memory transfers, simple kernels
  2. CUDA: Shared memory
  3. CUDA: Texture and constant memory
  4. CUDA: Dynamic parallelism and unified memory.
  5. OpenACC: basic techniques (a brief sketch follows this list).
  6. OpenACC: advanced techniques.
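  A minimal sketch in the spirit of exercise 5 (illustrative only; all names are example choices): a SAXPY loop offloaded with OpenACC directives, with the data clauses handling the host-device transfers.

    #include <cstdio>
    #include <vector>

    // y = a*x + y, offloaded to the accelerator by the parallel loop directive.
    void saxpy(int n, float a, const float *x, float *y)
    {
        #pragma acc parallel loop copyin(x[0:n]) copy(y[0:n])
        for (int i = 0; i < n; i++)
            y[i] = a * x[i] + y[i];
    }

    int main()
    {
        const int n = 1 << 20;
        std::vector<float> x(n, 1.0f), y(n, 2.0f);
        saxpy(n, 3.0f, x.data(), y.data());
        printf("y[0] = %f\n", y[0]);   // expected 5.0
        return 0;
    }

  Compiled, for example, with nvc++ -acc saxpy.cpp from the NVIDIA HPC SDK.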
Syllabus - others, projects and individual work of students:
 
  • Development of an application in Nvidia CUDA
  • Development of an application in OpenACC
Fundamental literature:
 
  • Kirk, D., and Hwu, W.: Programming Massively Parallel Processors: A Hands-on Approach. Elsevier, 2010, 256 p. ISBN 978-0-12-381472-2.
  • Sanders, J., and Kandrot, E.: CUDA by Example: An Introduction to General-Purpose GPU Programming. Addison-Wesley, 2010.
  • Storti, D., and Yurtoglu, M.: CUDA for Engineers: An Introduction to High-Performance Parallel Computing. Addison-Wesley Professional, 1st edition, 2015. ISBN 978-0134177410.
  • Chandrasekaran, S., and Juckeland, G.: OpenACC for Programmers: Concepts and Strategies. Addison-Wesley Professional, 2017. ISBN 978-0134694283.
Study literature:
 
Controlled instruction:
  
  • Missed labs can be substituted on alternative dates.
  • A substitute slot for missed labs will be available in the last week of the semester.
Progress assessment:
  Assessment of two projects (14 hours in total), computer laboratories, and a midterm examination.
Exam prerequisites:
  To obtain at least 20 of the 40 points awarded for the projects and the midterm examination.
 
