| Polok, L., Smrž, P.: Fast Linear Algebra on GPU, In: IEEE conference proceedings, Liverpool, GB, IEEE CS, 2012, p. 6, ISBN 978-0-7695-4749-7 |
|---|
| Publication language: | English |
|---|
| Original title: | Fast Linear Algebra on GPU |
|---|
| Title (cs): | Rychlá lineární algebra na GPU |
|---|
| Pages: | 6 |
|---|
| Proceedings: | IEEE conference proceedings |
|---|
| Conference: | The 14th IEEE International Conference on High Performance Computing and Communications |
|---|
| Place: | Liverpool, GB |
|---|
| Year: | 2012 |
|---|
| ISBN: | 978-0-7695-4749-7 |
|---|
| Publisher: | IEEE Computer Society |
|---|
| Keywords |
|---|
| GPU; parallel reduction; linear algebra; BLAS; OpenCL; CUDA |
| Annotation |
|---|
| GPUs have been used successfully to accelerate many mathematical functions
and libraries. A common limitation of those libraries is the minimum size of
the primitives they must handle in order to achieve a significant speedup over
their CPU versions. This minimum size requirement can prove prohibitive for
many applications. It can be relaxed by batching operations so that there is a
sufficient amount of data to perform the calculation with maximum efficiency on
the GPU. This paper describes a fast OpenCL implementation of two basic vector
functions: vector reduction and vector scaling. Its performance is analyzed by
running benchmarks on two of the most common GPU architectures in use, Tesla
and Fermi from NVIDIA. The reported experimental results show that our
implementation significantly outperforms CUBLAS, the current state-of-the-art
GPU-based basic linear algebra library. |
| BibTeX: |
|---|
@INPROCEEDINGS{
author = {Lukáš Polok and Pavel Smrž},
title = {Fast Linear Algebra on GPU},
pages = {6},
booktitle = {IEEE conference proceedings},
year = {2012},
location = {Liverpool, GB},
publisher = {IEEE Computer Society},
ISBN = {978-0-7695-4749-7},
language = {english},
url = {http://www.fit.vutbr.cz/research/view_pub.php?id=10039}
} |
|
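
The two vector functions named in the annotation, reduction and scaling, and the idea of batching many small vectors into one call can be illustrated with a minimal CPU-side sketch. This is not the authors' OpenCL code: the function names `tree_reduce`, `batched_reduce` and `batched_scale` are hypothetical, and the pairwise (tree) summation is only an assumption about how a parallel reduction is commonly organized within a work-group.

```python
def tree_reduce(values):
    """Sum one vector by pairwise (tree) reduction.

    Assumption: this mimics the common GPU pattern in which the active
    range is halved at every step; it is not taken from the paper.
    """
    buf = [float(v) for v in values]
    if not buf:
        return 0.0
    # Pad with zeros to the next power of two so every step pairs up cleanly.
    size = 1
    while size < len(buf):
        size *= 2
    buf.extend([0.0] * (size - len(buf)))
    stride = size // 2
    while stride > 0:
        # On a GPU, the iterations of this inner loop would run in parallel.
        for i in range(stride):
            buf[i] += buf[i + stride]
        stride //= 2
    return buf[0]


def batched_reduce(vectors):
    """Reduce many small vectors in one call (hypothetical batching helper)."""
    return [tree_reduce(v) for v in vectors]


def batched_scale(vectors, factors):
    """Scale each vector by its own factor (hypothetical batching helper)."""
    return [[f * x for x in v] for v, f in zip(vectors, factors)]


if __name__ == "__main__":
    batch = [[1, 2, 3], [4, 5, 6, 7, 8]]
    print(batched_reduce(batch))
    print(batched_scale(batch, [2.0, 0.5]))
```

On a GPU, each small vector in the batch would typically map to its own work-group, so a single kernel launch carries enough data to keep the device busy. The sketch only mirrors that structure sequentially.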