Conference paper

POLOK Lukáš, ILA Viorela S. and SMRŽ Pavel. Cache Efficient Implementation for Block Matrix Operations. In: Proceedings of the 21st High Performance Computing Symposium (HPC'13). San Diego: Association for Computing Machinery, 2013, pp. 698-706. ISBN 1-56555-350-0. Available from: http://dl.acm.org/citation.cfm?id=2499972
Publication language: English
Original title: Cache Efficient Implementation for Block Matrix Operations
Title (cs): Paměťově efektivní implementace operací pro blokové matice
Pages: 698-706
Proceedings: Proceedings of the 21st High Performance Computing Symposium (HPC'13)
Conference: 21st High Performance Computing Symposium
Place: San Diego, US
Year: 2013
URL: http://dl.acm.org/citation.cfm?id=2499972
ISBN: 1-56555-350-0
Publisher: Association for Computing Machinery
Files: hpc13_final_with_link.pdf (220 KB, last modified 2013-04-15 15:17:31)
Keywords
block matrix, high performance, sparse BLAS, nonlinear least squares
Annotation

Efficient manipulation of, and arithmetic on, block matrices benefits many applications, notably those that solve nonlinear systems iteratively. Such problems require repeatedly assembling and solving sparse linear systems. For very large systems, solving can become very time consuming unless the corresponding matrices are handled carefully.

This paper proposes a memory storage scheme that is convenient for both numeric and structural matrix modification while still allowing efficient arithmetic operations. The scheme was used to implement a simple BLAS-like library. Its advantage is demonstrated through exhaustive tests on the popular University of Florida Sparse Matrix Collection. Furthermore, the library was used to solve several nonlinear graph optimization problems.

Abstract
This paper deals with the efficient implementation of operations on sparse block matrices on the CPU, through an efficient design of the structures used to store matrices in memory and through aggressive optimization using the SSE, AltiVec, or NEON instruction sets.

Very good results are achieved, both with the implementation itself and with its use in solving robotics problems of the nonlinear least squares type.

BibTeX:
@INPROCEEDINGS{
   author = {Luk{\'{a}}{\v{s}} Polok and S. Viorela Ila and Pavel
	Smr{\v{z}}},
   title = {Cache Efficient Implementation for Block Matrix Operations},
   pages = {698--706},
   booktitle = {Proceedings of the 21st High Performance Computing Symposium
	(HPC'13)},
   year = {2013},
   location = {San Diego, US},
   publisher = {Association for Computing Machinery},
   ISBN = {1-56555-350-0},
   language = {english},
   url = {http://www.fit.vutbr.cz/research/view_pub.php.en.iso-8859-2?id=10265}
}
