Performance analysis and autotuning setup of the cuFFT library
The result's identifiers
Result code in IS VaVaI
<a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F00216224%3A14610%2F18%3A00106596" target="_blank" >RIV/00216224:14610/18:00106596 - isvavai.cz</a>
Result on the web
<a href="https://dl.acm.org/citation.cfm?id=3295817" target="_blank" >https://dl.acm.org/citation.cfm?id=3295817</a>
DOI - Digital Object Identifier
<a href="http://dx.doi.org/10.1145/3295816.3295817" target="_blank" >10.1145/3295816.3295817</a>
Alternative languages
Result language
English
Original language name
Performance analysis and autotuning setup of the cuFFT library
Original language description
Fast Fourier transform (FFT) has many applications. It is often one of the most computationally demanding kernels, so a lot of attention has been invested into tuning its performance on various hardware devices. However, FFT libraries usually have many possible settings, and it is not always easy to deduce which settings should be used for optimal performance. In practice, we can often slightly modify the FFT settings; for example, we can pad or crop input data. Surprisingly, a majority of state-of-the-art papers focus on the question of how to implement FFT under given settings but do not pay much attention to the question of which settings result in the fastest computation. In this paper, we target a popular implementation of FFT for GPU accelerators, the cuFFT library. We analyze the behavior and the performance of the cuFFT library with respect to input sizes and plan settings. We also present a new tool, cuFFTAdvisor, which proposes and, by means of autotuning, finds the best configuration of the library for given constraints of input size and plan settings. We experimentally show that our tool is able to propose different settings of the transformation, resulting in an average 6x speedup using fast heuristics and 6.9x speedup using autotuning.
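The padding and cropping idea mentioned in the description can be illustrated with a minimal sketch. It assumes that transform sizes whose only prime factors are 2, 3, 5, and 7 tend to be handled efficiently, and it searches for the nearest such size above (padding) or below (cropping) a requested size. The function names are hypothetical, and this is not the heuristic or autotuning procedure used by cuFFTAdvisor; it only shows what "slightly modifying the input size" can mean in practice.

```python
def is_smooth(n, primes=(2, 3, 5, 7)):
    """Return True if n >= 1 has no prime factor outside `primes`."""
    if n < 1:
        return False
    for p in primes:
        # Divide out every occurrence of the allowed prime p.
        while n % p == 0:
            n //= p
    # Anything left over is a prime factor that is not allowed.
    return n == 1


def next_smooth_size(n, primes=(2, 3, 5, 7)):
    """Smallest size >= n with only the allowed prime factors (padding)."""
    if n < 1:
        raise ValueError("size must be positive")
    while not is_smooth(n, primes):
        n += 1
    return n


def prev_smooth_size(n, primes=(2, 3, 5, 7)):
    """Largest size <= n with only the allowed prime factors (cropping)."""
    if n < 1:
        raise ValueError("size must be positive")
    # The loop always terminates because 1 counts as smooth.
    while not is_smooth(n, primes):
        n -= 1
    return n


if __name__ == "__main__":
    requested = 1021  # a prime size, chosen here only as an example
    print("pad to: ", next_smooth_size(requested))
    print("crop to:", prev_smooth_size(requested))
```

Whether padding or cropping is acceptable depends on the application, and which candidate is actually fastest still has to be measured on the target GPU, which is the role the description assigns to autotuning.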
Czech name
—
Czech description
—
Classification
Type
D - Article in proceedings
CEP classification
—
OECD FORD branch
10201 - Computer sciences, information science, bioinformatics (hardware development to be 2.2, social aspect to be 5.8)
Result continuities
Project
<a href="/en/project/EF16_013%2F0001802" target="_blank" >EF16_013/0001802: CERIT Scientific Cloud</a>
Continuities
P - Research and development project financed from public funds (with a link to CEP)
Others
Publication year
2018
Confidentiality
S - Complete and truthful data on the project are not subject to protection under special legal regulations
Data specific for result type
Article name in the collection
ACM International Conference Proceeding Series
ISBN
9781450365918
ISSN
—
e-ISSN
—
Number of pages
6
Pages from-to
—
Publisher name
ACM
Place of publication
Limassol, Cyprus
Event location
Limassol, Cyprus
Event date
Jan 1, 2018
Type of event by nationality
WRD - Worldwide event
UT code for WoS article
000471021400001