All

What are you looking for?

All
Projects
Results
Organizations

Quick search

  • Projects supported by TA ČR
  • Excellent projects
  • Projects with the highest public support
  • Current projects

Smart search

  • That is how I find a specific +word
  • That is how I leave the -word out of the results
  • “That is how I can find the whole phrase”

Using hardware performance counters to speed up autotuning convergence on GPUs

The result's identifiers

  • Result code in IS VaVaI

    <a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F00216224%3A14610%2F22%3A00125022" target="_blank" >RIV/00216224:14610/22:00125022 - isvavai.cz</a>

  • Result on the web

    <a href="https://www.sciencedirect.com/science/article/pii/S0743731521001945?via%3Dihub" target="_blank" >https://www.sciencedirect.com/science/article/pii/S0743731521001945?via%3Dihub</a>

  • DOI - Digital Object Identifier

    <a href="http://dx.doi.org/10.1016/j.jpdc.2021.10.003" target="_blank" >10.1016/j.jpdc.2021.10.003</a>

Alternative languages

  • Result language

    angličtina

  • Original language name

    Using hardware performance counters to speed up autotuning convergence on GPUs

  • Original language description

    Nowadays, GPU accelerators are commonly used to speed up general-purpose computing tasks on a variety of hardware. However, due to the diversity of GPU architectures and processed data, optimization of codes for a particular type of hardware and specific data characteristics can be extremely challenging. The autotuning of performance-relevant source-code parameters allows for automatic optimization of applications and keeps their performance portable. Although the autotuning process typically results in code speed-up, searching the tuning space can bring unacceptable overhead if (i) the tuning space is vast and full of poorly-performing implementations, or (ii) the autotuning process has to be repeated frequently because of changes in processed data or migration to different hardware. In this paper, we introduce a novel method for searching generic tuning spaces. The tuning spaces can contain tuning parameters changing any user-defined property of the source code. The method takes advantage of collecting hardware performance counters (also known as profiling counters) during empirical tuning. Those counters are used to navigate the searching process towards faster implementations. The method requires the tuning space to be sampled on any GPU. It builds a problem-specific model, which can be used during autotuning on various, even previously unseen inputs or GPUs. Using a set of five benchmarks, we experimentally demonstrate that our method can speed up autotuning when an application needs to be ported to different hardware or when it needs to process data with different characteristics. We also compared our method to state of the art and show that our method is superior in terms of the number of searching steps and typically outperforms other searches in terms of convergence time.

  • Czech name

  • Czech description

Classification

  • Type

    J<sub>imp</sub> - Article in a specialist periodical, which is included in the Web of Science database

  • CEP classification

  • OECD FORD branch

    10201 - Computer sciences, information science, bioinformathics (hardware development to be 2.2, social aspect to be 5.8)

Result continuities

  • Project

    <a href="/en/project/EF16_013%2F0001802" target="_blank" >EF16_013/0001802: CERIT Scientific Cloud</a><br>

  • Continuities

    P - Projekt vyzkumu a vyvoje financovany z verejnych zdroju (s odkazem do CEP)

Others

  • Publication year

    2022

  • Confidentiality

    S - Úplné a pravdivé údaje o projektu nepodléhají ochraně podle zvláštních právních předpisů

Data specific for result type

  • Name of the periodical

    Journal of Parallel and Distributed Computing

  • ISSN

    0743-7315

  • e-ISSN

  • Volume of the periodical

    160

  • Issue of the periodical within the volume

    February

  • Country of publishing house

    NL - THE KINGDOM OF THE NETHERLANDS

  • Number of pages

    20

  • Pages from-to

    16-35

  • UT code for WoS article

    000711621300002

  • EID of the result in the Scopus database

    2-s2.0-85117699383