Mixed Precision s-step Conjugate Gradient with Residual Replacement on GPUs
Identifikátory výsledku
Kód výsledku v IS VaVaI
<a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F00216208%3A11320%2F22%3A10452867" target="_blank" >RIV/00216208:11320/22:10452867 - isvavai.cz</a>
Výsledek na webu
<a href="https://doi.org/10.1109/IPDPS53621.2022.00091" target="_blank" >https://doi.org/10.1109/IPDPS53621.2022.00091</a>
DOI - Digital Object Identifier
<a href="http://dx.doi.org/10.1109/IPDPS53621.2022.00091" target="_blank" >10.1109/IPDPS53621.2022.00091</a>
Alternativní jazyky
Jazyk výsledku
angličtina
Název v původním jazyce
Mixed Precision s-step Conjugate Gradient with Residual Replacement on GPUs
Popis výsledku v původním jazyce
The s-step Conjugate Gradient (CG) algorithm has the potential to reduce the communication cost of standard CG by a factor of s. However, though mathematically equivalent, s-step CG may be numerically less stable compared to standard CG in finite precision, exhibiting slower convergence and decreased attainable accuracy. This limits the use of s-step CG in practice. To improve the numerical behavior of s-step CG and overcome this potential limitation, we incorporate two techniques. First, we improve convergence behavior through the use of higher precision at critical parts of the s-step iteration and second, we integrate a residual replacement strategy into the resulting mixed precision s-step CG to improve attainable accuracy. Our experimental results on the Summit Supercomputer demonstrate that when the higher precision is implemented in hardware, these techniques have virtually no overhead on the iteration time while improving both the convergence rate and the attainable accuracy of s-step CG. Even when the higher precision is implemented in software, these techniques may still reduce the time-to-solution (speedups of up to 1.8times in our experiments), especially when s-step CG suffers from numerical instability with a small step size and the latency cost becomes a significant part of its iteration time.
Název v anglickém jazyce
Mixed Precision s-step Conjugate Gradient with Residual Replacement on GPUs
Popis výsledku anglicky
The s-step Conjugate Gradient (CG) algorithm has the potential to reduce the communication cost of standard CG by a factor of s. However, though mathematically equivalent, s-step CG may be numerically less stable compared to standard CG in finite precision, exhibiting slower convergence and decreased attainable accuracy. This limits the use of s-step CG in practice. To improve the numerical behavior of s-step CG and overcome this potential limitation, we incorporate two techniques. First, we improve convergence behavior through the use of higher precision at critical parts of the s-step iteration and second, we integrate a residual replacement strategy into the resulting mixed precision s-step CG to improve attainable accuracy. Our experimental results on the Summit Supercomputer demonstrate that when the higher precision is implemented in hardware, these techniques have virtually no overhead on the iteration time while improving both the convergence rate and the attainable accuracy of s-step CG. Even when the higher precision is implemented in software, these techniques may still reduce the time-to-solution (speedups of up to 1.8times in our experiments), especially when s-step CG suffers from numerical instability with a small step size and the latency cost becomes a significant part of its iteration time.
Klasifikace
Druh
D - Stať ve sborníku
CEP obor
—
OECD FORD obor
10102 - Applied mathematics
Návaznosti výsledku
Projekt
—
Návaznosti
I - Institucionalni podpora na dlouhodoby koncepcni rozvoj vyzkumne organizace
Ostatní
Rok uplatnění
2022
Kód důvěrnosti údajů
S - Úplné a pravdivé údaje o projektu nepodléhají ochraně podle zvláštních právních předpisů
Údaje specifické pro druh výsledku
Název statě ve sborníku
Proceedings - 2022 IEEE 36th International Parallel and Distributed Processing Symposium, IPDPS 2022
ISBN
978-1-66548-106-9
ISSN
1530-2075
e-ISSN
—
Počet stran výsledku
11
Strana od-do
886-896
Název nakladatele
IEEE
Místo vydání
New York
Místo konání akce
Ecole Normale Supérieure de Lyon
Datum konání akce
30. 5. 2022
Typ akce podle státní příslušnosti
WRD - Celosvětová akce
Kód UT WoS článku
000854096200083