All

What are you looking for?

All
Projects
Results
Organizations

Quick search

  • Projects supported by TA ČR
  • Excellent projects
  • Projects with the highest public support
  • Current projects

Smart search

  • That is how I find a specific +word
  • That is how I leave the -word out of the results
  • “That is how I can find the whole phrase”

A GPU-Architecture Optimized Hierarchical Decomposition Algorithm for Support Vector Machine Training

The result's identifiers

  • Result code in IS VaVaI

    <a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F49777513%3A23520%2F17%3A43932982" target="_blank" >RIV/49777513:23520/17:43932982 - isvavai.cz</a>

  • Result on the web

    <a href="http://ieeexplore.ieee.org/document/7990554/?reload=true" target="_blank" >http://ieeexplore.ieee.org/document/7990554/?reload=true</a>

  • DOI - Digital Object Identifier

    <a href="http://dx.doi.org/10.1109/TPDS.2017.2731764" target="_blank" >10.1109/TPDS.2017.2731764</a>

Alternative languages

  • Result language

    angličtina

  • Original language name

    A GPU-Architecture Optimized Hierarchical Decomposition Algorithm for Support Vector Machine Training

  • Original language description

    In the last decade, several GPU implementations of Support Vector Machine (SVM) training with nonlinear kernels were published. Some of them even with source codes. The most effective ones are based on Sequential Minimal Optimization (SMO). They decompose the restricted quadratic problem into a series of smallest possible subproblems, which are then solved analytically. For large datasets, the majority of elapsed time is spent by a large amount of matrix-vector multiplications that cannot be computed efficiently on current GPUs because of limited memory bandwidth. In this paper, we introduce a novel GPU approach to the SVM training that we call Optimized Hierarchical Decomposition SVM (OHD-SVM). It uses a hierarchical decomposition iterative algorithm that fits better to actual GPU architecture. The low decomposition level uses a single GPU multiprocessor to efficiently solve a local subproblem. Nowadays a single GPU multiprocessor can run thousand or more threads that are able to synchronize quickly. It is an ideal platform for a single kernel SMO-based local solver with fast local iterations. The high decomposition level updates gradients of entire training set and selects a new local working set. The gradient update requires many kernel values that are costly to compute. However, solving a large local subproblem offers an efficient kernel values computation via a matrix-matrix multiplication that is much more efficient than the matrix-vector multiplication used in already published implementations. Along with a description of our implementation, the paper includes an exact comparison of five publicly available C++ SVM training GPU implementations. In this paper, the binary classification task and RBF kernel function are taken into account as it is usual in most of the recent papers. According to the measured results on a wide set of publicly available datasets, our proposed approach excelled significantly over the other methods in all datasets. The biggest difference was on the largest dataset where we achieved speed-up up to 12 times in comparison with the fastest already published GPU implementation. Moreover, our OHD-SVM is the only one that can handle dense as well as sparse datasets. Along with this paper, we published the source-codes at https://github.com/OrcusCZ/OHD-SVM.

  • Czech name

  • Czech description

Classification

  • Type

    J<sub>imp</sub> - Article in a specialist periodical, which is included in the Web of Science database

  • CEP classification

  • OECD FORD branch

    20205 - Automation and control systems

Result continuities

  • Project

    <a href="/en/project/GBP103%2F12%2FG084" target="_blank" >GBP103/12/G084: Center for Large Scale Multi-modal Data Interpretation</a><br>

  • Continuities

    P - Projekt vyzkumu a vyvoje financovany z verejnych zdroju (s odkazem do CEP)

Others

  • Publication year

    2017

  • Confidentiality

    S - Úplné a pravdivé údaje o projektu nepodléhají ochraně podle zvláštních právních předpisů

Data specific for result type

  • Name of the periodical

    IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS

  • ISSN

    1045-9219

  • e-ISSN

  • Volume of the periodical

    28

  • Issue of the periodical within the volume

    12

  • Country of publishing house

    US - UNITED STATES

  • Number of pages

    14

  • Pages from-to

    3330-3343

  • UT code for WoS article

    000415179500002

  • EID of the result in the Scopus database

    2-s2.0-85028954180