Overcoming Long Inference Time of Nearest Neighbors Analysis in Regression and Uncertainty Prediction

Result identifiers

  • Result code in IS VaVaI

    RIV/68407700:21240/24:00375001 - isvavai.cz (https://www.isvavai.cz/riv?ss=detail&h=RIV%2F68407700%3A21240%2F24%3A00375001)

  • Result on the web

    https://doi.org/10.1007/s42979-024-02670-2

  • DOI - Digital Object Identifier

    10.1007/s42979-024-02670-2 (http://dx.doi.org/10.1007/s42979-024-02670-2)

Alternative languages

  • Result language

    English

  • Title in the original language

    Overcoming Long Inference Time of Nearest Neighbors Analysis in Regression and Uncertainty Prediction

  • Result description in the original language

    The intuitive approach of comparing like with like forms the basis of so-called nearest neighbor analysis, which is central to many machine learning algorithms. Nearest neighbor analysis is easy to interpret, analyze, and reason about. It is widely used in advanced techniques such as uncertainty estimation in regression models, as well as in the renowned k-nearest neighbor-based algorithms. Nevertheless, its high inference time complexity, which depends on dataset size even in its faster approximate variant, restricts its applications and can considerably inflate the application cost. In this paper, we address the problem of high inference time complexity. By using gradient-boosted regression trees as a predictor of the labels obtained from nearest neighbor analysis, we demonstrate an increase in inference speed of several orders of magnitude. We validate the effectiveness of our approach on a real-world European Car Pricing Dataset with approximately rows for both residual cost and price uncertainty prediction. Moreover, we assess our method’s performance on the most commonly used tabular benchmark datasets to demonstrate its scalability. The code is available in a GitHub repository: https://github.com/koutefra/uncertainty_experiments.

  • Title in English

    Overcoming Long Inference Time of Nearest Neighbors Analysis in Regression and Uncertainty Prediction

  • Result description in English

    The intuitive approach of comparing like with like forms the basis of so-called nearest neighbor analysis, which is central to many machine learning algorithms. Nearest neighbor analysis is easy to interpret, analyze, and reason about. It is widely used in advanced techniques such as uncertainty estimation in regression models, as well as in the renowned k-nearest neighbor-based algorithms. Nevertheless, its high inference time complexity, which depends on dataset size even in its faster approximate variant, restricts its applications and can considerably inflate the application cost. In this paper, we address the problem of high inference time complexity. By using gradient-boosted regression trees as a predictor of the labels obtained from nearest neighbor analysis, we demonstrate an increase in inference speed of several orders of magnitude. We validate the effectiveness of our approach on a real-world European Car Pricing Dataset with approximately rows for both residual cost and price uncertainty prediction. Moreover, we assess our method’s performance on the most commonly used tabular benchmark datasets to demonstrate its scalability. The code is available in a GitHub repository: https://github.com/koutefra/uncertainty_experiments.
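
The idea in the description above (train a boosted-tree regressor on labels produced by nearest neighbor analysis, then use only the regressor at inference time) can be illustrated with a minimal, standard-library-only sketch. This is not the authors' implementation, which lives in the linked repository. Everything here is an assumption made for illustration: the synthetic data, the choice of the neighbors' target standard deviation as the uncertainty label, the use of depth-1 trees (stumps) in place of full gradient-boosted regression trees, and the hypothetical function names `knn_std_labels`, `fit_stump`, `fit_boosted`, and `predict`.

```python
import math
import random

def knn_std_labels(X, y, k):
    """Label each point with the std of the targets of its k nearest neighbors."""
    labels = []
    for i, xi in enumerate(X):
        # Brute-force search: the cost per query grows with the dataset size.
        nearest = sorted((math.dist(xi, xj), j) for j, xj in enumerate(X) if j != i)[:k]
        targets = [y[j] for _, j in nearest]
        mean = sum(targets) / k
        labels.append(math.sqrt(sum((t - mean) ** 2 for t in targets) / k))
    return labels

def fit_stump(X, r):
    """Find the single-feature threshold split that minimizes squared error on r."""
    best, n, total = None, len(X), sum(r)
    for f in range(len(X[0])):
        order = sorted(range(n), key=lambda i: X[i][f])
        left_sum = 0.0
        for pos in range(1, n):
            left_sum += r[order[pos - 1]]
            lo, hi = X[order[pos - 1]][f], X[order[pos]][f]
            if lo == hi:
                continue  # cannot split between identical feature values
            left_mean = left_sum / pos
            right_mean = (total - left_sum) / (n - pos)
            # Maximizing this quantity is equivalent to minimizing squared error.
            gain = pos * left_mean ** 2 + (n - pos) * right_mean ** 2
            if best is None or gain > best[0]:
                best = (gain, f, (lo + hi) / 2, left_mean, right_mean)
    return None if best is None else best[1:]

def fit_boosted(X, t, n_rounds=50, lr=0.1):
    """Gradient boosting with squared loss: each stump fits the current residuals."""
    base = sum(t) / len(t)
    pred = [base] * len(t)
    stumps = []
    for _ in range(n_rounds):
        residuals = [ti - pi for ti, pi in zip(t, pred)]
        stump = fit_stump(X, residuals)
        if stump is None:
            break
        f, thr, left, right = stump
        stumps.append(stump)
        pred = [p + lr * (left if x[f] <= thr else right) for p, x in zip(pred, X)]
    return base, lr, stumps

def predict(model, x):
    """Inference touches only the stumps, never the training set."""
    base, lr, stumps = model
    return base + sum(lr * (l if x[f] <= thr else r) for f, thr, l, r in stumps)

# Synthetic data (an assumption): target noise grows with the single feature.
random.seed(0)
X = [[random.random()] for _ in range(200)]
y = [2 * x[0] + random.gauss(0, 0.05 + x[0]) for x in X]

labels = knn_std_labels(X, y, k=15)   # slow, dataset-size-dependent step (done once)
model = fit_boosted(X, labels)        # fast surrogate trained on those labels
print(predict(model, [0.1]), predict(model, [0.9]))
```

In this sketch the neighbor search is paid once, when the labels are built. Each later prediction costs one comparison per stump, so its cost depends on the number of boosting rounds rather than on the number of training rows.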

Classification

  • Type

    J<sub>SC</sub> - Article in a periodical indexed in the SCOPUS database

  • CEP field

  • OECD FORD field

    10201 - Computer sciences, information science, bioinformatics (hardware development to be 2.2, social aspect to be 5.8)

Result linkages

  • Project

  • Linkages

    S - Specific university research
    I - Institutional support for the long-term conceptual development of a research organisation

Other

  • Year of publication

    2024

  • Data confidentiality code

    S - Complete and true data on the project are not subject to protection under special legal regulations

Data specific to the result type

  • Periodical name

    SN Computer Science

  • ISSN

    2662-995X

  • e-ISSN

    2661-8907

  • Periodical volume

    5

  • Issue of the periodical within the volume

    5

  • Country of the periodical's publisher

    SG - Republic of Singapore

  • Number of pages

    12

  • Pages from-to

  • UT WoS code of the article

  • EID of the result in the Scopus database

    2-s2.0-85191305190