Selecting Representative Data Sets

The result's identifiers

  • Result code in IS VaVaI

    RIV/68407700:21460/12:00196428 - isvavai.cz (https://www.isvavai.cz/riv?ss=detail&h=RIV%2F68407700%3A21460%2F12%3A00196428)

  • Alternative codes found

    RIV/67985807:_____/12:00380642 RIV/68407700:21240/12:00196428

  • Result on the web

    http://www.intechopen.com/books/advances-in-data-mining-knowledge-discovery-and-applications/selecting-representative-data-sets

  • DOI - Digital Object Identifier

    10.5772/50787 (http://dx.doi.org/10.5772/50787)

Alternative languages

  • Result language

    English

  • Original language name

    Selecting Representative Data Sets

  • Original language description

    Many data mining methods use data sets, particularly training and testing sets, to set their parameters. Setting the parameters corresponds to the learning (training) of the method. This is the case, for example, for artificial neural networks and other adaptive (iterative) methods. Some of these methods also use a so-called validation set. A question that can arise is how to correctly divide, or otherwise preprocess, a given data set into these sets, i.e. how to select data samples from the original set and place them into the training and testing sets. The chapter gives an overview of existing methods of data selection and sampling. A general approach to the problem of selecting data for training, testing and possibly validation sets is discussed. To allow individual approaches to be compared, model evaluation techniques are discussed as well. Data splitting is one of the approaches used to construct training, testing and possibly validation sets, but there are many oth

  • Czech name

  • Czech description

Classification

  • Type

    C - Chapter in a specialist book

  • CEP classification

    IN - Informatics

  • OECD FORD branch

Result continuities

  • Project

    LG12020: Advanced statistical analysis and non-statistical separation techniques for physical processing detection in data sets sampled by means of elementary particle accelerators.

  • Continuities

    S - Specific university research

Others

  • Publication year

    2012

  • Confidentiality

    S - Complete and true data on the project are not subject to protection under special legal regulations

Data specific for result type

  • Book/collection name

    Advances in Data Mining Knowledge Discovery and Applications

  • ISBN

    978-953-51-0748-4

  • Number of pages of the result

    24

  • Pages from-to

    43-66

  • Number of pages of the book

    418

  • Publisher name

    InTech - Open Access Company (InTech Europe)

  • Place of publication

    Rijeka

  • UT code for WoS chapter