Selecting Representative Data Sets
The result's identifiers
Result code in IS VaVaI
<a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F68407700%3A21460%2F12%3A00196428" target="_blank" >RIV/68407700:21460/12:00196428 - isvavai.cz</a>
Alternative codes found
RIV/67985807:_____/12:00380642 RIV/68407700:21240/12:00196428
Result on the web
<a href="http://www.intechopen.com/books/advances-in-data-mining-knowledge-discovery-and-applications/selecting-representative-data-sets" target="_blank" >http://www.intechopen.com/books/advances-in-data-mining-knowledge-discovery-and-applications/selecting-representative-data-sets</a>
DOI - Digital Object Identifier
<a href="http://dx.doi.org/10.5772/50787" target="_blank" >10.5772/50787</a>
Alternative languages
Result language
English
Original language name
Selecting Representative Data Sets
Original language description
Many data mining methods use data sets, particularly training and testing sets, to set their parameters. Setting the parameters corresponds to the learning (training) of the method. This is the case, for example, for artificial neural networks and other adaptive (iterative) methods. Some of these methods also use a so-called validation set. A question that arises is how to correctly divide, or otherwise preprocess, a given data set into these sets, i.e. how to select data samples from the original set and place them into the training and testing sets. The chapter gives an overview of existing methods for data selection and sampling. A general approach to the problem of selecting data for training, testing and possibly validation sets is discussed. To make it possible to compare individual approaches, model evaluation techniques are discussed as well. Data splitting is one of the approaches used to construct training, testing and possibly validation sets, but there are many oth
Czech name
—
Czech description
—
Classification
Type
C - Chapter in a specialist book
CEP classification
IN - Informatics
OECD FORD branch
—
Result continuities
Project
<a href="/en/project/LG12020" target="_blank" >LG12020: Advanced statistical analysis and non-statistical separation techniques for physical processing detection in data sets sampled by means of elementary particle accelerators.</a>
Continuities
S - Specific university research
Others
Publication year
2012
Confidentiality
S - Complete and truthful project data are not subject to protection under special legal regulations
Data specific for result type
Book/collection name
Advances in Data Mining Knowledge Discovery and Applications
ISBN
978-953-51-0748-4
Number of pages of the result
24
Pages from-to
43-66
Number of pages of the book
418
Publisher name
InTech - Open Access Company (InTech Europe)
Place of publication
Rijeka
UT code for WoS chapter
—