Comparing assignment-based approaches to breed identification within a large set of horses
The result's identifiers
Result code in IS VaVaI
<a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F00216305%3A26220%2F19%3APU131995" target="_blank" >RIV/00216305:26220/19:PU131995 - isvavai.cz</a>
Alternative codes found
RIV/62156489:43210/19:43915581
Result on the web
<a href="https://link.springer.com/article/10.1007/s13353-019-00495-x" target="_blank" >https://link.springer.com/article/10.1007/s13353-019-00495-x</a>
DOI - Digital Object Identifier
<a href="http://dx.doi.org/10.1007/s13353-019-00495-x" target="_blank" >10.1007/s13353-019-00495-x</a>
Alternative languages
Result language
English
Original language name
Comparing assignment-based approaches to breed identification within a large set of horses
Original language description
Considering the extensive data sets and statistical techniques, animal breeding embodies a branch of machine learning that has a constantly increasing impact on breeding. In our study, information regarding the potential of machine learning and data mining within a large set of horses and breeds is presented. The individual assignment methods and factors influencing the success rate of the procedure are compared at the Czech population scale. The fixation index values ranged from 0.057 (HMS1) to 0.144 (HTG6), and the overall genetic differentiation amounted to 8.9% among the breeds. The highest genetic divergence (FST = 0.378) was established between the Friesian and Equus przewalskii; the highest degree of gene migration was obtained between the Czech and Bavarian Warmblood (Nm = 14,302); and the overall global heterozygote deficit across the populations was 10.4%. The eight standard methods (Bayesian, frequency, and distance) using GeneClass software and almost all mainstream classification algorithms (Bayes Net, Naive Bayes, IB1, IB5, KStar, JRip, J48, Random Forest, Random Tree, PART, MLP, and SVM) from the WEKA machine learning workbench were compared by utilizing 314,874 real allelic data sets. The Bayesian method (GeneClass, 89.9%) and Bayesian network algorithm (WEKA, 84.8%) outperformed the other techniques. The breed genomic prediction accuracy reached the highest value in the cold-blooded horses. The overall proportion of individuals correctly assigned to a population depended mainly on the breed number and genetic divergence. These statistical tools could be used to assess breed traceability systems, and they exhibit the potential to assist managers in decision-making as regards breeding and registration.
Czech name
—
Czech description
—
Classification
Type
J<sub>imp</sub> - Article in a specialist periodical, which is included in the Web of Science database
CEP classification
—
OECD FORD branch
10603 - Genetics and heredity (medical genetics to be 3)
Result continuities
Project
The result was created under more than one project. More information is available in the Projects tab.
Continuities
P - Research and development project financed from public funds (with a link to CEP)
Others
Publication year
2019
Confidentiality
S - Complete and true data on the project are not subject to protection under special legal regulations
Data specific for result type
Name of the periodical
JOURNAL OF APPLIED GENETICS
ISSN
1234-1983
e-ISSN
2190-3883
Volume of the periodical
60
Issue of the periodical within the volume
2
Country of publishing house
PL - POLAND
Number of pages
12
Pages from-to
187-198
UT code for WoS article
000465998700008
EID of the result in the Scopus database
—