Data normalization and scaling: Consequences for the analysis in omics sciences
Result identifiers
Result code in IS VaVaI
RIV/61989592:15310/18:73589128 - isvavai.cz (https://www.isvavai.cz/riv?ss=detail&h=RIV%2F61989592%3A15310%2F18%3A73589128)
Result on the web
—
DOI - Digital Object Identifier
—
Alternative languages
Result language
English
Title in the original language
Data normalization and scaling: Consequences for the analysis in omics sciences
Result description in the original language
The main task in the analysis of omics data is to understand the biological information in the data. From a statistical point of view, classification analysis is one of the goals. If the data consist of groups, e.g., controls and patients, accurate prediction of new samples is desirable. To understand the processes in the human body or in other organisms, interpretation of the model is necessary. In the two-group setting, identifying the important features is one of the main tasks in omics disciplines. In metabolomics, this problem is called biomarker identification, while in genetics it is called the fold-change problem, which examines, for a given feature, how many times higher or lower the average concentration is in one group than in the other. In statistics, this is often referred to as the feature selection problem. Section 4 analyzes the impact of pretreatment methods on publicly available real-world data sets in terms of classification and feature selection analysis. As an example of omics disciplines, the data sets originate from the metabolomics field. Section 5 discusses and summarizes the main findings and provides some overall recommendations.
Title in English
Data normalization and scaling: Consequences for the analysis in omics sciences
Result description in English
The main task in the analysis of omics data is to understand the biological information in the data. From a statistical point of view, classification analysis is one of the goals. If the data consist of groups, e.g., controls and patients, accurate prediction of new samples is desirable. To understand the processes in the human body or in other organisms, interpretation of the model is necessary. In the two-group setting, identifying the important features is one of the main tasks in omics disciplines. In metabolomics, this problem is called biomarker identification, while in genetics it is called the fold-change problem, which examines, for a given feature, how many times higher or lower the average concentration is in one group than in the other. In statistics, this is often referred to as the feature selection problem. Section 4 analyzes the impact of pretreatment methods on publicly available real-world data sets in terms of classification and feature selection analysis. As an example of omics disciplines, the data sets originate from the metabolomics field. Section 5 discusses and summarizes the main findings and provides some overall recommendations.
Classification
Type
C - Chapter in a scholarly book
CEP field
—
OECD FORD field
10103 - Statistics and probability
Result linkages
Project
GF15-34613L: Statistics in metabolomics for biomarker research in medicine
Linkages
P - Research and development project financed from public funds (with a link to CEP)
Other
Year of implementation
2018
Data confidentiality code
S - Complete and true data on the project are not subject to protection under special legal regulations
Data specific to the result type
Title of the book or proceedings
Data analysis for omics sciences: Methods and applications
ISBN
978-0-444-64044-4
Number of pages of the result
32
Pages from-to
165-196
Number of pages of the book
706
Publisher name
Elsevier
Place of publication
Amsterdam
UT WoS code of the chapter
—