A Comparative study of two methodologies for large binary datasets analysis.
The result's identifiers
Result code in IS VaVaI
<a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F67985807%3A_____%2F12%3A00387233" target="_blank" >RIV/67985807:_____/12:00387233 - isvavai.cz</a>
Result on the web
—
DOI - Digital Object Identifier
—
Alternative languages
Result language
English
Original language name
A Comparative study of two methodologies for large binary datasets analysis.
Original language description
This study examines the differences between two approaches for revealing latent variables in binary data. Both approaches assume that the observed high-dimensional data are driven by a small number of hidden binary sources combined by Boolean superposition. The first approach is Boolean matrix factorization (BMF) and the second is Boolean factor analysis (BFA). Two BMF methods are used for comparison: the M8 method from the BMDP statistical software package and the method suggested by Belohlavek & Vychodil. These two are compared to BFA, specifically the Expectation-maximization Boolean Factor Analysis that we developed earlier, here extended with a newly developed binarization step. The well-known bars problem and the mushroom dataset are used to reveal the methods' peculiarities. In particular, the reconstruction ability of the computed factors and the information gain as a measure of dimension reduction were under scrutiny. It
Czech name
—
Czech description
—
Classification
Type
J<sub>x</sub> - Unclassified - Peer-reviewed scientific article (Jimp, Jsc and Jost)
CEP classification
BB - Applied statistics, operational research
OECD FORD branch
—
Result continuities
Project
<a href="/en/project/GAP202%2F10%2F0262" target="_blank" >GAP202/10/0262: Decompositions of matrices with binary and ordinal data: theory, algorithms, and complexity</a>
Continuities
I - Institutional support for the long-term conceptual development of a research organization
Others
Publication year
2012
Confidentiality
S - Complete and truthful data on the project are not subject to protection under special legal regulations
Data specific for result type
Name of the periodical
Neural Network World
ISSN
1210-0552
e-ISSN
—
Volume of the periodical
22
Issue of the periodical within the volume
6
Country of publishing house
CZ - CZECH REPUBLIC
Number of pages
18
Pages from-to
565-582
UT code for WoS article
000314321300006
EID of the result in the Scopus database
—