A flexible denormalization technique for data analysis above a deeply-structured relational database: biomedical applications

Štefanič, Stanislav; Lexa, Matej

Publication details

Authors	ŠTEFANIČ Stanislav, LEXA Matej
Year of publication	2015
Type	Article in proceedings
Conference	Lecture Notes in Computer Science 9043, Bioinformatics and Biomedical Engineering, Third International Conference, IWBBIO 2015, Granada, Spain, April 15-17, 2015, Proceedings, Part I
MU Faculty or unit	Faculty of Informatics
Web	http://link.springer.com/chapter/10.1007%2F978-3-319-16483-0_12
DOI	http://dx.doi.org/10.1007/978-3-319-16483-0_12
Field	Informatics
Keywords	relational database; PostgreSQL; NoSQL; data flattening; automatic data denormalization
Description	Relational databases are sometimes used to store biomedical and patient data in large clinical or international projects. This data is inherently deeply structured: records for individual patients contain a varying number of variables. When ad hoc access to data subsets is needed, standard database access tools do not allow for rapid command prototyping and variable selection to create flat data tables. In the context of Thalamoss, an international research project on beta-thalassemia, we developed and experimented with an interactive variable selection method that addresses these needs. Our newly developed Python library sqlAutoDenorm.py automatically generates SQL commands to denormalize a subset of database tables and their relevant records, effectively generating a flat table from arbitrarily structured data. The denormalization process can be controlled by a small number of user-tunable parameters. Python and R/Bioconductor are used for any subsequent data processing steps, including visualization, and Weka is used for machine learning on the generated data.
Related projects:	THALAssaemia MOdular Stratification System for personalized therapy of beta-thalassemia
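
The flattening idea in the abstract can be illustrated with a minimal sketch. This is not the API of sqlAutoDenorm.py, which is not shown in this record. The function build_flat_query, the patient/sample/measurement tables, and their columns are all hypothetical, and SQLite stands in for PostgreSQL so the example runs without a server. The sketch only shows the general technique: generating a chain of LEFT JOINs from a list of foreign-key relationships so that a chosen subset of columns ends up in one flat table.

```python
import sqlite3


def build_flat_query(root, joins, columns):
    """Return a SELECT statement that flattens `root` and its related tables.

    joins:   list of (child_table, child_fk, parent_table, parent_pk) tuples
    columns: list of (table, column) pairs to keep in the flat table
    All names here are hypothetical; they are not taken from the paper.
    """
    select_list = ", ".join(f'{t}.{c} AS "{t}__{c}"' for t, c in columns)
    parts = [f"SELECT {select_list}", f"FROM {root}"]
    for child, fk, parent, pk in joins:
        # LEFT JOIN keeps root records that have no rows in the child table.
        parts.append(f"LEFT JOIN {child} ON {child}.{fk} = {parent}.{pk}")
    return "\n".join(parts)


# A tiny hypothetical schema: patient -> sample -> measurement.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE patient (id INTEGER PRIMARY KEY, sex TEXT);
    CREATE TABLE sample (id INTEGER PRIMARY KEY, patient_id INTEGER, tissue TEXT);
    CREATE TABLE measurement (id INTEGER PRIMARY KEY, sample_id INTEGER,
                              variable TEXT, value REAL);
    INSERT INTO patient VALUES (1, 'F'), (2, 'M');
    INSERT INTO sample VALUES (10, 1, 'blood');
    INSERT INTO measurement VALUES (100, 10, 'HbF', 5.2), (101, 10, 'HbA2', 3.1);
    """
)

sql = build_flat_query(
    root="patient",
    joins=[
        ("sample", "patient_id", "patient", "id"),
        ("measurement", "sample_id", "sample", "id"),
    ],
    columns=[
        ("patient", "id"),
        ("patient", "sex"),
        ("sample", "tissue"),
        ("measurement", "variable"),
        ("measurement", "value"),
    ],
)

# Sort by patient id, then variable name, so the output order is stable.
rows = conn.execute(sql + "\nORDER BY 1, 4").fetchall()
for row in rows:
    print(row)
```

Each patient appears once per matching measurement, and a patient without samples still appears with empty values in the joined columns. That is one way a varying number of variables per patient can be represented in a flat table; a real tool would additionally need to discover the foreign keys from the database catalog rather than take them as a hand-written list.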
