Project information
Deep Learning for Genomic and Transcriptomic Pattern Identification
- Project code
- 4431
- Project period
- 1/2020 - 12/2022
- Investor / Programme framework / Project type
- EMBO (European Molecular Biology Organization)
- EMBO Projects
- MU Faculty / Unit
- Central European Institute of Technology
Research in my newly formed laboratory revolves around the use of novel Deep Neural Network approaches to identify patterns in genomic and transcriptomic regions harboring functional elements. Specifically, we focus on the characterisation of three categories of functional elements: short genomic functional elements (e.g. small RNA gene loci), transcriptomic functional elements (e.g. RNA Binding Protein binding sites), and small RNA driven transcriptomic functional elements (e.g. microRNA target sites). The identification of such functional elements using in silico methods has been a field of intensive research, but the currently low precision of these methods when scanning large regions of the genome or transcriptome has confined practical applications to a small minority of well-studied, easy-to-identify elements (e.g. microRNA target sites), and has left them heavily biased by prior theoretical knowledge. My research instead focuses on less biased methods that model these complex biological processes directly from raw data (genomic or high-throughput sequencing) using Deep Learning architectures, with the aim of reaching unprecedented levels of pattern identification precision.
For genomic functional elements, we have developed a novel training approach involving iterative background selection that has boosted the accuracy of small RNA identification by orders of magnitude beyond the state of the art. For transcriptomic functional elements, we will use binding characteristics to train a Deep Learning model on CLIP-Seq data from hundreds of sequenced RBPs. We are exploring how to interpret aspects of the trained model in order to predict the functionality of novel, enigmatic RBPs from their binding characteristics. Finally, for small RNA driven functional elements, we use Deep Learning models to identify unbiased binding rules from chimeric CLIP-Seq reads, moving beyond the theoretical biases present in current models.
Publications
Number of publications: 1
2023
- Genomic benchmarks: a collection of datasets for genomic sequence classification
BMC Genomic Data, year: 2023, volume: 24, issue: 1