Publication details
Dicier: a Fine-tuned Bioinformatics Pipeline to Detect Micro-RNA Chimeric Reads
Authors | |
---|---|
Year of publication | 2019 |
Type | Conference abstract |
MU Faculty or unit | |
Citation | |
Description | MicroRNAs (miRNAs) are short non-coding RNAs (~22 nt) that associate with Argonaute (AGO) proteins and mediate regulatory silencing of target RNAs. Target identification is guided by complex sequence and structural interactions between the miRNA-loaded AGO protein and the target RNA. Binding to an identified target can lead to translational repression or even deadenylation and RNA degradation. Each miRNA has potentially thousands of binding sites on the transcriptome, and each mRNA contains dozens of potential miRNA binding sites. AGO-miRNA regulation plays a central role in several processes, including tissue development, and in various diseases such as diabetes, cancer and heart disease. Cross-Linking Immuno-Precipitation (CLIP) is a molecular biology method that has been used extensively in recent years to locate transcriptomic regions where proteins bind mRNA. Combining CLIP with Next Generation Sequencing (NGS), known as CLIP-Seq, allowed the large-scale identification of those binding sites. A major drawback of CLIP-Seq techniques when applied to AGO-miRNA binding is that they cannot reveal which miRNA, out of the hundreds expressed, is responsible for each binding event. Recently, CLIP-Seq was improved by the crosslinking, ligation, and sequencing of hybrids (CLASH) assay, which increases the yield of hybrid (chimeric) sequences that join a miRNA to its mRNA binding site. These sequences contain parts of both the miRNA and the target site, and thus unambiguously identify the miRNA responsible for each binding event. However, up-to-date bioinformatics strategies for identifying chimeric sequences from CLIP experiments are lacking. We have developed DICIER, a pipeline for the identification of miRNA binding-site chimeric reads from CLIP-Seq experiments. DICIER maps raw reads to the reference genome and retrieves ambiguous alignments as candidate chimeric sequences. Each candidate is split into two subsequences: the first is matched against a database of miRNA sequences, and the second is then used to locate the corresponding miRNA binding site at the genomic level. DICIER outputs a table of miRNA-mRNA binding-site connections, which can be used as a reference for further studies. We benchmarked DICIER on public CLASH and CLIP-Seq datasets and identified N = 250,000 candidate miRNA-mRNA binding-site connections. Although the majority of those connections originated from CLASH experiments, the increased sensitivity of DICIER allowed the identification of chimeric reads from CLIP-Seq data at a rate of up to 5% of total sample reads. |
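
The splitting step described in the abstract can be illustrated with a minimal Python sketch. It is not DICIER's implementation: the function name `split_chimeric_read`, the length thresholds, the exact-prefix matching, and the assumption that the miRNA part sits at the 5' end of the read are all hypothetical simplifications chosen for illustration. The abstract does not state how the miRNA database is searched.

```python
from typing import Dict, Optional, Tuple


def split_chimeric_read(
    read: str,
    mirnas: Dict[str, str],
    min_mirna_len: int = 18,   # hypothetical threshold, not from the abstract
    min_target_len: int = 15,  # hypothetical threshold, not from the abstract
) -> Optional[Tuple[str, str, str]]:
    """Split a candidate chimeric read into (miRNA name, miRNA part, target part).

    Assumes the miRNA-derived part is at the start of the read and matches a
    known miRNA exactly from its first base. Returns None if no miRNA matches.
    """
    best: Optional[Tuple[str, str, str]] = None
    for name, seq in mirnas.items():
        # Length of the longest common prefix between the read and this miRNA.
        n = 0
        for read_base, mirna_base in zip(read, seq):
            if read_base != mirna_base:
                break
            n += 1
        # Keep the split only if both parts are long enough to be informative.
        if n >= min_mirna_len and len(read) - n >= min_target_len:
            if best is None or n > len(best[1]):
                best = (name, read[:n], read[n:])
    return best


# Toy miRNA database with placeholder names; the sequences are illustrative only.
toy_mirnas = {
    "miR-A": "TGAGGTAGTAGGTTGTATAGTT",
    "miR-B": "TAGCTTATCAGACTGATGTTGA",
}
toy_read = toy_mirnas["miR-A"] + "ACGTACGTACGTACGTACGT"
hit = split_chimeric_read(toy_read, toy_mirnas)
```

In a full pipeline the returned target part would then be aligned to the reference genome to locate the binding site; that alignment step is outside the scope of this sketch.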