Result: Modeling alternative splicing variants from RNA-Seq data with isoform graphs.

Title:
Modeling alternative splicing variants from RNA-Seq data with isoform graphs.
Authors:
Beretta S (Dipartimento di Informatica Sistemistica e Comunicazione, Università degli Studi di Milano-Bicocca, Milan, Italy), Bonizzoni P, Vedova GD, Pirola Y, Rizzi R
Source:
Journal of computational biology : a journal of computational molecular cell biology [J Comput Biol] 2014 Jan; Vol. 21 (1), pp. 16-40. Date of Electronic Publication: 2013 Nov 07.
Publication Type:
Journal Article; Research Support, Non-U.S. Gov't
Language:
English
Journal Info:
Publisher: Mary Ann Liebert, Inc.; Country of Publication: United States; NLM ID: 9433358; Publication Model: Print-Electronic; Cited Medium: Internet; ISSN: 1557-8666 (Electronic); Linking ISSN: 1066-5277; NLM ISO Abbreviation: J Comput Biol; Subsets: MEDLINE
Imprint Name(s):
Original Publication: New York, NY : Mary Ann Liebert, Inc., c1994-
References:
Nat Methods. 2010 Nov;7(11):909-12. (PMID: 20935650)
PLoS Comput Biol. 2008 Aug 08;4(8):e1000147. (PMID: 18688268)
Nat Biotechnol. 2010 May;28(5):511-5. (PMID: 20436464)
Nat Rev Genet. 2010 Jan;11(1):31-46. (PMID: 19997069)
Nat Protoc. 2012 Mar 01;7(3):562-78. (PMID: 22383036)
Genome Res. 2001 Nov;11(11):1952-7. (PMID: 11691860)
Genome Res. 2004 May;14(5):976-87. (PMID: 15123595)
Bioinformatics. 2005 May 1;21(9):1859-75. (PMID: 15728110)
Bioinformatics. 2012 Apr 15;28(8):1086-92. (PMID: 22368243)
BMC Bioinformatics. 2012 Apr 12;13 Suppl 5:S2. (PMID: 22537006)
J Mol Biol. 1990 Oct 5;215(3):403-10. (PMID: 2231712)
Bioinformatics. 2002;18 Suppl 1:S181-8. (PMID: 12169546)
BMC Bioinformatics. 2011 Dec 14;12 Suppl 14:S2. (PMID: 22373417)
Nat Biotechnol. 2011 May 15;29(7):644-52. (PMID: 21572440)
Algorithms Mol Biol. 2011 Apr 19;6(1):9. (PMID: 21504602)
BMC Bioinformatics. 2012 Apr 19;13 Suppl 6:S5. (PMID: 22537044)
Genome Biol. 2006;7 Suppl 1:S2.1-31. (PMID: 16925836)
Nat Rev Genet. 2011 Nov 29;13(1):36-46. (PMID: 22124482)
J Comput Biol. 2011 Mar;18(3):305-21. (PMID: 21385036)
Trends Genet. 2002 Apr;18(4):186-93. (PMID: 11932019)
Entry Date(s):
Date Created: 20131109 Date Completed: 20140813 Latest Revision: 20211021
Update Code:
20260130
PubMed Central ID:
PMC3880078
DOI:
10.1089/cmb.2013.0112
PMID:
24200390
Database:
MEDLINE

Further Information

Next-generation sequencing (NGS) technologies call for new methodologies for alternative splicing (AS) analysis. Current computational methods for AS analysis from NGS data mainly align short reads against a reference genome, while methods that do not need a reference genome remain largely underdeveloped. In this setting, the main available tools for NGS data focus on de novo transcriptome assembly (Grabherr et al., 2011; Schulz et al., 2012). Although these tools are widely applied in biological investigations, and their results often reveal intrinsic shortcomings, a theoretical investigation of the inherent computational limits of transcriptome analysis from NGS data, when a reference genome is unknown or highly unreliable, is still missing. Moreover, methods for computing the gene structures due to AS events under these assumptions are still lacking, a problem that this article begins to tackle. More precisely, based on the notion of isoform graph (Lacroix et al., 2008), we define a compact representation of gene structures, called the splicing graph, and investigate the computational problem of building a splicing graph that is (i) compatible with NGS data and (ii) isomorphic to the isoform graph. We characterize when there is only one representative splicing graph compatible with the input data, and we propose an efficient algorithmic approach to compute this graph.
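
As a rough illustration of the kind of object the abstract refers to, the following Python sketch builds a small isoform-style graph. The abstract does not spell out the definition, so the sketch assumes a common formulation: nodes are blocks of a gene, and a directed edge joins two blocks that appear consecutively in at least one isoform. The function name `build_isoform_graph`, the block labels, and the example isoforms are hypothetical. This is not the article's algorithm, which works from reads rather than from known isoforms.

```python
from collections import defaultdict


def build_isoform_graph(isoforms):
    """Build a directed graph from isoforms given as ordered lists of block labels.

    Assumption (not stated in the abstract): nodes are blocks, and an edge
    (u, v) exists when u is immediately followed by v in at least one isoform.
    """
    nodes = set()
    edges = defaultdict(set)
    for blocks in isoforms:
        nodes.update(blocks)
        # Pair each block with its successor in this isoform.
        for u, v in zip(blocks, blocks[1:]):
            edges[u].add(v)
    # Sort successor lists so the result is deterministic.
    return nodes, {u: sorted(vs) for u, vs in edges.items()}


# Hypothetical gene with three isoforms: the second skips block B,
# the third skips block C.
isoforms = [["A", "B", "C", "D"], ["A", "C", "D"], ["A", "B", "D"]]
nodes, edges = build_isoform_graph(isoforms)
print(sorted(nodes))
print(edges)
```

In this toy input, the skipped blocks show up as extra outgoing edges from A and B, which is how alternative splicing events appear as branching in such a graph.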