*Result*: Systematically differentiating functions for alternatively spliced isoforms through integrating RNA-seq data.

Title:
Systematically differentiating functions for alternatively spliced isoforms through integrating RNA-seq data.
Authors:
Eksi R (Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, Michigan, United States of America); Li HD; Menon R; Wen Y; Omenn GS; Kretzler M; Guan Y
Source:
PLoS computational biology [PLoS Comput Biol] 2013; Vol. 9 (11), pp. e1003314. Date of Electronic Publication: 2013 Nov 07.
Publication Type:
Journal Article; Research Support, N.I.H., Extramural; Research Support, Non-U.S. Gov't
Language:
English
Journal Info:
Publisher: Public Library of Science; Country of Publication: United States; NLM ID: 101238922; Publication Model: Print-Electronic; Cited Medium: Internet; ISSN: 1553-7358 (Electronic); Linking ISSN: 1553734X; NLM ISO Abbreviation: PLoS Comput Biol; Subsets: MEDLINE
Imprint Name(s):
Original Publication: San Francisco, CA : Public Library of Science, [2005]-
Grant Information:
P30 DK081943 United States DK NIDDK NIH HHS; R21 NS082212 United States NS NINDS NIH HHS; R35 GM133346 United States GM NIGMS NIH HHS; NIH 1R21NS082212-01 United States NS NINDS NIH HHS
Substance Nomenclature:
0 (Protein Isoforms)
63231-63-0 (RNA)
Entry Date(s):
Date Created: 20131119; Date Completed: 20140714; Latest Revision: 20211021
Update Code:
20260130
PubMed Central ID:
PMC3820534
DOI:
10.1371/journal.pcbi.1003314
PMID:
24244129
Database:
MEDLINE

*Further Information*

*Integrating large-scale functional genomic data has significantly accelerated our understanding of gene functions. However, no algorithm has been developed to differentiate functions for isoforms of the same gene using high-throughput genomic data. This is because standard supervised learning requires 'ground-truth' functional annotations, which are lacking at the isoform level. To address this challenge, we developed a generic framework that interrogates public RNA-seq data at the transcript level to differentiate functions for alternatively spliced isoforms. For a specific function, our algorithm identifies the 'responsible' isoform(s) of a gene and generates classifying models at the isoform level instead of at the gene level. Through cross-validation, we demonstrated that our algorithm is effective in assigning functions to genes, especially the ones with multiple isoforms, and robust to gene expression levels and removal of homologous gene pairs. We identified genes in the mouse whose isoforms are predicted to have disparate functionalities and experimentally validated the 'responsible' isoforms using data from mammary tissue. With protein structure modeling and experimental evidence, we further validated the predicted isoform functional differences for the genes Cdkn2a and Anxa6. Our generic framework is the first to predict and differentiate functions for alternatively spliced isoforms, instead of genes, using genomic data. It is extendable to any base machine learner and other species with alternatively spliced isoforms, and shifts the current gene-centered function prediction to isoform-level predictions.*
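
The abstract states that, for a given function, the algorithm picks out the 'responsible' isoform(s) of each gene and builds the classifier at the isoform level, but it does not name the learner or the selection rule. One plausible reading is a multiple-instance style loop: each gene is a bag of isoforms, only the gene carries a label, and the isoform that best matches the current model is re-selected on every pass. The sketch below illustrates that idea only. The nearest-centroid scorer, the function names (`fit_scorer`, `pick_responsible`), and the toy expression values are all assumptions made for illustration, not the authors' implementation or data.

```python
from statistics import fmean


def centroid(vectors):
    """Component-wise mean of equal-length feature vectors."""
    return [fmean(col) for col in zip(*vectors)]


def fit_scorer(pos, neg):
    """Hypothetical base learner: project onto (positive centroid - negative
    centroid), centred at the midpoint. Scores > 0 lean positive."""
    cp, cn = centroid(pos), centroid(neg)
    w = [p - n for p, n in zip(cp, cn)]
    mid = [(p + n) / 2 for p, n in zip(cp, cn)]

    def score(x):
        return sum(wi * (xi - mi) for wi, xi, mi in zip(w, x, mid))

    return score


def pick_responsible(genes, labels, max_iter=20):
    """genes: {gene: {isoform: feature vector}}; labels: {gene: bool}.
    Returns (scorer, {positive gene: selected isoform})."""
    # Every isoform of a negatively labelled gene is a negative instance.
    neg = [f for g, isos in genes.items() if not labels[g] for f in isos.values()]
    # Start by treating all isoforms of positive genes as positive instances.
    pos = [f for g, isos in genes.items() if labels[g] for f in isos.values()]
    witness = None
    for _ in range(max_iter):
        scorer = fit_scorer(pos, neg)
        # For each positive gene, keep only its highest-scoring isoform.
        new_witness = {
            g: max(isos, key=lambda i: scorer(isos[i]))
            for g, isos in genes.items()
            if labels[g]
        }
        if new_witness == witness:  # selection stable: scorer matches witness
            break
        witness = new_witness
        pos = [genes[g][i] for g, i in witness.items()]
    # Note: if max_iter runs out, scorer is one refit behind witness.
    return scorer, witness


# Toy data (invented): two features per isoform, e.g. expression in two samples.
genes = {
    "geneA": {"A.1": [5.0, 1.0], "A.2": [0.5, 0.8]},
    "geneB": {"B.1": [4.5, 1.2]},
    "geneC": {"C.1": [0.4, 1.0], "C.2": [0.6, 0.9]},
    "geneD": {"D.1": [0.3, 1.1]},
}
labels = {"geneA": True, "geneB": True, "geneC": False, "geneD": False}

scorer, witness = pick_responsible(genes, labels)
print(witness)
```

In this toy setting the loop settles on one isoform per positively annotated gene, and the resulting scorer can then rank every isoform individually, including the isoforms of genes with no annotation. Any base learner that returns a real-valued score could be substituted for `fit_scorer`, which is consistent with the abstract's remark that the framework is extendable to any base machine learner.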