Chinese Science Bulletin, Volume 61, Issue 36: 3869-3877 (2016) https://doi.org/10.1360/N972016-00900

From big biological data to big discovery: The past decade and the future

  • Received: Aug 18, 2016
  • Accepted: Sep 14, 2016
  • Published: Nov 23, 2016




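  • Xuegong Zhang (张学工)

    Zhang graduated from the Department of Automation at Tsinghua University in 1989 and received a PhD in pattern recognition and intelligent systems from Tsinghua University in 1994. Zhang is currently a professor in the Department of Automation at Tsinghua University, an adjunct professor in the School of Life Sciences and the School of Medicine, director of the Bioinformatics Division of the Tsinghua National Laboratory for Information Science and Technology, executive director of the Tsinghua University Center for Synthetic and Systems Biology, deputy director of the Ministry of Education Key Laboratory of Bioinformatics, chair of the Academic Committee of the Department of Automation, and a member of the Tsinghua University Academic Committee. Zhang also serves as chair of the Bioinformatics and Artificial Life Committee of the Chinese Association for Artificial Intelligence, executive vice chair of the Bioinformatics and Computational Biology Committee of the Chinese Society of Biotechnology, and vice chair of the Bioinformatics and Theoretical Biophysics Committee of the Biophysical Society of China. From 2001 to 2002, Zhang pursued advanced study at the Harvard School of Public Health in the United States. Zhang received the National Science Fund for Distinguished Young Scholars in 2006, is the lead lecturer of a national-level quality course, received a second prize of the National Teaching Achievement Award in 2009, and became a chief scientist of the National Basic Research Program of China (973 Program) in 2011. Zhang's main research areas are pattern recognition and bioinformatics, including big biological data analysis and machine learning, processing and analysis of next-generation sequencing data, alternative splicing of genes and its regulation, and analysis of the information structure and function of microbial communities.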