More info
  • Received Aug 9, 2020
  • Accepted Nov 16, 2020
  • Published Jan 21, 2021




This work was supported by the National Natural Science Foundation of China (31822052, 31572381), the National Thousand Youth Talents Plan, and the Program of the National Beef Cattle and Yak Industrial Technology System (CARS-37). We thank the High-Performance Computing platform of Northwest A&F University. We thank Yu Wang, Xiangyu Pan, Ming Li, Xiaomeng Tian, Dongke Zhou, Zhirui Yang, Han Xu, Chunna Cao and other members of the genome of big data laboratory for discussions. We also thank members of the NextGen project for sharing their data.

Interest statement

The author(s) declare that they have no conflict of interest. Animal care and the experiments were conducted according to the guidelines established by the Regulations for the Administration of Affairs Concerning Experimental Animals (Ministry of Science and Technology, China, 2004) and approved by the Institutional Animal Care and Use Committee (College of Animal Science and Technology, Northwest A&F University, China). Every effort was made to minimize animal pain, suffering, and distress and to reduce the number of animals used.



The supporting information is available online at https://doi.org/10.1007/s11427-020-1850-x. The supporting materials are published as submitted, without typesetting or editing. The responsibility for scientific accuracy and content remains entirely with the authors.


[1] Abecasis G.R., Altshuler D., Auton A., Brooks L.D., Durbin R.M., Gibbs R.A., Hurles M.E., McVean G.A. A map of human genome variation from population-scale sequencing. Nature, 2010, 467: 1061-1073

[2] Abecasis G.R., Auton A., Brooks L.D., DePristo M.A., Durbin R.M., Handsaker R.E., Kang H.M., Marth G.T., McVean G.A. An integrated map of genetic variation from 1,092 human genomes. Nature, 2012, 491: 56-65

[3] Ahlawat S., Sharma P., Sharma R., Arora R., De S. Zinc finger domain of the PRDM9 gene on chromosome 1 exhibits high diversity in ruminants but its paralog PRDM7 contains multiple disruptive mutations. PLoS ONE, 2016, 11: e0156159

[4] Alberto F.J., Boyer F., Orozco-terWengel P., Streeter I., Servin B., de Villemereuil P., Benjelloun B., Librado P., Biscarini F., Colli L., et al. Convergent genomic signatures of domestication in sheep and goats. Nat Commun, 2018, 9: 813

[5] Auton A., Brooks L.D., Durbin R.M., Garrison E.P., Kang H.M., Korbel J.O., Marchini J.L., McCarthy S., McVean G.A., Abecasis G.R. A global reference for human genetic variation. Nature, 2015, 526: 68-74

[6] Baird P.N., Robman L.D., Richardson A.J., Dimitrov P.N., Tikellis G., McCarty C.A., Guymer R.H. Gene-environment interaction in progression of AMD: the CFH gene, smoking and exposure to chronic infection. Hum Mol Genet, 2008, 17: 1299-1305

[7] Bickhart D.M., Hou Y., Schroeder S.G., Alkan C., Cardone M.F., Matukumalli L.K., Song J., Schnabel R.D., Ventura M., Taylor J.F., et al. Copy number variation of individual cattle genomes using next-generation sequencing. Genome Res, 2012, 22: 778-790

[8] Bickhart D.M., Xu L., Hutchison J.L., Cole J.B., Null D.J., Schroeder S.G., Song J., Garcia J.F., Sonstegard T.S., Van Tassell C.P., et al. Diversity and population-genetic properties of copy number variations and multicopy genes in cattle. DNA Res, 2016, 23: 253-262

[9] Busnelli M., Manzini S., Parolini C., Escalante-Alcalde D., Chiesa G. Lipid phosphate phosphatase 3 in vascular pathophysiology. Atherosclerosis, 2018, 271: 156-165

[10] Chaisson M.J., Tesler G. Mapping single molecule sequencing reads using basic local alignment with successive refinement (BLASR): application and theory. BMC Bioinf, 2012, 13: 238

[11] Chen L., Qiu Q., Jiang Y., Wang K., Lin Z., Li Z., Bibi F., Yang Y., Wang J., Nie W., et al. Large-scale ruminant genome sequencing provides insights into their evolution and distinct traits. Science, 2019, 364: eaav6202

[12] Chen N., Cai Y., Chen Q., Li R., Wang K., Huang Y., Hu S., Huang S., Zhang H., Zheng Z., et al. Whole-genome resequencing reveals world-wide ancestry and adaptive introgression events of domesticated cattle in East Asia. Nat Commun, 2018, 9: 2337

[13] Davis E., Jensen C.H., Schroder H.D., Farnir F., Shay-Hadfield T., Kliem A., Cockett N., Georges M., Charlier C. Ectopic expression of DLK1 protein in skeletal muscle of padumnal heterozygotes causes the callipyge phenotype. Curr Biol, 2004, 14: 1858-1862

[14] de Filippo C., Key F.M., Ghirotto S., Benazzo A., Meneu J.R., Weihmann A., Parra G., Green E.D., Andrés A.M. Recent selection changes in human genes under long-term balancing selection. Mol Biol Evol, 2016, 33: 1435-1447

[15] Dharmadhikari A.V., Kang S.H.L., Szafranski P., Person R.E., Sampath S., Prakash S.K., Bader P.I., Phillips J.A., Hannig V., Williams M., et al. Small rare recurrent deletions and reciprocal duplications in 2q21.1, including brain-specific ARHGEF4 and GPR148. Hum Mol Genet, 2012, 21: 3345-3355

[16] Dong Y., Zhang X., Xie M., Arefnezhad B., Wang Z., Wang W., Feng S., Huang G., Guan R., Shen W., et al. Reference genome of wild goat (Capra aegagrus) and sequencing of goat breeds provide insight into genic basis of goat domestication. BMC Genomics, 2015, 16: 431

[17] Elsik C.G., Tellam R.L., Worley K.C., Gibbs R.A., Muzny D.M., Weinstock G.M., Adelson D.L., Eichler E.E., Elnitski L., Guigó R., et al. The genome sequence of taurine cattle: a window to ruminant biology and evolution. Science, 2009, 324: 522-528

[18] Floris C., Rassu S., Boccone L., Gasperini D., Cao A., Crisponi L. Two patients with balanced translocations and autistic disorder: CSMD3 as a candidate gene for autism found in their common 8q23 breakpoint area. Eur J Hum Genet, 2008, 16: 696-704

[19] Fukumoto T., Zhu H., Nacarelli T., Karakashev S., Fatkhutdinov N., Wu S., Liu P., Kossenkov A.V., Showe L.C., Jean S., et al. N6-methylation of adenosine (m6A) of FZD10 mRNA contributes to PARP inhibitor resistance. Cancer Res, 2019, 79: 2812-2820

[20] Gao F., Ming C., Hu W., Li H. New software for the fast estimation of population recombination rates (FastEPRR) in the genomic era. G3, 2016, 6: 1563-1571

[21] Gupta M.K., Vadde R. Genetic basis of adaptation and maladaptation via balancing selection. Zoology, 2019, 136: 125693

[22] Hackmann T.J., Spain J.N. Invited review: ruminant ecology and evolution: perspectives useful to ruminant livestock research and production. J Dairy Sci, 2010, 93: 1320-1334

[23] Hauswirth R., Haase B., Blatter M., Brooks S.A., Burger D., Drogemuller C., Gerber V., Henke D., Janda J., Jude R., et al. Mutations in MITF and PAX3 cause “splashed white” and other white spotting phenotypes in horses. PLoS Genet, 2012, 8: e1002653

[24] Hindorff L.A., Sethupathy P., Junkins H.A., Ramos E.M., Mehta J.P., Collins F.S., Manolio T.A. Potential etiologic and functional implications of genome-wide association loci for human diseases and traits. Proc Natl Acad Sci USA, 2009, 106: 9362-9367

[25] International HapMap Consortium. A haplotype map of the human genome. Nature, 2005, 437: 1299-1320

[26] Irvin M.R., Wineinger N.E., Rice T.K., Pajewski N.M., Kabagambe E.K., Gu C.C., Pankow J., North K.E., Wilk J.B., Freedman B.I., et al. Genome-wide detection of allele specific copy number variation associated with insulin resistance in African Americans from the HyperGEN study. PLoS ONE, 2011, 6: e24052

[27] Jacobs L.C., Hamer M.A., Gunn D.A., Deelen J., Lall J.S., van Heemst D., Uh H.W., Hofman A., Uitterlinden A.G., Griffiths C.E.M., et al. A genome-wide association study identifies the skin color genes IRF4, MC1R, ASIP, and BNC2 influencing facial pigmented spots. J Invest Dermatol, 2015, 135: 1735-1742

[28] Johnsen J.M., Teschke M., Pavlidis P., McGee B.M., Tautz D., Ginsburg D., Baines J.F. Selection on cis-regulatory variation at B4galnt2 and its influence on von Willebrand factor in house mice. Mol Biol Evol, 2009, 26: 567-578

[29] Kang H.M., Sul J.H., Service S.K., Zaitlen N.A., Kong S.Y., Freimer N.B., Sabatti C., Eskin E. Variance component model to account for sample structure in genome-wide association studies. Nat Genet, 2010, 42: 348-354

[30] Kojo S., Tanaka H., Endo T.A., Muroi S., Liu Y., Seo W., Tenno M., Kakugawa K., Naoe Y., Nair K., et al. Priming of lineage-specifying genes by Bcl11b is required for lineage choice in post-selection thymocytes. Nat Commun, 2017, 8: 702

[31] Kong Y., Zhao L., Charette J.R., Hicks W.L., Stone L., Nishina P.M., Naggert J.K. An FRMD4B variant suppresses dysplastic photoreceptor lesions in models of enhanced S-cone syndrome and of Nrl deficiency. Hum Mol Genet, 2018, 27: 3340-3352

[32] Koren S., Rhie A., Walenz B.P., Dilthey A.T., Bickhart D.M., Kingan S.B., Hiendleder S., Williams J.L., Smith T.P.L., Phillippy A.M. De novo assembly of haplotype-resolved genomes with trio binning. Nat Biotechnol, 2018, 36: 1174-1182

[33] Leffler E.M., Gao Z., Pfeifer S., Ségurel L., Auton A., Venn O., Bowden R., Bontrop R., Wall J.D., Sella G., et al. Multiple instances of ancient balancing selection shared between humans and chimpanzees. Science, 2013, 339: 1578-1582

[34] Li Y., Chu J., Feng W., Yang M., Zhang Y., Zhang Y., Qin Y., Xu J., Li J., Vasilatos S.N., et al. EPHA5 mediates trastuzumab resistance in HER2-positive breast cancers through regulating cancer stem cell-like properties. FASEB J, 2019, 33: 4851-4865

[35] Liu G.E., Ventura M., Cellamare A., Chen L., Cheng Z., Zhu B., Li C., Song J., Eichler E.E. Analysis of recent segmental duplications in the bovine genome. BMC Genomics, 2009, 10: 571

[36] Liu G.E., Hou Y., Zhu B., Cardone M.F., Jiang L., Cellamare A., Mitra A., Alexander L.J., Coutinho L.L., Dell’Aquila M.E., et al. Analysis of copy number variations among diverse cattle breeds. Genome Res, 2010, 20: 693-703

[37] Lv F.H., Agha S., Kantanen J., Colli L., Stucki S., Kijas J.W., Joost S., Li M.H., Ajmone Marsan P. Adaptations to climate-mediated selective pressures in sheep. Mol Biol Evol, 2014, 31: 3324-3343

[38] Ma L., O’Connell J.R., VanRaden P.M., Shen B., Padhi A., Sun C., Bickhart D.M., Cole J.B., Null D.J., Liu G.E., et al. Cattle sex-specific recombination and genetic control from a large pedigree analysis. PLoS Genet, 2015, 11: e1005387

[39] Ma Y., Chen C., Wang Y., Wu L., He F., Chen C., Zhang C., Deng X., Yang L., Chen Y., et al. Analysis copy number variation of Chinese children in early-onset epileptic encephalopathies with unknown cause. Clin Genet, 2016, 90: 428-436

[40] Nattestad M., Schatz M.C. Assemblytics: a web analytics tool for the detection of variants from an assembly. Bioinformatics, 2016, 32: 3021-3023

[41] Naval-Sanchez M., Nguyen Q., McWilliam S., Porto-Neto L.R., Tellam R., Vuocolo T., Reverter A., Perez-Enciso M., Brauning R., Clarke S., et al. Sheep genome functional annotation reveals proximal regulatory elements contributed to the evolution of modern breeds. Nat Commun, 2018, 9: 859

[42] Norris B.J., Whan V.A. A gene duplication affecting expression of the ovine ASIP gene is responsible for white and black sheep. Genome Res, 2008, 18: 1282-1293

[43] Patterson N., Price A.L., Reich D. Population structure and eigenanalysis. PLoS Genet, 2006, 2: e190

[44] Perry G.H., Tchinda J., McGrath S.D., Zhang J., Picker S.R., Cáceres A.M., Iafrate A.J., Tyler-Smith C., Scherer S.W., Eichler E.E., et al. Hotspots for copy number variation in chimpanzees and humans. Proc Natl Acad Sci USA, 2006, 103: 8006-8011

[45] Petrovski S., Wang Q., Heinzen E.L., Allen A.S., Goldstein D.B. Genic intolerance to functional variation and the interpretation of personal genomes. PLoS Genet, 2013, 9: e1003709

[46] Quinlan A.R., Hall I.M. BEDTools: a flexible suite of utilities for comparing genomic features. Bioinformatics, 2010, 26: 841-842

[47] Raman S., Beilschmidt M., To M., Lin K., Lui F., Jmeian Y., Ng M., Fernandez M., Fu Y., Mascall K., et al. Structure-guided design fine-tunes pharmacokinetics, tolerability, and antitumor profile of multispecific frizzled antibodies. Proc Natl Acad Sci USA, 2019, 116: 6812-6817

[48] Redon R., Ishikawa S., Fitch K.R., Feuk L., Perry G.H., Andrews T.D., Fiegler H., Shapero M.H., Carson A.R., Chen W., et al. Global variation in copy number in the human genome. Nature, 2006, 444: 444-454

[49] Reimann F., Ashcroft F.M. Inwardly rectifying potassium channels. Curr Opin Cell Biol, 1999, 11: 503-508

[50] Repping S., van Daalen S.K.M., Brown L.G., Korver C.M., Lange J., Marszalek J.D., Pyntikova T., van der Veen F., Skaletsky H., Page D.C., et al. High mutation rates have driven extensive structural polymorphism among human Y chromosomes. Nat Genet, 2006, 38: 463-467

[51] Sarowar T., Chhabra R., Vilella A., Boeckers T.M., Zoli M., Grabrucker A.M. Activity and circadian rhythm influence synaptic Shank3 protein levels in mice. J Neurochem, 2016, 138: 887-895

[52] Schmittgen T.D., Livak K.J. Analyzing real-time PCR data by the comparative CT method. Nat Protoc, 2008, 3: 1101-1108

[53] Sedlazeck F.J., Rescheneder P., Smolka M., Fang H., Nattestad M., von Haeseler A., Schatz M.C. Accurate detection of complex structural variations using single-molecule sequencing. Nat Methods, 2018, 15: 461-468

[54] Ségurel L., Thompson E.E., Flutre T., Lovstad J., Venkat A., Margulis S.W., Moyse J., Ross S., Gamble K., Sella G., et al. The ABO blood group is a trans-species polymorphism in primates. Proc Natl Acad Sci USA, 2012, 109: 18493-18498

[55] Sharp A.J., Locke D.P., McGrath S.D., Cheng Z., Bailey J.A., Vallente R.U., Pertz L.M., Clark R.A., Schwartz S., Segraves R., et al. Segmental duplications and copy-number variation in the human genome. Am J Hum Genet, 2005, 77: 78-88

[56] Shenoy A.R., Wellington D.A., Kumar P., Kassa H., Booth C.J., Cresswell P., MacMicking J.D. GBP5 promotes NLRP3 inflammasome assembly and immunity in mammals. Science, 2012, 336: 481-485

[57] Siewert K.M., Voight B.F. BetaScan2: Standardized statistics to detect balancing selection utilizing substitution data. Genome Biol Evol, 2020, 12: 3873-3877

[58] Simpson J.K., Martinez-Queipo M., Onoufriadis A., Tso S., Glass E., Liu L., Higashino T., Scott W., Tierney C., Simpson M.A., et al. Genotype-phenotype correlation in a large English cohort of patients with autosomal recessive ichthyosis. Br J Dermatol, 2020, 182: 729-737

[59] Singhal S., Leffler E.M., Sannareddy K., Turner I., Venn O., Hooper D.M., Strand A.I., Li Q., Raney B., Balakrishnan C.N., et al. Stable recombination hotspots in birds. Science, 2015, 350: 928-932

[60] Smyth G.K., Speed T. Normalization of cDNA microarray data. Methods, 2003, 31: 265-273

[61] Snider J., Thibault G., Houry W.A. The AAA+ superfamily of functionally diverse proteins. Genome Biol, 2008, 9: 216

[62] Stamatakis A. RAxML version 8: a tool for phylogenetic analysis and post-analysis of large phylogenies. Bioinformatics, 2014, 30: 1312-1313

[63] Sudmant P.H., Mallick S., Nelson B.J., Hormozdiari F., Krumm N., Huddleston J., Coe B.P., Baker C., Nordenfelt S., Bamshad M., et al. Global diversity, population stratification, and selection of human copy-number variation. Science, 2015, 349: aab3761

[64] Sung Y.J., Pérusse L., Sarzynski M.A., Fornage M., Sidney S., Sternfeld B., Rice T., Terry J.G., Jacobs Jr D.R., Katzmarzyk P., et al. Genome-wide association studies suggest sex-specific loci associated with abdominal and visceral fat. Int J Obes, 2016, 40: 662-674

[65] Taberlet P., Coissac E., Pansu J., Pompanon F. Conservation genetics of cattle, sheep, and goats. Compt Rend Biol, 2011, 334: 247-254

[66] Tajima F. Statistical method for testing the neutral mutation hypothesis by DNA polymorphism. Genetics, 1989, 123: 585-595

[67] Thiesen S., Kübart S., Ropers H.H., Nothwang H.G. Isolation of two novel human RhoGEFs, ARHGEF3 and ARHGEF4, in 3p13-21 and 2q22. Biochem Biophys Res Commun, 2000, 273: 364-369

[68] Vavvas D.G., Small K.W., Awh C.C., Zanke B.W., Tibshirani R.J., Kustra R. CFH and ARMS2 genetic risk determines progression to neovascular age-related macular degeneration after antioxidant and zinc supplementation. Proc Natl Acad Sci USA, 2018, 115: E696-E704

[69] Vilà C., Seddon J., Ellegren H. Genes of domestic mammals augmented by backcrossing with wild ancestors. Trends Genet, 2005, 21: 214-218

[70] Walter K., Min J.L., Huang J., Crooks L., Memari Y., McCarthy S., Perry J.R.B., Xu C.J., Futema M., Lawson D., et al. The UK10K project identifies rare variants in health and disease. Nature, 2015, 526: 82-90

[71] Wang B., Chen L., Wang W. Genomic insights into ruminant evolution: from past to future prospects. Zool Res, 2019a, 40: 476-487

[72] Wang K., Li M., Hakonarson H. ANNOVAR: functional annotation of genetic variants from high-throughput sequencing data. Nucleic Acids Res, 2010, 38: e164

[73] Wang X., Liu J., Niu Y., Li Y., Zhou S., Li C., Ma B., Kou Q., Petersen B., Sonstegard T., et al. Low incidence of SNVs and indels in trio genomes of Cas9-mediated multiplex edited sheep. BMC Genomics, 2018, 19: 397

[74] Wang X., Zheng Z., Cai Y., Chen T., Li C., Fu W., Jiang Y. CNVcaller: highly efficient and widely applicable software for detecting copy number variations in large populations. GigaScience, 2017, 6

[75] Wang Y., Zhang C., Wang N., Li Z., Heller R., Liu R., Zhao Y., Han J., Pan X., Zheng Z., et al. Genetic basis of ruminant headgear and rapid antler regeneration. Science, 2019b, 364: eaav6335

[76] Wang Y.H., Reverter A., Kemp D., McWilliam S.M., Ingham A., Davis C.A., Moore R.J., Lehnert S.A. Gene expression profiling of Hereford Shorthorn cattle following challenge with Boophilus microplus tick larvae. Aust J Exp Agric, 2007, 47: 1397-1407

[77] Wu J., Saupe S.J., Glass N.L. Evidence for balancing selection operating at the het-c heterokaryon incompatibility locus in a group of filamentous fungi. Proc Natl Acad Sci USA, 1998, 95: 12398-12403

[78] Wu Q., Han T.S., Chen X., Chen J.F., Zou Y.P., Li Z.W., Xu Y.C., Guo Y.L. Long-term balancing selection contributes to adaptation in Arabidopsis and its relatives. Genome Biol, 2017, 18: 217

[79] Xu J., Shetty P.B., Feng W., Chenault C., Bast Jr R.C., Issa J.P.J., Hilsenbeck S.G., Yu Y. Methylation of HIN-1, RASSF1A, RIL and CDH13 in breast cancer is associated with clinical characteristics, but only RASSF1A methylation is associated with outcome. BMC Cancer, 2012, 12: 243

[80] Yang J., Niu H., Huang Y., Yang K. A systematic analysis of the relationship of CDH13 promoter methylation and breast cancer risk and prognosis. PLoS ONE, 2016, 11: e0149185

[81] Yang Z. PAML 4: phylogenetic analysis by maximum likelihood. Mol Biol Evol, 2007, 24: 1586-1591

[82] Zhang F., Gu W., Hurles M.E., Lupski J.R. Copy number variation in human health, disease, and evolution. Annu Rev Genom Hum Genet, 2009, 10: 451-481

[83] Zhang X., Xu Y., Liu D., Geng J., Chen S., Jiang Z., Fu Q., Sun K. A modified multiplex ligation-dependent probe amplification method for the detection of 22q11.2 copy number variations in patients with congenital heart disease. BMC Genomics, 2015a, 16: 364

[84] Zhang Z., Li C., Wu F., Ma R., Luan J., Yang F., Liu W., Wang L., Zhang S., Liu Y., et al. Genomic variations of the mevalonate pathway in porokeratosis. eLife, 2015b, 4: e06322

[85] Zuk O., Schaffner S.F., Samocha K., Do R., Hechter E., Kathiresan S., Daly M.J., Neale B.M., Sunyaev S.R., Lander E.S. Searching for missing heritability: designing rare variant association studies. Proc Natl Acad Sci USA, 2014, 111: E455-E464

  • Figure 1

    Analysis pipeline. Diagram of the pipeline used to identify species-shared CNVRs. Because of sequence similarity in the alignments, multiple shared regions with similar ranges may be reported; only the merged region was used for subsequent analysis.

  • Figure 2

    CNVR comparison among genomes of cattle, goat, and sheep. A, Geographic distribution of 886 genomes of ruminant livestock. B, Distribution of CNVRs across genomic regions based on the size of all CNVR sites. C, Interspecific comparison of CNVRs. For each species, the proportion of its CNVRs shared with none, one, or two other species is plotted. The genome sequences of cattle and sheep were compared to the autosomal genome sequence of goat ARS1, and some sequences that did not match were excluded from the calculation.

  • Figure 3

    (Color online) Validation and evaluation of genotypes by CNVplex. Three deletion (A–C) and three duplication (D–F) CNVRs with distinguishable copy numbers were genotyped using CNVplex in 56 sheep samples. Copy-number genotypes for the same animals were predicted by both CNVcaller and CNVplex; the two methods showed genotype concordance at confidently called sites. See Table S5 in Supporting Information for CNVplex confirmation results for 44 CNVRs in 56 sheep.

  • Figure 4

    Genome-wide screening and functional annotations of selected CNVRs in cattle (A), goat (B), and sheep (C). VST is plotted against the position on each of the 27 (sheep) or 30 (goat) chromosomes. The horizontal solid lines indicate the genome-wide threshold of selection signals, with the highly stratified genes having VST values ≥ 0.15. Selected CNVRs that overlapped with characterized GWAS loci are shown in Latitude (°), PR (mm), ATMP (°C), AMIT (°C), AMAT (°C), DTMM (°C), and ASD (h). Three GWAS results that overlapped with the strongly selected CNVs are shown. The Bonferroni significance threshold (cattle: 1.04×10⁻⁵, goat: 4.21×10⁻⁵, sheep: 2.18×10⁻⁵) is indicated by red horizontal dashed lines.

  • Figure 5

    Allele frequency distribution of CNVRs and SNPs among the three species. A, As allele frequency increases, the number of CNVRs declines, with a different trend in each species. B, As allele frequency increases, the number of SNPs declines, with a different trend in each species. C, Venn diagram showing unique and shared CNVRs (bp) among the three species. D, Random distribution of shared CNVRs under the assumed model; the observed species-shared CNVR is significantly higher than expected from 300 simulations (**, P<1.0×10⁻²). E, Venn diagram showing unique and shared SNPs among the three species; the numbers in brackets represent the total number of SNPs with the same location and genotype.

  • Figure 6

    Heatmaps of two CNVRs around the ASIP region in goat and sheep. The heatmaps depict two CNVRs under candidate balancing selection with the same starting and ending positions of the ASIP gene in the sheep and goat genomes. The X-axis values indicate the position and the Y-axis values indicate the sample count and different populations. Red solid line: the ASIP gene and its upstream and downstream genes; this region corresponds to the goat and sheep genomes, respectively. Green shaded region: the ~40 kb shared CNVR distribution of the goat and sheep genomes. Yellow shaded region: the ~190 kb shared CNVR distribution of the goat and sheep genomes (copy 1). Pink shaded region: the ~190 kb shared CNVR distribution of the goat and sheep genomes (copy 2). There were two ~190 kb CNVRs in the sheep genome.

  • Table 1   CNVRs differentiated between ruminant livestock populations a)

    | Size (kb) | Gene | Population comparison (VST) | Copy range | GWAS (P-value) | Function description |
    | | | BI: LH vs. HH (0.35) | | | Hybrid sterility gene (Ahlawat et al., 2016; Ma et al., 2015). |
    | | | BT: LH vs. HH (0.18) | | | Subcutaneous fat (Sung et al., 2016). |
    | | | BI: LH vs. HH (0.25); BT: LH vs. HH (0.37); All cattle: LH vs. HH (0.22) | | | KCNJ12 was reported to regulate insulin secretion (Reimann and Ashcroft, 1999). |
    | | | BT vs. BI (0.46); BT vs. BB (0.21) | | | Tick resistance in cattle (Bickhart et al., 2012; Wang et al., 2007). |
    | | | LH vs. HH (0.19) | | | ARHGEF4 is involved in several metabolic processes (Thiesen et al., 2000). The CNVs of this gene are associated with delayed development and neural alterations (Dharmadhikari et al., 2012), as well as resistance to insulin (Irvin et al., 2011). |
    | | | DOA vs. WOA (0.20) | | DTMM: 6.14×10⁻⁶; NDPR: 6.14×10⁻⁶ | CUB and Sushi multiple domains 3 (CSMD3) as a candidate gene for autism found in their common 8q23 breakpoint area (Sarowar et al., 2016). |
    | | | DOA vs. WOA (0.18) | | PR: 1.07×10⁻⁵ | GBP5 promotes NLRP3 inflammasome assembly and immunity in mammals (Kojo et al., 2017). |
    | | LOC101118284 (dist=172,535), PCDH20 (dist=501,727) | DOA vs. WOA (0.17) | | PR: 6.78×10⁻⁶ | Olfactory receptor 5W2-like. |
    | | BEGAIN (dist=145,698), DLK1 (dist=10,651) | CA vs. CI (0.29) | | DTMM: 6.72×10⁻⁶; PR: 2.65×10⁻⁸ | Ectopic expression of DLK1 protein in skeletal muscle of padumnal heterozygotes causes the callipyge phenotype (Davis et al., 2004). |
    | | | CA vs. CI (0.65); CH vs. CI (0.53) | | AMIT: 2.54×10⁻⁵ | Recessive forms of congenital ichthyosis encompass a group of rare inherited disorders of keratinization leading to dry, scaly skin (Simpson et al., 2020). |
    | | | CA vs. CI (0.98); CH vs. CI (0.86) | | AMAT: 1.37×10⁻⁶; AMIT: 1.17×10⁻⁶; ATMP: 9.09×10⁻⁷ | Lipid phosphate phosphatase 3 in vascular pathophysiology (Busnelli et al., 2018). |
    | | MITF (dist=223,910), FRMD4B (dist=142,279) | CA vs. CI (0.98); CH vs. CI (0.86) | | PR: 5.41×10⁻⁷; AMIT: 1.13×10⁻⁶; ATMP: 5.35×10⁻⁷ | Mutations in the MITF gene cause the splashed white phenotype in horses (Hauswirth et al., 2012). An FRMD4B variant suppresses dysplastic photoreceptor lesions (Kong et al., 2018). |

    BT: Bos taurus; BI: Bos indicus. BB: Bos taurus×Bos indicus. BG: Bos grunniens. DOA: Domestic Ovis aries. WOA: Wild Ovis aries; including Ovis musimon (mouflon), Ovis ammon (Argali), Ovis canadensis (Bighorn Sheep), and Ovis dalli (Thinhorn sheep). CA: Capra aegagrus. CI: including Capra falconeri (Markhor), Capra sibirica (Siberian ibex), and Capra ibex (Alpine ibex). CH: Capra hircus. LH: Low-height. HH: High-height. CNVRs intersecting genes that show a dramatic difference in copy number (as measured by VST and GWAS) between ruminant livestock populations. See Table S1 in Supporting Information for the definition of populations and the abbreviation of environmental parameters.