The evolution of whole-genome DNA methylation patterns in vertebrates
DNA methylation plays a role in a variety of biological processes, including embryonic development, imprinting, X-chromosome inactivation, and stem cell differentiation. Aberrant DNA methylation has frequently been reported to influence gene expression and subsequently cause various human diseases, especially cancer. One of our main interests is the evolution of whole-genome methylation patterns in vertebrates and how DNA methylation patterns affect genome evolution. In our previous studies, we found complementary regulation between DNA methylation (transcriptional level) and miRNA function (post-transcriptional level) in the human genome (Su et al. BMC Genomics, 2011; Su et al. Epigenetics, 2011). We also used whole-genome bisulfite sequencing to investigate genome-wide methylation in the sea lamprey and found that the transition of the whole-genome methylation pattern from invertebrates to vertebrates was a gradual process rather than an immediate, episodic event (Zhang et al. 2016).
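As a minimal sketch of how per-site methylation levels are commonly summarized from bisulfite-sequencing read counts (methylated reads divided by total reads covering a cytosine), the example below may help. It is an illustration under stated assumptions, not the pipeline used in the studies cited above: the function names, the coverage threshold, and the toy counts are all hypothetical.

```python
from typing import Dict, Tuple


def methylation_level(methylated: int, unmethylated: int) -> float:
    """Fraction of reads supporting methylation at one cytosine."""
    total = methylated + unmethylated
    if total == 0:
        raise ValueError("site has no read coverage")
    return methylated / total


def mean_methylation(
    sites: Dict[str, Tuple[int, int]], min_coverage: int = 5
) -> float:
    """Average methylation level over sites with enough coverage.

    `min_coverage` is an assumed, illustrative threshold.
    """
    levels = [
        methylation_level(m, u)
        for m, u in sites.values()
        if m + u >= min_coverage
    ]
    if not levels:
        raise ValueError("no site passes the coverage filter")
    return sum(levels) / len(levels)


# Hypothetical toy data: site -> (methylated reads, unmethylated reads).
toy_sites = {
    "chr1:100": (9, 1),   # heavily methylated
    "chr1:250": (0, 10),  # unmethylated
    "chr1:400": (2, 1),   # dropped: coverage below the threshold
}

print(mean_methylation(toy_sites))
```

Filtering low-coverage sites before averaging is one common design choice, since levels estimated from very few reads are noisy.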
Evo-Devo of the human heart
During the evolution from chordates to mammals, the heart evolved from a single-layered tube with peristaltic contractility to a more efficient and powerful pump with thick muscular chambers dedicated to receiving (atrial) and pumping (ventricular) blood, displaying synchronous contractions and seamless connections to a closed vascular system. A systematic study of how gene expression patterns evolved during this process is still lacking. We are especially interested in how the regulatory systems of heart development evolve. One aspect that we study intensively is how gene and genome duplications contributed to this evolutionary process. To address these questions, we analyzed heart transcriptome data from sea squirt, zebrafish, sea lamprey, tilapia, Japanese pufferfish, and green spotted pufferfish generated by next-generation sequencing, together with publicly available heart transcriptome data from 10 other species. Our studies may provide new insights into the evolution of vertebrate heart development, as well as new resources for further study.
Transcriptome evolution
Gene expression changes may underlie much of phenotypic evolution. Recent advances in RNA-seq technology have provided new insights into transcriptome evolution, especially the evolution of tissue-specific expression. We are interested in developing methods and tools to analyze RNA-seq-based gene expression data and integrate them into evolutionary models that reflect the regulatory wiring and modularity of biological systems, and ultimately in addressing outstanding questions in the evolution of complex phenotypes.
To maintain normal physiological functions, different tissues may impose different developmental constraints on expressed genes. We developed a stochastic model for genomic evolution under the principle of stabilizing selection and formulated the tissue-driven hypothesis, which postulates that stabilizing selection on both expression and sequence divergence may be affected simultaneously by common factors of the tissues in which the genes are expressed. Using extensive multispecies microarray data, we tested several genomic correlations predicted by the tissue-driven hypothesis. We conclude that tissue factors should be considered an important component in shaping the pattern of genomic evolution and correlations (Gu and Su, PNAS, 2007; Su et al. Ann Biomed Eng. 2007).
We studied the divergence of upstream and downstream regulatory networks between duplicate TFs in light of the ENCODE project. We found asymmetric evolution of upstream and downstream regulatory circuits between duplicate TFs (Zhou et al. Molecular Biology and Evolution, 2014). We have developed an R package, TreeExp, that can perform comparative expression evolution analysis based on RNA-seq count data (https://github.com/hr1912/TreeExp) (Gu et al. Genome Biology and Evolution, 2013; Ruan et al. submitted).
Cancer Evolutionary Genomics
Cancer cells evolve through random somatic mutations and epigenetic changes that may alter several crucial pathways, a process that is followed by the clonal selection of the resulting cells. Consequently, cancer cells can survive and proliferate under deleterious circumstances. Understanding these dynamic evolutionary processes will improve our understanding of the occurrence and progression of cancer as well as cancer diagnosis and therapy. We are interested in understanding cancer genomes from the perspective of identifying driver alterations and how cancer cells evolve over time. Our work involves the development of statistical models and computational approaches to analyze large, high-dimensional genomics and epigenomics data sets derived from tumours in order to describe the mutational landscapes of cancer subtypes and quantify clonal evolution and intratumoural heterogeneity.
The evolution of alternative splicing after gene duplication
Alternative splicing and gene duplication are two major sources of proteomic functional diversity. We conducted a comprehensive analysis to investigate the evolutionary pattern of alternative splicing after gene duplication. We observed that duplicate genes have fewer alternative splice (AS) forms than single-copy genes, and that a negative correlation exists between the mean number of AS forms and gene family size. Interestingly, we found that the loss of alternative splicing in duplicate genes may occur shortly after gene duplication. We therefore conclude that alternative splicing and gene duplication may not evolve independently, and we proposed a “function sharing model” to explain our observation (Su et al. 2006; Su and Gu, 2012). Our paper was highlighted in the ‘Leading Edge’ section of Cell (Cell, 2006, 124: 869).
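One way such a negative relationship between gene family size and the mean number of AS forms could be quantified is with a rank correlation. The sketch below computes a Spearman correlation using only the Python standard library. It is illustrative only: the numbers are hypothetical toy values rather than data from the cited papers, and the choice of Spearman's coefficient is an assumption, not necessarily the statistic used in the original analysis.

```python
import math
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(vx * vy)


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman correlation: Pearson correlation of the ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical toy data: family size vs. mean AS forms per gene.
family_size = [1, 2, 3, 4, 6, 8]
mean_as_forms = [3.1, 2.6, 2.7, 2.0, 1.8, 1.5]

rho = spearman(family_size, mean_as_forms)
print(f"Spearman rho = {rho:.3f}")
```

A rank-based coefficient is used here because it does not assume a linear relationship between family size and the number of AS forms.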
Gene duplication and the evolution of gene networks
How gene duplication has influenced the evolution of gene networks is one of the core problems in evolution. We constructed a model system consisting of human G protein-coupled receptors (GPCRs) and their downstream genes in the GPCR pathways. We found that the position of a gene in the gene network has a significant influence on the likelihood of fixation of its duplicates. However, for a superfamily, the influence was not uniform among subfamilies. For superfamilies such as GPCRs, whose genetic basis of expression diversity was well established in early vertebrates, continued expansions were most prominent in particular small subfamilies mainly involved in lineage-specific functions.
The contribution of gene duplication to vertebrate morphological complexity
The genetic basis underlying the evolutionary rise of multicellularity and internal complexity in vertebrates is of great interest and remains largely mysterious. We estimated the age distribution of vertebrate miRNA duplication events; the results suggest an evolutionary scenario in which gene/genome duplications in early vertebrates expanded protein-coding genes and miRNAs simultaneously. We further speculate that proteome doubling and alterations of the epigenetic (including miRNA) machinery may underlie the evolution of vertebrate complexity.
Genetic buffering and functional compensation between mouse duplicates
Knocking out a gene in an organism often has little phenotypic effect, owing to two mechanisms: the existence of duplicate genes and genetic buffering by the network. Functional compensation by duplicate (paralogous) genes has been shown to play an important role in genetic robustness in both yeast and nematode. However, the role and magnitude of the contribution of duplicate genes to genetic robustness in mammals remain controversial. We conducted an extensive and careful analysis and corroborated a strong effect of duplicate genes on mouse genetic robustness. Moreover, the effect of duplicate genes on mouse genetic robustness is duplication-age dependent, which holds after ruling out potential confounding effects (Su and Gu, JME, 2008). We are currently using knockout phenotypic data from several model organisms to study the evolution of gene essentiality.
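A simple way to frame the comparison described above is to contrast the proportion of essential genes (those with a lethal or sterile knockout phenotype) between singletons and duplicates. The sketch below does this for hypothetical counts. The class name, helper function, and numbers are illustrative assumptions and do not reproduce the published analysis, which also had to account for biases in which genes are chosen for knockout.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class KnockoutCounts:
    """Counts of knockout genes by phenotype for one gene class."""

    essential: int
    nonessential: int

    @property
    def proportion_essential(self) -> float:
        total = self.essential + self.nonessential
        if total == 0:
            raise ValueError("no knockout genes in this class")
        return self.essential / total


def odds_ratio(a: KnockoutCounts, b: KnockoutCounts) -> float:
    """Odds of essentiality in class `a` relative to class `b`."""
    if a.nonessential == 0 or b.essential == 0:
        raise ValueError("odds ratio is undefined for these counts")
    return (a.essential * b.nonessential) / (a.nonessential * b.essential)


# Hypothetical toy counts, for illustration only.
singletons = KnockoutCounts(essential=60, nonessential=40)
duplicates = KnockoutCounts(essential=45, nonessential=55)

print(f"PE(singletons) = {singletons.proportion_essential:.2f}")
print(f"PE(duplicates) = {duplicates.proportion_essential:.2f}")
print(f"odds ratio     = {odds_ratio(singletons, duplicates):.2f}")
```

An odds ratio above 1 in this toy setup would indicate that singletons are more often essential than duplicates, which is the pattern expected if duplicates provide functional compensation.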
47. Wang J, Chen X, He F,.....Su Z*, Wang C*. A developmental transcriptome atlas of Chinese mitten crab Eriocheir sinensis. Under review.
46. Wei Z, He F,......Ma H*, Su Z*, Liu Q*. Unexpected CRISPR off-target mutation pattern in vivo are not typically germline-like. bioRxiv, 2017:193565. doi: https://doi.org/10.1101/193565
45. Zhou Z, Zou Y, Liu G, Zhou J, Zhao S, Su Z*, Gu X*. Mutation-Profile-Based Methods for Understanding Selection Forces in Cancer Somatic Mutations: A Comparative Analysis. Oncotarget. 8(35): 58835–58846.
44. Zhou Z, Lyu X, Wu J, Yang X, Wu S, Zhou J, Gu X, Su Z*, Chen S*. 2017. TSNAD: an integrated software for cancer somatic mutation and tumour-specific neoantigen detection. R. Soc. open sci. 4: 170050. http://dx.doi.org/10.1098/rsos.170050
43. Tong K, Wang Y, Su Z*. 2017. Phosphotyrosine Signaling and the Origin of Animal Multicellularity. Proceedings of the Royal Society B: Biological Sciences. 284: 20170681 (Review article)
42. Sa Z, Zhou J, Zou Y*, Su Z*, Xun Gu*. 2017. Paralog Diverged Features May Help Reduce Off-target Effects of Drugs: Hints from Glucagon Subfamily Analysis. Genomics, Proteomics & Bioinformatics. pii: S1672-0229(17)30084-0. doi: 10.1016/j.gpb.2017.03.004.
41. Gu X.*, Ruan H., Su Z. and Zou Y. 2017. Brownian model of transcriptome evolution and phylogenetic network visualization between tissues. Mol Phylogenet Evol 114: 34-39.
40. Wang Y, Su Z*, Gu X*. 2017. What is the main mechanism of the origin of phosphorylation sites? Still an open question. Journal of Systematics and Evolution. 55 (3), 231-234.
38. Wang Y*, Tao X, Su Z, Liu A, Liu T, Sun L, Yao Q, Chen K, Gu X. 2016. Current bacterial gene encoding capsule biosynthesis protein CapI contains nucleotides derived from exonization. Evolutionary Bioinformatics 12: 303–312.
37. Ruan H, Su Z*, Gu X*. 2016. TREEEXP1.0: R Package for Analyzing Expression Evolution Based on RNA-Seq Data. Journal of Experimental Zoology Part B. 326(7):394-402
36. Zhang Z, Liu G, Zhou Y, et al. Genome-wide and single-base resolution DNA methylomes of the Sea Lamprey (Petromyzon marinus) Reveal Gradual Transition of the Genomic Methylation Pattern in Early Vertebrates. bioRxiv, 2015: 033233.
35. Zhou Z, Zou Y, Liu G, et al. A New Mutation-Profile-Based Method for Understanding the Evolution of Cancer Somatic Mutations. bioRxiv, 2015: 021147.
34. Shen L, Liu G, Zou Y, Zhou Z, Su Z, Gu X (2015) The Evolutionary Panorama of Organ-Specifically Expressed or Repressed Orthologous Genes in Nine Vertebrate Species. PLoS ONE 10(2): e0116872. doi:10.1371/journal.pone.0116872
33. Han L, Guo Y, Su Z, Zheng S, Lu Z. Advances in Computational Genomics. BioMed Research International. 2015 (2015), Article ID 187803 (Editorial)
32. Zhou Z, Zhou J, Su Z*, Gu X*. 2014. Asymmetric Evolution of Human Transcription Factor Regulatory Networks. Molecular Biology and Evolution. 31(8):2149–2155
31. Su Z*, Wang J, Gu X*. 2014. Effect of Duplicate Genes on Mouse Genetic Robustness: An Update. BioMed Research International, doi:10.1155/2014/758672
30. Liu G†, Zou Y†, Cheng Q, Zeng Y, Gu X, Su Z*. 2014. Age distribution patterns of human gene families: divergent for gene ontology categories and concordant between different subcellular localizations. Molecular Genetics and Genomics. 289 (2), 137-147
29. Gu X*, Zou Y, Su Z, Huang W, Zhou Z, Arendsee Z, Zeng Y. 2013. An Update of DIVERGE Software for Functional Divergence Analysis of Protein Family. Molecular Biology and Evolution. 30 (7), 1713-1719
28. Gu X*, Zou Y, Huang W, Shen L, Arendsee Z, Su Z*. 2013. Phylogenomic Distance Method for Analyzing Transcriptome Evolution based on RNA-seq Data. Genome Biology and Evolution. 5(9):1746-53.
27. Wang E, Sun S, Qiao B, Duan W, Huang G, An Y, Xu S, Zheng Y, Su Z. et al. 2013. Identification of Functional Mutations in GATA4 in Patients with Congenital Heart Disease. PloS one. 8 (4), e62138
26. Gu, X*., Zou, Y. and Su, Z. 2012. Gene Duplication and Functional Consequences. Y.Y. Shugart (ed.), Applied Computational Genomics, Translational Bioinformatics 1, DOI 10.1007/978-94-007-5558-1_9, Springer Science and Business Media, Dordrecht (Book Chapter)
25. Zou Y., Su Z. Huang W. and Gu, X*. 2012. Histone modification pattern evolution after yeast gene duplication. BMC Evol.Biol. 12: 111
24. Chen W., Su Z*. and Gu, X*. 2012. A note on gene pleiotropy estimation from phylogenetic analysis of protein sequences. Journal of Systematics and Evolution. doi: 10.1111/j.1759-6831.2012.00217.x
23. Su, Z. and Gu, X*. 2012. Revisit on the evolutionary relationship between alternative splicing and gene duplication. Gene. 504(1):102-6.
22. Su, Z., Huang, W and Gu, X*. 2011. Comment on "Positive Selection of Tyrosine Loss in Metazoan Evolution". Science. 332(6032):917.
21. Su, Z†., Xia, J†. and Zhao, Z*. 2011. Functional complementation between transcriptional methylation regulation and post-transcriptional microRNA regulation in the human genome. BMC Genomics. 12(Suppl 5):S15.
20. Su Z†, Xia J† and Zhao Z*. 2011. Do microRNAs preferentially target the genes with low DNA methylation level at the promoter region? Lecture Notes in Bioinformatics (LNBI). 6840:253-258
19. Su Z., Han L. and Zhao Z*. 2011. Conservation and divergence of DNA methylation in eukaryotes: new insights from single base-resolution DNA methylomes. Epigenetics 6(2): 134-140.
18. Su, Z†., Zeng, Y† and Gu, X*. 2010. A preliminary analysis of gene pleiotropy estimated from protein sequences. JEZ Part B: Molecular and Developmental Evolution. 314(2):115-22.
17. Zou, Y†., Su, Z†., Yang, J., Zeng, Y., Gu, X*. 2009. Uncovering Genetic Regulatory Network Divergence between Duplicate Genes Using Yeast eQTL Landscape. JEZ Part B: Molecular and Developmental Evolution. 312: 722-733
16. Su Z., Xu L., Gu Z and Gu X.* 2009. Origins of digestive RNases in leaf monkeys are an open question. Mol Phylogenet Evol. (Reply) 53:610-611.
15. Cao, J., Huang, S., Yi P., Qian J., Jin L., Su Z. et al., 2009. Evolution of the class C GPCR Venus flytrap modules involved positive selected functional divergence. BMC Evolutionary Biology. 9:67.
14. Huang Y., Zheng Y., Su Z* and Gu X*., 2009, Differences in duplication age distributions between human GPCRs and their downstream genes from a network prospective. BMC Genomics 10 (Suppl 1): S14.
13. Xu L†., Su Z†., Gu Z, Gu X. 2009. Evolution of RNases in leaf monkeys: Being parallel gene duplications or parallel gene conversions is a problem of molecular phylogeny. Molecular Phylogenetics and Evolution, 50:397-400
12. Chen Q, Su Z, Zhong Y, and Gu, X*. 2009. Effect of site-specific heterogeneous evolution on phylogenetic reconstruction: A simple evaluation. Gene. 441(1-2):156-62
11. Gu X*., Su Z*., and Huang Y., 2009. Simultaneous Expansions of MicroRNAs and Protein-coding Genes by Gene/Genome Duplications in Early Vertebrates. JEZ Part B: Molecular and Developmental Evolution. 312(3):164-170.
10. Su, Z. and Gu, X*. 2008. Predicting the Proportion of Essential Genes in Mouse Duplicates based on Biased Mouse Knockout Genes. J Mol Evol. 67(6):705-709.
9. Gu, X*. and Su, Z. 2007. Tissue-Driven Hypothesis of Genomic Evolution and Sequence-Expression Correlations. PNAS 104: 2779-2784
8. Su, Z†., Huang, Y† and Gu, X*. 2007. Tissue-driven hypothesis with Gene Ontology (GO) Analysis. Ann Biomed Eng. 35(6):1088-94
7. Su Z†, Wang J†, Yu J, Huang X and Gu X*. (2006) Evolution of Alternative Splicing after Gene Duplication. Genome Research, Vol. 16(2), 182~189
6. Su Z, Zhang B, Zeng Y, et al. 2006. Gene expression profiling in porcine mammary gland during lactation and identification of breed- and developmental-stage-specific genes. Science in China Series C-Life Sciences, Vol. 49, No. 1, 26~36
5. Gu X* and Su Z. 2005. Web-based resources for comparative genomics. Human Genomics, Vol. 2, No. 3, 187~190 (Review article)
4. Su Z., as one of the listed authors. 2005. The genomes of Oryza sativa: A history of duplications. PLoS Biol 3(2): e38.
3. Zeng Y, Fu Y, Zhang B, Su Z, et al. 2004. Analysis of gene expression profiles in the heart tissues of two breeds of porcines. Acta Genetica Sinica, Vol. 31(6), 565~571
2. Zhang B, Jin W, Zeng Y, Su Z, et al. 2004. EST-based Analysis of Gene Expression in the Porcine Brain. Genomics Proteomics & Bioinformatics, Vol.2, No. 4, 237~244
1. Su Z., as one of the listed authors. 2003. Find of SARS Coronavirus CT Genotype and Its Characterization. Zhejiang Prev Med 15(8): 3~5 (in Chinese)
Associate Professor: 12/2009 – 11/2014 @School of Life Sciences, Fudan University
Postdoctoral Fellow: 02/2010 – 02/2011 @Vanderbilt University School of Medicine, Nashville
Ph.D. in Bioinformatics: 09/2001 – 09/2006 @Zhejiang University
B.S. in Biotechnology: 09/1995 – 07/1999 @Zhejiang University
Postdoctoral Fellow: 02/2014 – 02/2016 @University of California, Berkeley
Assistant Engineer: 07/2006 – 12/2010 @School of Life Sciences, Fudan University
Ph.D. in Bioinformatics: 09/2008 – 06/2011 @Fudan University
M.S. in Bioinformatics: 09/2004 – 06/2006 @Zhejiang University
B.S. in Horticulture: 09/2000 – 06/2004 @Zhejiang University