Machine Learning Analysis of Enhancer Gene Variation Unlocks the Evolution of Complex Phenotypes in Mammals

Machine learning is used to relate variation in enhancer sequences across mammals to complex phenotypes

Differences in protein-coding sequence alone cannot fully explain the evolution of mammalian phenotypes. Changes in gene transcription may also have shaped the evolution of these phenotypes, but the enhancers involved in their development remain largely unknown. Methods that identify such enhancers by sequence conservation have limitations, because enhancer activity can be conserved even when the underlying nucleotides are not: enhancer sequences change at a high rate. Even so, the same combination of transcription factor binding sites and other sequence features can persist over millions of years, which allows the enhancer’s function to be preserved in a specific cell type or tissue. Experimentally measuring the function of orthologous enhancers in dozens of species is not feasible. New machine-learning methods, however, allow us to predict enhancer function in specific tissues or cell types across species.
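The gap between nucleotide conservation and conserved motif content can be illustrated with a toy sketch. Everything here is an assumption made for illustration: the two sequences, the motif strings, and the names `TF_A` and `TF_B` are hypothetical, and exact string matching is a crude stand-in for the sequence models used in practice. It is not the method described above, only a way to see why position-by-position comparison can miss shared regulatory content.

```python
# Toy illustration only: the sequences and motifs below are invented.

def percent_identity(a, b):
    """Fraction of positions at which two equal-length sequences match."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def motif_counts(seq, motifs):
    """Count occurrences (overlaps allowed) of each motif string in seq."""
    counts = {}
    for name, motif in motifs.items():
        last_start = len(seq) - len(motif)
        counts[name] = sum(
            seq.startswith(motif, i) for i in range(last_start + 1)
        )
    return counts


# Hypothetical motifs standing in for transcription factor binding sites.
MOTIFS = {"TF_A": "GATA", "TF_B": "CACGTG"}

# Two hypothetical orthologous enhancers: the same motifs occur in both,
# but in a different order and with different flanking nucleotides.
species_1 = "AAGATATTTTCACGTGCC"
species_2 = "CACGTGCCGGTTGATAGG"

identity = percent_identity(species_1, species_2)
counts_1 = motif_counts(species_1, MOTIFS)
counts_2 = motif_counts(species_2, MOTIFS)

print(f"positional identity: {identity:.2f}")
print("motif counts, species 1:", counts_1)
print("motif counts, species 2:", counts_2)
```

In this contrived pair the positional identity is low while the motif counts are identical. A conservation-based method sees only the first signal, whereas a model that learns sequence features could in principle pick up the second.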