Drivers of metabolic diversification: how dynamic genomic neighborhoods generate new biosynthetic pathways in the Brassicaceae.
Plants produce an array of specialized metabolites with important ecological functions. The mechanisms underpinning the evolution of new biosynthetic pathways are not well understood. Here, we exploit available genome sequence resources to investigate triterpene biosynthesis across the Brassicaceae. Oxidosqualene cyclases (OSCs) catalyse the first committed step in triterpene biosynthesis. Systematic analysis of 13 sequenced Brassicaceae genomes was performed to identify all OSC genes. The genome neighbourhoods (GNs) around a total of 163 OSC genes were investigated to identify Pfam domains significantly enriched in these regions. All-vs-all comparisons of OSC neighbourhoods and phylogenomic analysis were used to investigate the sequence similarity and evolutionary relationships of the numerous candidate triterpene biosynthetic gene clusters (BGCs) observed. Functional analysis of three representative BGCs was carried out and their triterpene pathway products were elucidated. Our results indicate that plant genomes are remarkably plastic, and that dynamic GNs generate new biosynthetic pathways in different Brassicaceae lineages by shuffling the genes encoding a core palette of triterpene-diversifying enzymes, presumably in response to strong environmental selection pressure. These results illuminate a genomic basis for diversification of plant specialized metabolism through natural combinatorics of enzyme families, which can be mimicked using synthetic biology to engineer diverse bioactive molecules.
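The enrichment step mentioned above (finding Pfam domains over-represented in the genome neighbourhoods of OSC genes) can be illustrated with a one-sided hypergeometric test. This is only a minimal sketch: the abstract does not state which statistical test was used, so the choice of test is an assumption, as are the function name `hypergeom_enrichment_p`, its parameters, and the counts in the usage note. None of them are values from the study.

```python
from math import comb


def hypergeom_enrichment_p(k, n, K, N):
    """Return P(X >= k) for X ~ Hypergeometric(N, K, n).

    Hypothetical interpretation for a single Pfam domain:
      N: total number of genes in the genome(s) considered
      K: genes in that background carrying the domain
      n: genes falling inside OSC genome neighbourhoods
      k: neighbourhood genes carrying the domain
    A small p-value suggests the domain is enriched near OSC genes.
    """
    if not (0 <= K <= N and 0 <= n <= N and 0 <= k <= n):
        raise ValueError("counts must satisfy 0 <= K, n <= N and 0 <= k <= n")
    upper = min(n, K)  # X cannot exceed the sample size or the number of carriers
    total = comb(N, n)  # number of ways to draw n genes from N
    # Sum the probability mass for k, k+1, ..., upper domain-carrying genes.
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / total
```

For example, with purely illustrative counts, `hypergeom_enrichment_p(3, 5, 3, 10)` gives the probability that all 3 domain-carrying genes in a 10-gene background land in a 5-gene neighbourhood by chance. In a real analysis, one such test per Pfam domain would normally be followed by a multiple-testing correction.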