Ancestral sequence reconstruction of glycopeptide antibiotic (GPA) DNA
Sequences from all GPA BGCs available by May 2018 (from full genomes and the chloroeremomycin plasmid sequences, see Supplementary Table S1 for strains and accession numbers) were used for the ancestral sequence reconstruction. GPA BGCs were identified using antiSMASH 441. While the actual sequence reconstruction was performed based on nucleotide sequences, the guide tree for ancestral sequence reconstruction was based on protein sequences.
Horizontal gene transfer of the GPA BGCs
To infer the evolutionary history of the GPA BGCs, we compared the phylogeny of the GPA producer organisms with the phylogeny of the BGCs. A multilocus sequence tree of the producer organisms was generated by autoMLST42. A gene tree using the concatenated protein sequences of the GPA BGCs (see “Construction of the guide tree”) was compared to the species tree using the tanglegram algorithm in Dendroscope 343 (Fig. 3). Two branches in the NRPS gene tree are not connected to the species tree, the chloroeremomycin BGC from Amycolatopsis orientalis A82846 and the complestatin BGC from Streptomyces lavendulae, since no full genomes were available for these strains. The full trees and tanglegram are available as supplementary data.
Construction of the guide tree
For the guide tree, protein sequences from NRPS genes containing modules 1–7 were used. For each module, a robust alignment was generated with MAFFT-E-INSi44 implemented in Geneious 9.1.6 (https://www.geneious.com) using the default settings (blosum62 as scoring matrix for the amino acid sequence and a gap open penalty of 1.53). Trimming was performed with GBlocks using the web interface provided by phylogeny.fr at default settings45,46. Trimmed alignments for each module were then concatenated. Tree building was performed with RAxML implemented in Geneious 9.1.6 using the blosum62 matrix and the GAMMA model of rate heterogeneity47. A midpoint rooted tree and an outgroup rooted tree, using the NRPS genes from Nonomuraea sp. ATCC 55076 (kistamicin) and Streptomyces lavendulae (complestatin) as outgroup, were built and compared (Supplementary Fig. S1). Since the midpoint rooted tree differed only marginally from the outgroup rooted tree, and this did not affect tree topology at the node used for ancestral sequence reconstruction, the midpoint rooted tree was used as guide tree for sequence reconstruction and ancestral state analysis. The paleomycin GPA sequence was reconstructed at node N1 (Supplementary Fig. S2).
For the ancestral sequence reconstruction, nucleotide sequences from the NRPS genes were used in a codon alignment. Each gene was aligned separately. In the case of the van/pek (type I) GPAs, modules 1–3 are merged on a single gene. Here, modules 1 and 2 were aligned with the first NRPS gene and module 3 was aligned with the second NRPS gene. Furthermore, the sequences of the MbtH-like genes were aligned and used for reconstruction of the MbtH ancestor.
For the sequence alignment, three different programs were compared since they were reported to exhibit the best performance for ancestral sequence reconstruction accuracy48: (1) MAFFT E-INSi protein alignment used as a basis for codon alignment using a pal2nal.pl script44,49. (2) MUSCLE protein alignment used as a basis for codon alignment using a pal2nal.pl script49,50. (3) Prank codon alignment51. All alignment programs were run under default settings.
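The protein-guided codon alignment used in options (1) and (2) can be illustrated with a minimal Python sketch. This is not the pal2nal.pl implementation: the function name `protein_to_codon_alignment` is hypothetical, and the sketch assumes that the coding sequence is in frame, contains no stop codon, and is exactly three times as long as the ungapped protein sequence.

```python
def protein_to_codon_alignment(aligned_protein: str, cds: str, gap: str = "-") -> str:
    """Thread an unaligned coding sequence onto its gapped protein alignment.

    Every residue in the protein alignment is replaced by its codon and
    every gap by three gap characters (assumption: in-frame CDS, no stop codon).
    """
    n_residues = sum(1 for aa in aligned_protein if aa != gap)
    if len(cds) != 3 * n_residues:
        raise ValueError("CDS length must be three times the number of residues")

    codons = []
    pos = 0  # current position in the unaligned CDS
    for aa in aligned_protein:
        if aa == gap:
            codons.append(gap * 3)
        else:
            codons.append(cds[pos:pos + 3])
            pos += 3
    return "".join(codons)


# Illustrative input only, not data from this study
print(protein_to_codon_alignment("M-K", "ATGAAA"))
```

Because the gaps are inherited from the protein alignment, codons are never split, which is the property required for downstream codon-based reconstruction.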
Ancestral sequences were reconstructed with FastML52. Marginal reconstruction was chosen here, since it is considered best for a “sequence-centric” task48. JC69 (default) was used as the substitution model for tree inference and sequence reconstruction. As the reconstruction algorithm either considers all or no indels, sequences were manually trimmed according to the following criteria: (i) If insertions were present in only 1–7 sequences, the insertions were removed. (ii) If insertions were present in more sequences, tree phylogeny was considered to decide whether the insertion was likely part of the ancestral sequence or was only present in sublineages that likely evolved later. (iii) If no decision was possible based on the former two criteria, insertions were trimmed in accordance with A. japonicum MG417-CF17 as a reference. The best alignment algorithm for sequence reconstruction was chosen after visual inspection of the differences in the three reconstructed sequences. All trees are available as supplementary data on Zenodo (10.5281/zenodo.8410710) [https://zenodo.org/records/8410710].
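Trimming criterion (i) was applied manually, but the rule itself can be sketched in Python. The function name `drop_rare_insertions` and the column-wise treatment are assumptions made for illustration; for a codon alignment the same rule would have to be applied to whole codons, and criteria (ii) and (iii) require manual judgement that is not modelled here.

```python
def drop_rare_insertions(alignment: list, max_carriers: int = 7, gap: str = "-") -> list:
    """Remove alignment columns in which at most `max_carriers` sequences
    carry a residue (i.e. insertions present in only a few sequences)."""
    if not alignment:
        return []
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")

    # Keep a column only if more than `max_carriers` sequences are non-gap there
    keep = [
        col for col in range(length)
        if sum(seq[col] != gap for seq in alignment) > max_carriers
    ]
    return ["".join(seq[col] for col in keep) for seq in alignment]


# Toy alignment with a threshold of 1 carrier (illustrative only)
toy = ["A-CT", "AGCT", "A-CT"]
print(drop_rare_insertions(toy, max_carriers=1))
```

With the default threshold of 7, the rule is only meaningful for alignments that contain clearly more than seven sequences.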
Ancestral state reconstruction
Ancestral state reconstruction was performed for all biosynthesis genes in the GPA clusters that were not conserved in all GPA producer strains. This comprised genes encoding the P450 monooxygenase OxyE, halogenases, glycosyltransferases, acyltransferases, methyltransferases, sulfotransferases, the genes for vancosamine biosynthesis (evaA-E), a β-hydroxylase gene, and the three-gene cassette to produce β-hydroxytyrosine (bhp, bpsD and oxyD). Regulators, transporters, and resistance genes were not considered. Furthermore, the ancestral states were reconstructed for the NRPS genes considering either the state of three NRPS genes with a GPA backbone of five aromatic and two aliphatic amino acids (type I GPAs) or the state of four genes with a GPA backbone of seven aromatic amino acids (type II-IV GPAs). Ancestral states were reconstructed with Mesquite 3.70, using the maximum likelihood algorithm with a Markov k-state 1 parameter model53 and the concatenated NRPS tree as guide tree. To identify the closest relatives to the GPA biosynthesis genes, BLAST search was used under default settings, in some cases extending the search to 250 hits54. For characterisation of the glycosyltransferases, the CAZy (Carbohydrate Active Enzymes) database was used as a reference55. Here, GT1 group genes with a characterised function were used to construct a phylogenetic tree. Sequences were aligned using MAFFT e-INS-i. Phylogenetic trees were calculated using IQtree56. Substitution models were chosen based on a preceding model test57 and bootstrapping was performed using UFBoot for ultrafast bootstrapping58.
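For orientation, the transition probabilities of a Markov k-state 1 parameter (Mk1) model can be written down in a few lines of Python. This is a generic textbook sketch, not Mesquite code: the function name is hypothetical, `rate` is assumed to be the instantaneous rate of change to each other state, and k = 2 is assumed for a presence/absence character such as a biosynthesis gene.

```python
import math


def mk1_transition_probability(same_state: bool, rate: float,
                               branch_length: float, k: int = 2) -> float:
    """Probability of ending in the same (or in one specific different) state
    after `branch_length`, under a symmetric k-state model with a single rate."""
    decay = math.exp(-k * rate * branch_length)
    if same_state:
        return 1.0 / k + (k - 1) / k * decay
    return 1.0 / k - decay / k


# Illustrative values only: gene presence/absence along one branch
p_stay = mk1_transition_probability(True, rate=0.5, branch_length=1.0)
p_change = mk1_transition_probability(False, rate=0.5, branch_length=1.0)
print(p_stay, p_change)
```

With k = 4 the same expression gives the JC69 nucleotide model, so one rate parameter governs all changes in both cases.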
Recombination in GPAs
To identify recombination in the GPA NRPS genes, the average number of nucleotide differences per site between two sequences (π) was calculated using the sliding window analysis implemented in DnaSP659. Modules 1–3 of the GPA NRPS genes were pairwise aligned using MAFFT e-INS-i. The window size was chosen to be 100 nt with a step size of 25 nt (Fig. S28). Additionally, a window size of 300 nt with a step size of 150 nt was analysed, as previously described by ref. 28 (Supplementary Data).
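The sliding window calculation for a pairwise alignment can be sketched as follows. This is a simplified illustration rather than the DnaSP implementation: the function name is hypothetical, windows are defined on alignment columns, and sites with a gap in either sequence are simply excluded, which may differ from how DnaSP handles gaps.

```python
def sliding_window_pi(seq_a: str, seq_b: str, window: int = 100,
                      step: int = 25, gap: str = "-") -> list:
    """Return (window start, differences per compared site) for two aligned
    sequences; windows without any comparable site yield None."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    seq_a, seq_b = seq_a.upper(), seq_b.upper()

    results = []
    for start in range(0, len(seq_a) - window + 1, step):
        pairs = [
            (a, b)
            for a, b in zip(seq_a[start:start + window], seq_b[start:start + window])
            if a != gap and b != gap
        ]
        if not pairs:
            results.append((start, None))
            continue
        differences = sum(a != b for a, b in pairs)
        results.append((start, differences / len(pairs)))
    return results


# Toy example with a 4 nt window and 2 nt step (illustrative only)
print(sliding_window_pi("AAAAAAAA", "AAATAAAA", window=4, step=2))
```

Abrupt changes in π along the alignment are the signal of interest, since they indicate regions with a different evolutionary history.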
Phylogenetics and ancestral sequence reconstruction of A-domains
A protein multiple sequence alignment (MSA) was generated for the A-domains of each homologous module using MAFFT, with blosum62 as the scoring matrix for the amino acid sequence, a gap open penalty of 1.53, and a gap extension penalty of 0.123. This resulted in seven alignments44. The protein alignment was further used to generate a codon alignment through PAL2NAL, a Perl script that converts protein MSAs and corresponding DNA sequences into codon alignments49. The resultant MSAs were manually trimmed and concatenated, leading to two MSAs: (I) a protein MSA comprising all A-domains from modules 1–7 for each strain, and (II) the corresponding codon MSA. These two alignments then served as the basis for constructing a phylogenetic tree using IQ-tree version 1.6.3. IQ-tree employs a fast and efficient maximum likelihood tree search algorithm for data set processing60,61. For the phylogeny construction, JTT + F + R5 was chosen as the best-fit substitution model, as determined by the in-built model finder in IQ-tree57. Bootstrapping was executed with 500 repetitions using UFBoot for ultrafast bootstrapping62. The trees were subsequently visualised with Dendroscope43.
For ancestral sequence reconstruction, all A-domain sequences from module 1 (Supplementary Fig. S45) underwent realignment using webPRANK—a phylogeny-aware multiple sequence aligner—with the previously generated phylogeny as input63. This step was carried out to model gap patterns more accurately, as a phylogeny-aware aligner recognises insertions and deletions as distinct evolutionary events. The phylogenetic tree, together with the webPRANK-aligned protein sequences from module 1, was used for ancestral sequence reconstruction using FastML52. The marginal reconstruction employed the “yang” substitution model, which is the default for codon sequences.
Chemicals and reagents
All strains and plasmids used in this work are listed in Table S2. Oligonucleotides were synthesised either by Integrated DNA Technologies (Leuven, Belgium) or by Sigma Aldrich (Castle Hill, Australia) (Table S3). Sanger sequencing was performed at either Eurofins Genomics, Ebersberg, Germany or Garvan Molecular Genetics, Darlinghurst, Australia. Q5 Hi-Fidelity DNA polymerase (Thermo Fisher Scientific) or Phusion® High-Fidelity DNA Polymerase (NEB) was used for PCR screening and PCR amplification. PeqGOLD Plasmid Miniprep Kit II and PeqGOLD bacterial DNA Kit (VWR Life Science) were used for plasmid purification and isolation of genomic DNA. In-fusion cloning was performed using the kit from Takara Bio. Stellar™ Competent Cells were provided by Takara Bio.
Restriction enzymes were purchased from Thermo Fisher Scientific. Required reagents were obtained from Difco, Sigma-Aldrich, Merck, or Chem Supply.
Generation and isolation of ristomycin/paleomycin hybrid (Fig. 31)
Cultivation of bacterial strains
Escherichia coli Nova blue was used for cloning purposes, and the methylation-deficient strain E. coli ET12567 pUB307 was used for intergeneric conjugation.
Amycolatopsis japonicum, the ristomycin A producer, was used to generate the deletion mutant A. japonicum DI (this work)15.
E. coli strains were grown in Luria broth (LB) medium supplemented with 100 μg.mL−1 apramycin or 100 μg.mL−1 hygromycin for selective pressure at 37 °C. Liquid cultures of A. japonicum/A. japonicum DI/A. japonicum p3SV/A. japonicum DI_nrpsanc_bbr_BHH were cultivated in 50 mL or 100 mL Amycolatopsis production medium (20 g.L−1 glucose, 20 g.L−1 galactose, 10 g.L−1 Bacto Soytone, 2 g.L−1 (NH4)2SO4, 2 g.L−1 CaCO3 in 1 L water, pH 7.4) in 100 mL or 500 mL Erlenmeyer flasks with steel springs at 29 °C and 180 rpm for 3 to 5 days. Liquid/solid media were supplemented with 50 μg.mL−1 apramycin and/or with 25 μg.mL−1 hygromycin to select for strains carrying integrated antibiotic-resistance genes.
Integration of the Sp44* promoter upstream of the ancestral NRPS genes
The integration of the strong artificial Sp44* promoter64 (Supplementary Fig. S32) was performed by in-fusion cloning using the kit from Takara Bio, according to the manufacturer’s protocol. The Sp44* promoter was amplified using the primers SP_OH_Fw/SP_OH_Rv (Supplementary Table S3), and the pDM (p3SV_nrpsanc) was linearised with EcoRV.
Deletion of rpsA-rpsD in A. japonicum to construct A. japonicum DI
To delete the NRPS genes rpsA-rpsD, the deletion plasmid pGUSA21_Risto_KO was constructed. For this purpose, the vector pGUSA21 was used, containing the gus (β-glucuronidase) gene as selection marker together with the upstream and downstream flanking regions of rpsA and rpsD, fragments with sizes of 1.3 and 1.5 kb, respectively. The fragments were amplified by PCR using the genomic DNA of A. japonicum as template and the primers Risto_KO_UP fw, Risto_KO_UP rv, Risto_KO_DO fw, and Risto_KO_DO rv, containing NdeI/XbaI and SphI/HindIII restriction sites at the 3′ and 5′ ends (Supplementary Table S3). The PCR amplicons Risto_KO_UP and Risto_KO_DO were separately introduced into the pJET blunt vector, yielding pJET_Risto_KO_UP and pJET_Risto_KO_DO. The fragments were excised from pJET using NdeI/XbaI for Risto_KO_UP and SphI/HindIII for Risto_KO_DO and cloned into pGUSA21, resulting in pGUSA21_Risto_KO. pGUSA21_Risto_KO was transferred into E. coli ET12567 pUB307 and finally into A. japonicum by intergeneric conjugation. Integration of the plasmid was determined by blue-white screening by plating the transconjugants on MS plates containing 20 mM X-gluc (5-bromo-4-chloro-1H-indol-3-yl-β-D-glucuronic acid). The transconjugants containing pGUSA21_Risto_KO (single crossover mutant) were used for the generation of the in-frame deletion mutant A. japonicum DI (S2). To generate an A. japonicum DI gene deletion mutant in which a double crossover event via the second cloned fragment had occurred, colonies of A. japonicum_pGUSA21_Risto_KO were cultivated for two days in 50 mL R5 medium at 29 °C and 180 rpm under apramycin selection. Afterwards, the mycelium was washed and used to inoculate 50 mL fresh R5 medium without antibiotic selection, which was cultivated at 37 °C and 180 rpm for 24 h. The cultures were then centrifuged. The mycelium was fragmented by incubation with lysozyme for 15 min and the protoplasts were prepared as described by Matsushima and Baltz2.
Diluted protoplasts were plated on MS agar plates overlaid with 20 mM X-gluc and white colonies (Supplementary Fig. S33) were picked on new plates. Once double cross‐over recombinants were obtained, total DNA was isolated from selected clones and the targeted regions were amplified by PCR (Supplementary Fig. S34).
Transfer of pIJ_bbr into A. japonicum DI
For the overexpression of the pathway-specific StrR-like regulator under the control of the constitutive promoter ermEp*65, the integrative pIJ_bbr vector (Lab Stock)66 was transferred into A. japonicum DI by intergeneric conjugation. The transconjugants of A. japonicum carrying the pIJ_bbr plasmid were selected on hygromycin plates and confirmed by PCR (Supplementary Fig. S35) using the primer pair pIJ fw/pIJ rv.
Optimisation of A. japonicum DI_bbr for heterologous expression of GPA BGCs
To further optimise A. japonicum DI_bbr for paleomycin production, the genes responsible for β-hydroxytyrosine biosynthesis were deleted (Supplementary Fig. 37) and replaced by two genes encoding a β-hydroxylase and a halogenase from Nonomuraea gerenzanensis, resulting in A. japonicum DI_bbr_BHH. To achieve this replacement, we implemented the same strategy as for the deletion of the NRPS genes (Supplementary Fig. S33). To amplify the corresponding fragments, the primers Aj-32060 fw, Aj-32060 rv, BHH-fw, BHH-rv, Aj-32040 fw and Aj-32040 rv were used (Supplementary Table S3, Supplementary Fig. S36). The β-hydroxylase and the halogenase genes were amplified from a plasmid previously constructed in our lab (pBHH).
Transfer of p3SV and pDI1 into A. japonicum DI/A. japonicum DI_bbr_BHH
The transfer of the empty plasmid p3SV and pDI1 (harbouring the ancestral nrps genes (nrpsanc) under the control of Sp44*), respectively, into A. japonicum DI/A. japonicum DI_bbr_BHH was carried out using a standard protocol for intergeneric conjugation in actinomycetes. The empty plasmid p3SV and pDI1 (p3SV containing the ancestral NRPS genes (nrpsanc; the 30 kb synthetic construct) under the control of the constitutive promoter Sp44*) were transferred into the methylation-deficient strain E. coli ET12567 pUB307 and finally into the chromosomes of A. japonicum DI/A. japonicum DI_bbr_BHH via the ΦC31 attB sites by intergeneric conjugation. Overnight cultures of the donor E. coli strains and fresh mycelium of the recipient strains were combined in a microcentrifuge tube and mixed by pipetting. The mixture was plated on non-selective plates containing MgCl2 and incubated overnight at 29 °C. The recombinant mutants A. japonicum_p3SV (negative control) and A. japonicum DI_nrpsanc_bbr_BHH (with integrated pDI1) were selected on apramycin plates and confirmed by PCR using the primer pair Apra fw/Apra rv for p3SV (Fig. 38a) and the primer pair PacI GPC_fw/bla vec_rv for pDI1 (Figs. 37b; Supplementary Table S3).
Production of ristomycin/paleomycin hybrid GPA in A. japonicum DI_nrpsanc_bbr_BHH
The negative control A. japonicum DI_p3SV and A. japonicum DI_nrpsanc_bbr_BHH were grown for 3 days on petri dishes containing 30 mL of MS-agar. For the cultivation in liquid cultures, 1 cm2 of mycelium was scraped off and used to inoculate 50 mL TSB medium as a preculture. 5 mL of a 3-day-old preculture was used to inoculate 100 mL Amycolatopsis production medium. Fermentations were carried out for 5 days at 29 °C in a rotary shaker at 180 rpm. Cultures were centrifuged at 6441 g for 15 min and the culture filtrates were used for HPLC-MS and MS-MS measurements.
Isolation and purification of paleomycin/ristomycin hybrid GPA
The cultivation and extraction protocol was scaled up to 1 L Amycolatopsis production medium, which was distributed to 10 Erlenmeyer flasks. Purification was accomplished by adsorption chromatography on XAD16 resin (200 mL). The column was washed with H2O (800 mL), MeOH (20%), and MeOH (30%). The GPA was eluted with MeOH (800 mL, 100%) and concentrated by evaporation. This step was followed by size-exclusion chromatography on Sephadex LH20. All purification steps were monitored by HPLC-ESI-MS.
Detection of paleomycin/ristomycin hybrid GPA by HPLC-MS
The production of paleomycin was detected using an Agilent 1200 HPLC System (Agilent Technologies, Waldbronn, Germany) coupled to an LC/MSD Ultra Trap System XCT 6330 (Agilent Technologies, Waldbronn, Germany). Chromatographic separation was performed at a flow rate of 400 μL.min−1 on a Nucleosil 100 C18 column (3 μm, 100 × 2 mm ID, fitted with a 10 × 2 mm precolumn of the same stationary phase, Dr. Maisch GmbH, Ammerbuch) with a mobile phase composed of 0.1% formic acid (solvent A) and 0.06% formic acid in acetonitrile (solvent B). A gradient from 0 to 100% B in 15 min, with a 2-min hold at 100% B, was used.
Biological activity assays
Biological activity of the A. japonicum DI_nrpsanc_bbr_BHH culture filtrate was investigated by a disc diffusion assay. The paleomycin/ristomycin hybrid GPA containing solutions and controls (25 μL) were spotted on agar plates containing the indicator strain B. subtilis (Supplementary Fig. S42).
LC-HR-MS/MS analysis of ristomycin/paleomycin hybrid
Mass spectrometry data acquisition
For UHPLC-MS/MS analysis, 2 µL of the samples were injected into a Vanquish UHPLC system coupled to a Q-Exactive HF quadrupole orbitrap mass spectrometer (running Q Exactive HF Tune 2.12, Thermo Fisher Scientific, Bremen, Germany). For reversed-phase chromatographic separation, a C18 core-shell micro-flow column (Kinetex C18, 50 × 1 mm, 1.8 µm particle size, 100 Å pore size, Phenomenex, Torrance, USA) was used. The mobile phase consisted of solvent A (H2O + 0.1% formic acid (FA)) and solvent B (acetonitrile (ACN) + 0.1% FA). The flow rate was set to 150 µL/min (setup A) or 100 µL/min (setup B). A linear gradient from 5–50% B between 0 and 8 min and 50–99% B between 8 and 10 min, followed by a 3 min washout phase at 99% B and a 5 min re-equilibration phase at 5% B, was used.
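The UHPLC gradient can be expressed as a piecewise-linear function of run time, which is convenient when relating retention times to solvent composition. The Python sketch below is illustrative only: the function name is hypothetical, and the return to 5% B after the washout phase is assumed to be an instantaneous step, which the description above does not specify.

```python
def percent_b(t_min: float) -> float:
    """Percentage of solvent B at run time `t_min` (minutes) for the
    18 min UHPLC programme: 5-50% B (0-8 min), 50-99% B (8-10 min),
    99% B washout (10-13 min), 5% B re-equilibration (13-18 min)."""
    if t_min < 0 or t_min > 18:
        raise ValueError("time outside the 0-18 min programme")
    if t_min <= 8:
        return 5 + (50 - 5) * t_min / 8
    if t_min <= 10:
        return 50 + (99 - 50) * (t_min - 8) / 2
    if t_min <= 13:
        return 99.0
    return 5.0  # assumed step change back to starting conditions


for t in (0, 4, 8, 9, 12, 15):
    print(t, percent_b(t))
```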
Data-independent acquisition (DIA) and data-dependent acquisition (DDA) of MS/MS spectra were performed in positive mode. Electrospray ionization (ESI) parameters were set to 40 arbitrary units (arb. units) sheath gas flow, 10 arb. units auxiliary gas flow, and 0 arb. units sweep gas flow. The auxiliary gas temperature was set to 400 °C. The spray voltage was set to 3.5 kV and the inlet capillary was heated to 320 °C. The S-lens level was set to 70 V. The MS scan range was set to 800–2000 m/z with a resolution Rm/z 200 of 45,000 or 240,000 with one micro-scan. The maximum ion injection time was set to 100 ms with an automatic gain control (AGC) target of 5E5. Either two or five MS/MS spectra per duty cycle were acquired at Rm/z 200 of 15,000, 120,000, or 240,000 with one micro-scan. The maximum ion injection time for MS/MS scans was set to 100 ms with an AGC target of 5.0E5 ions and a minimum of 5% AGC. The MS/MS precursor isolation window was set to m/z 1. The normalised collision energy was set to 20 or 25% with z = 1 as the default charge state. MS/MS scans were triggered at the apex of chromatographic peaks within 2 to 15 s from their first occurrence. Dynamic precursor exclusion was set to 5 s. Ions with unassigned charge states were excluded from MS/MS acquisition, as well as isotope peaks.
Mass spectrometry data analysis
Raw data conversion and peak picking were performed with MSconvert. Centroided data were visualised and manually inspected using the GNPS dashboard (https://dashboard.gnps2.org/)67. Classic Molecular Networking was performed with the GNPS platform (gnps.ucsd.edu) using default settings18 other than precursor and product ion tolerance, which were both set to 0.01 m/z. The Molecular Network was then visualised in Cytoscape68 and connected MS/MS spectra were manually interpreted. Exact masses and isotope patterns were calculated using enviPat (www.envipat.eawag.ch) and manually compared using the raw profile data in Qual-Browser (Thermo Fisher, Bremen, Germany).
In vitro characterisation and crystallisation of GPA A-domains
Cloning of A-domain constructs from teicoplanin module 1
A-domains comprise two domains (Acore and Asub); as the binding of amino acid substrates occurs within the Acore domain at the Acore-Asub interface, two constructs were designed for each A-domain: a full-length A-domain containing both the Acore and Asub domains for biochemical characterisation, and an A-domain lacking the flexible 10 kDa C-terminal Asub for crystallography. The sequence for A1Tei was amplified by PCR using Phusion® High-Fidelity DNA Polymerase (NEB) from a synthetic gene encoding the enzyme Tcp9 (Uniprot ID Q70AZ9, Supplementary Table S3) that had been synthesised and codon optimised for expression in E. coli by Eurofins Genomics, Ebersberg, Germany. The primers (Supplementary Table S3) were designed to create overhangs compatible with integration into the pHis17 vector using In-Fusion cloning (Takara). The In-Fusion reaction and PCRs were performed as indicated by the manufacturer’s protocols. The pHis17 vector was linearised using primers pHIS17 fw and pHIS17 rv. The primers A1_tcp9 fw and A1_tcp9 rv or A1_tcp9 fw with A1core_tcp9 rv were used to generate A1Tei (residues 9-492) and A1core-Tei (residues 9-398) amplicons, respectively. The PCR amplicons were separately introduced into pHis17.
Cloning of ancestral A-domains
Sequences derived from ASR (Supplementary Table S3) were codon optimised for Escherichia coli BL21 (DE3) cells (Novagen) and synthesised in single fragments by Integrated DNA Technologies (IDT). Fragments were cloned into a modified version of the pOPIN-S vector, comprising an N-terminal hexahistidine-SUMO (Small Ubiquitin-like Modifier) tag and a C-terminal STREP tag, using In-Fusion® HD Cloning Plus (Takara). The primer pairs ANC1 fw + ANC1 rv, ANC2 fw + ANC2 rv, ANC3 fw + ANC3 rv, and ANC4 fw + ANC4 rv were used for amplification of ANC1 to ANC4, respectively. The pOPIN-S vector was linearised using the primer pair pOPIN-S_STREP fw + pOPIN-S_STREP rv. The ANC4_A1core (residues 1–391) fragment was amplified using the primers ANC4_A1core fw and ANC4_A1core rv and integrated into pHis17 using the linear vector backbone generated with primers pHIS17 fw and pHIS17 rv.
Cloning of pocket graft Acore-constructs
To introduce the residues mutated in the substrate binding pocket of ANC2 and ANC3 into the A1core-tei construct, we used In-Fusion cloning. The A1core-tei vector was linearised using primers A1core-tei_graft fw and A1core-tei_graft rv. Two fragments with compatible overhangs, carrying either the H237Y and L295V mutations for A1core-ANC2 or the H237Y, L287M and L295M mutations for A1core-ANC3, were synthesised as single fragments by IDT and used for the In-Fusion reaction.
Protein expression and purification of A-domains
All A-domain constructs were co-transformed with a pCDF-1b construct encoding the MbtH-like protein Tcp13 from Actinoplanes teichomyceticus (Uniprot ID: Q70AZ5) into E. coli BL21 (DE3) cells. Cells were grown overnight at 37 °C with shaking in lysogeny broth (LB) medium (Miller et al. 1992) supplemented with 100 µg/mL ampicillin and streptomycin as a preculture. To start cultivation, the preculture was pelleted and resuspended in fresh LB before inoculation into ten 2 L Erlenmeyer flasks containing 1 L of LB medium per flask (supplemented with 100 µg/mL ampicillin and streptomycin). Cultures were grown at 37 °C until an OD600 nm of 0.5 was reached. Subsequently, the temperature was decreased to 18 °C and protein expression was induced with isopropyl β-D-1-thiogalactopyranoside (IPTG) at a final concentration of 0.2 mM. Cells were harvested after 20 h of growth by centrifugation at 6000 g for 30 min, resuspended in lysis buffer (300 mM NaCl, 20 mM imidazole, 20 mM Tris–HCl, pH 8.0) supplemented with EDTA-free protease inhibitor (SigmaFAST™ Protease Inhibitor Cocktail Tablet), and lysed by sonication. The total lysate was clarified by centrifugation at 12,000 g for 30 min at 4 °C prior to protein purification. The supernatant was loaded onto a nickel-chelating column (HisTrap Fast Flow crude, 5 mL, GE Healthcare) pre-equilibrated in lysis buffer using an ÄKTA Pure Protein Purification System (running Unicorn 7, Cytiva). The target protein was eluted in fractions with a linear gradient from 20 mM to 300 mM imidazole over 20 column volumes. The fractions collected were analysed by SDS-PAGE and the purest fractions were pooled and concentrated using an ultra-centrifugal filter (Amicon Ultra-15) with a molecular weight cut-off (MWCO) of 30 kDa.
The concentrated fractions were further purified by size-exclusion chromatography in SEC buffer (20 mM Tris–HCl, pH 8.0 and 300 mM NaCl) using a Sephacryl S-300 HR 16/60 (GE Healthcare) column for A1tei constructs or an SRT-10 SEC-300 (SEPAX) column for ancestral A-domains. The fractions collected were analysed by SDS-PAGE. Fractions containing monomeric protein were chosen based on the elution profile and SDS-PAGE gel. Every step of purification was performed either on ice or at 4 °C. Final purified proteins were then concentrated to a minimum concentration of 20 mg/mL using centrifugal filtration (Amicon Ultra-15) with a molecular weight cut-off (MWCO) of 30 kDa before being snap-frozen in liquid nitrogen and stored at −80 °C.
To identify crystallisation conditions, initial broad matrix screens were performed at the Monash Molecular Crystallisation Platform (MMCP). Crystals of Tcp9 A1core-Tei in complex with MbtH-like protein Tcp13 were obtained at a concentration of 10 mg/mL in drop D3 (0.1 M MMT buffer, pH 6.0, with 25% (w/v) PEG 1500) of the PACT premier crystallisation screen (Molecular Dimensions) at 20 °C using the sitting drop vapour diffusion method. Crystallisation drops for all four A-domain complexes contained 1 µL of protein solution mixed with equal amounts of precipitant and were equilibrated against 300 µL of precipitant solution containing the following: (i) for Tcp9 A1core-tei/Tcp13, drop D3 from the PACT premier screen (Molecular Dimensions); (ii) for Tcp9 A1core-ANC2/Tcp13 and Tcp9 A1core-ANC3/Tcp13, drop D12 (0.04 M Potassium phosphate monobasic, 16% (w/v) PEG 8000, and 20% (v/v) Glycerol) from the JCSG+ screen (Molecular Dimensions); and (iii) for ANC4core/Tcp13, drop H4 (1.6 M Magnesium sulfate and 0.1 M MES, pH 6.5) from JBScreen Classic HTS II screen (Jena Bioscience).
To obtain substrate bound structures, the crystals were soaked for 2–5 min in the reservoir solution supplemented with 30% sucrose and 4-10 mM L-Hpg, L-Leu or D-Ala before they were mounted on cryo-loops and vitrified in liquid N2 prior to X-ray data collection. High-resolution synchrotron diffraction data at 100 K were collected on the MX269 beamline at the Australian Synchrotron (Clayton, Victoria, Australia) equipped with an Eiger detector (Dectris).
Data were processed and scaled using the routines XDS, Pointless, and Aimless from the CCP4 suite70. During the refinement process, 5% of the reflection data was set aside as R_free for cross-validation and was not used during any stage of the refinement. We applied a high-resolution cut-off criterion of CC1/2 > 0.3. Data collection statistics are listed in Table S3. The phases for structure determination were obtained by molecular replacement using PHASER from the PHENIX package and the structure of the phenylalanine-activating domain of gramicidin synthetase 1 from Brevibacillus brevis (PDB ID: 1AMU)33 as a search model for 8GJ4. Using the Phaser solution obtained (two dimers of Tcp9 A1core-tei/Tcp13), we utilised PDB tools to randomise ADPs and conducted Cartesian simulated annealing with PHENIX to avoid model bias, as recommended by the developers71. The final refined model was generated by iterating between manual real-space refinement in COOT72 and automated refinement in PHENIX71. Initial stages of refinement primarily involved manual rebuilding, employing basic refinement options such as reciprocal and real space refinement and individual atom isotropic B factors with default NCS restraints. NCS was applied between Acore chains A and B, and between the MbtH-like protein chains C and D. The weights of the X-ray data relative to geometry and to atomic displacement factors were automatically determined using the “optimise X-ray/stereochemistry weight” and “optimise X-ray/ADP weight” functions, respectively. Model validation was carried out using COOT72 and MolProbity73. Statistics were generated in PHENIX through the “Generate Table 1 for journal” function. Chains A and D from the final model of 8GJ4 were used as the search model for 8GIC, 8GJP and 8GKM during molecular replacement, followed by the same refinement strategy as for 8GJ4.
NADH coupled pyrophosphate assay
PPi release assays30,74 were performed at 30 °C and the data collected using a V-650 spectrophotometer (running SpectraManager II, Jasco). Each reaction was performed in a total volume of 500 µL. The assay buffer (50 mM Tris-HCl pH 7.4, 10 mM MgCl2 and 0.01 mM EDTA) was supplemented with 1 mM D-fructose-6-phosphate, 0.1 U mL−1 pyrophosphate-dependent fructose-6-phosphate kinase (Propionibacterium freudenreichii (shermanii)), 1 U mL−1 aldolase, 5 U mL−1 triosephosphate isomerase, 5 U mL−1 glycerophosphate dehydrogenase and 0.2 mM NADH. To measure the activity of excised A-domains, 10 µM enzyme was added to the assay buffer together with 0.5 mM ATP. Reactions were incubated for 5 min and then started by the addition of 1 mM substrate. All assays were performed in triplicate. Slopes were fitted using the SpectraManager II software and the fitted data were analysed using GraphPad Prism 8. Velocity was calculated from the slope of the linear phase using the Beer–Lambert law: v = slope (Abs/min) / (ε340(NADH) × l × 2).
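The velocity calculation can be sketched in Python as follows. The values used here are assumptions for illustration and are not stated in the methods: ε340(NADH) is taken as the commonly cited 6220 M−1 cm−1, the path length as 1 cm, and the absolute value of the slope is used because absorbance falls as NADH is oxidised. The factor of 2 corresponds to the "× 2" term in the equation above.

```python
NADH_EXTINCTION_340 = 6220.0  # M^-1 cm^-1, commonly cited value (assumption)


def ppi_release_velocity(slope_abs_per_min: float,
                         path_length_cm: float = 1.0,
                         epsilon: float = NADH_EXTINCTION_340,
                         nadh_per_ppi: int = 2) -> float:
    """Convert the slope of A340 (Abs/min) into a PPi release rate in M/min:
    v = |slope| / (epsilon * l * nadh_per_ppi)."""
    if path_length_cm <= 0 or epsilon <= 0 or nadh_per_ppi <= 0:
        raise ValueError("path length, epsilon and stoichiometry must be positive")
    return abs(slope_abs_per_min) / (epsilon * path_length_cm * nadh_per_ppi)


# Hypothetical slope, not a measured value from this study
v = ppi_release_velocity(-0.0622)
print(f"{v * 1e6:.2f} µM/min")
```

Dividing the result by the enzyme concentration (10 µM in the assay described above) would give an apparent turnover rate per minute.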
Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.