(C) PLOS One
This story was originally published by PLOS One and is unaltered.
. . . . . . . . . .
AlphaFold2-multimer guided high-accuracy prediction of typical and atypical ATG8-binding motifs [1]
Tarhan Ibrahim (Department of Life Sciences, Imperial College London, London, United Kingdom); Virendrasinh Khandare (Department of Agrotechnology, Food Sciences, Biochemistry, Wageningen University)
Date: 2023-02
Abstract
Macroautophagy/autophagy is an intracellular degradation process central to cellular homeostasis and defense against pathogens in eukaryotic cells. Regulation of autophagy relies on hierarchical binding of autophagy cargo receptors and adaptors to ATG8/LC3 protein family members. Interactions with ATG8/LC3 are typically facilitated by a conserved, short linear sequence, referred to as the ATG8/LC3 interacting motif/region (AIM/LIR), present in autophagy adaptors and receptors as well as pathogen virulence factors targeting host autophagy machinery. Since the canonical AIM/LIR sequence can be found in many proteins, identifying functional AIM/LIR motifs has proven challenging. Here, we show that protein modelling using AlphaFold-Multimer (AF2-multimer) identifies both canonical and atypical AIM/LIR motifs with a high level of accuracy. AF2-multimer can be modified to detect additional functional AIM/LIR motifs by using protein sequences with mutations in primary AIM/LIR residues. By combining protein modelling data from AF2-multimer with phylogenetic analysis of protein sequences and protein–protein interaction assays, we demonstrate that AF2-multimer predicts the physiologically relevant AIM motif in the ATG8-interacting protein 2 (ATI-2) as well as the previously uncharacterized noncanonical AIM motif in ATG3 from potato (Solanum tuberosum). AF2-multimer also identified the AIM/LIR motifs in pathogen-encoded virulence factors that target ATG8 members in their plant and human hosts, revealing that cross-kingdom ATG8-LIR/AIM associations can also be predicted by AF2-multimer. We conclude that the AF2-guided discovery of autophagy adaptors/receptors will substantially accelerate our understanding of the molecular basis of autophagy in all biological kingdoms.
Citation: Ibrahim T, Khandare V, Mirkin FG, Tumtas Y, Bubeck D, Bozkurt TO (2023) AlphaFold2-multimer guided high-accuracy prediction of typical and atypical ATG8-binding motifs. PLoS Biol 21(2): e3001962.
https://doi.org/10.1371/journal.pbio.3001962
Academic Editor: Anne Simonsen, Institute of Basic Medical Sciences, NORWAY
Received: September 30, 2022; Accepted: December 15, 2022; Published: February 8, 2023
Copyright: © 2023 Ibrahim et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Data Availability: All relevant data are within the paper and its Supporting Information files. AF2-multimer predictions are uploaded to the public repository figshare and are available at
https://doi.org/10.6084/m9.figshare.21533769.v1.
Funding: This work was funded by the Biotechnology and Biological Sciences Research Council (BBSRC, UK) BB/T006102/1. T.O.B. and Y.T. were funded by the BBSRC grant (BB/T006102/1). T.I. was funded by a BBSRC-DTP PhD studentship (BB/M011178/1). F.G.M. was funded by a British Society for Plant Pathology Incoming Fellowship. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
Competing interests: I have read the journal’s policy and the authors of this manuscript have the following competing interests: T.B. receives funding from industry on NLR biology. T.B. is a founder and shareholder at Resurrect Bio Ltd.
Abbreviations: AF2, AlphaFold2; AIM/LIR, ATG8/LC3 interacting motif/region; hRTN3, human RTN3; IDPR, intrinsically disordered protein region; PCR, polymerase chain reaction; pLDDT, predicted local distance difference test; RMSD, root mean square deviation; SDM, site-directed mutagenesis; SLIM, short linear motif
Introduction
To withstand stressful conditions, eukaryotic cells employ a fundamental intracellular catabolic process known as macroautophagy or autophagy. Autophagy is central to cellular homeostasis in all eukaryotes, from humans to yeasts and plants, since it mediates adaptation to harmful environmental conditions by eliminating damaged and toxic cellular components as well as invading microbes. Autophagy plays critical roles in various physiological and pathological conditions, particularly defence against pathogens, thus directly affecting plant and human health [1]. Autophagy is a multistep process initiated by the induction of an isolation membrane that expands and closes to form a double-membrane vesicle named the autophagosome. Autophagy cargoes are typically loaded into the inner leaflet of the isolation membrane [2]. Mature autophagosomes are transported to fuse with lysosomes, which in turn digest the captured cargoes [3]. Alternatively, some autophagosomes are re-routed towards the cell surface to discharge cytoplasmic components outside the cell, through a process called secretory autophagy [4]. In yeast (Saccharomyces cerevisiae), autophagy is coordinated by more than 30 core proteins known as the ATG (autophagy-related) proteins that are recruited to autophagosome formation sites in a hierarchical manner [2]. Many of these proteins are also conserved in plants and humans, such as the ubiquitin-like ATG8-family proteins that play central roles in virtually all steps of autophagy, from cargo sequestration to transport and lysosomal fusion of autophagosomes [5,6]. Although yeast has only a single copy of ATG8, the protein has diversified into multiple isoforms in plants (ATG8A–I) and humans (referred to as the LC3/GABARAP family), forming family-specific ATG8 clades [7].
ATG8 associates with proteins that regulate autophagy initiation, such as ATG7, ATG3, and ATG1 [8,9], and decorates both inner and outer autophagosomal membranes to coordinate autophagosome biogenesis and transport. ATG8 sequesters specific cargoes inside the autophagosomes by interacting with autophagy cargo receptors that capture specific cargoes [10,11]. In addition, ATG8 binds to autophagy adaptors on the autophagosome surface to modulate autophagosome transport and fusion events [12,13]. ATG8-interacting proteins carry a short linear motif (SLIM) called the ATG8/LC3 interaction motif/region (AIM/LIR) that interacts with the core autophagy machinery as well as autophagy adaptors and receptors [14]. The core AIM/LIR motif sequence ([W,Y,F][X][X][L,I,V]) consists of an aromatic amino acid followed by any 2 amino acids and a hydrophobic residue. The first aromatic and last hydrophobic amino acids of the core AIM/LIR motif bind to 2 hydrophobic pockets, also known as the W (hydrophobic pocket 1) and L (hydrophobic pocket 2) pockets, on the surface of a ubiquitin-like fold in ATG8 proteins. The sequences flanking the aromatic residue of the AIM/LIR typically consist of negatively charged residues, which enhance AIM/LIR docking by forming polar interactions with the positively charged residues surrounding the W and L pockets [15]. ATG8 binding can be improved further via posttranslational modifications, such as phosphorylation of the residues flanking the core AIM/LIR regions [16,17]. However, noncanonical AIM/LIR motifs that mediate ATG8 binding have also been discovered, expanding the spectrum of residues that facilitate ATG8 association [18,19]. The emerging paradigm is that autophagy is primarily orchestrated through sequential binding of core autophagy components, autophagy adaptors, and cargo receptors to the ATG8-family members via canonical/noncanonical AIM/LIR motifs.
Therefore, identifying and characterising the AIM/LIR residues has been a crucial step in dissecting the molecular basis of autophagy regulation from initiation to selective cargo sorting and autophagosome transport, which may also be important for understanding the newly discovered ATG8ylation process [20,21]. As the canonical AIM/LIR sequence consensus occurs in many proteins, identifying functional AIM/LIR motifs has proven challenging. Thus, a simple search for the [W,Y,F][X][X][L,I,V] amino acid sequence predicts many false AIM/LIR motifs. Since proteins of interest might also have multiple potential AIM/LIR motifs that match the consensus pattern, these motifs must be validated through mutagenesis, peptide binding, and autophagy assays. For instance, the yeast ATG7 has 16 residues that match the core AIM/LIR consensus sequence, yet none have been functionally validated. Likewise, the human NDP52 autophagy cargo receptor has 8 predicted AIM/LIRs, but binds to LC3 through a different, noncanonical LIR [18]. However, a distinctive feature of the AIM/LIR motif that can help narrow down the number of candidates has been discovered: functionally validated AIM/LIR motifs are typically located in intrinsically disordered protein regions (IDPRs) [14]. To help discover functional AIM/LIRs, in silico AIM/LIR prediction tools such as iLIR [22], hfAIM [23], and pLIRm [24] can also be used. Although these computational tools can be used somewhat successfully to predict canonical AIM/LIR motifs, none can detect noncanonical AIM/LIR motifs. Additionally, none of these methods determine the spatial distribution of AIM/LIR motifs or flanking residues on ATG8 proteins; however, this level of resolution combined with the ability to determine side chain interactions established by the flanking AIM/LIR residues would significantly improve investigations into ATG8 binding and help determine the ATG8-binding specificity of various autophagy-related proteins.
Furthermore, these tools would accelerate studies aiming to dissect the evolutionary dynamics of autophagy regulation in eukaryotes. The single-chain protein structure prediction tool AlphaFold2 (AF2) [25] and the recently retrained AlphaFold-Multimer (AF2-multimer) system, which can predict homomeric and heteromeric interfaces [26], have sparked a renewed interest in applying structural biology to understanding complex cellular processes. In this study, we investigated whether AF2-multimer could identify ATG8-binding structures mediated by AIM/LIR motifs carried by various proteins that have been shown to bind ATG8. AF2-multimer showed 90% accuracy in determining AIM/LIR motifs in 33 experimentally validated proteins that carry functional AIM/LIR motifs. At present, AF2-multimer can identify multiple AIM/LIR residues even after in silico deletion/mutagenesis of the primary AIM/LIR motif. Strikingly, AF2-multimer predicted all 3 noncanonical AIM/LIR motifs that are experimentally validated, as well as the previously uncharacterized AIM motif in plant ATG3, which is not possible with other current prediction tools. Furthermore, AF2-multimer predicted functional AIM/LIRs in 3 of the 4 tested proteins encoded by plant and human pathogens that target host ATG8 proteins, indicating that AF2-multimer can also detect cross-kingdom AIM/LIR–ATG8 interactions. Our study highlights the potential of AF2-multimer in identifying AIM/LIR residues and provides a framework for discovery of new autophagy receptors and adaptors. This AI-guided approach will substantially accelerate our understanding of the molecular basis of autophagy in all kingdoms of life.
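The simple consensus search mentioned in the Introduction can be sketched in a few lines of Python. This is only an illustration of why pattern matching alone yields many candidates, not the authors' pipeline or any of the cited tools; the function name and the toy sequence are hypothetical.

```python
import re

# Core AIM/LIR consensus from the text: [W,Y,F][X][X][L,I,V].
# A lookahead is used so that overlapping candidates are all reported.
CORE_AIM = re.compile(r"(?=([WYF]..[LIV]))")


def find_candidate_aims(sequence):
    """Return (1-based start position, four-residue match) for every consensus hit."""
    seq = sequence.upper()
    return [(m.start() + 1, m.group(1)) for m in CORE_AIM.finditer(seq)]


if __name__ == "__main__":
    # Hypothetical toy sequence, not a real protein.
    toy = "MSDEWEVLAAFGGIKYAALP"
    for position, motif in find_candidate_aims(toy):
        print(position, motif)
```

Every hit is only a candidate: the search knows nothing about disorder, flanking residues, or whether the motif can reach the W and L pockets, which is the gap the structural predictions are meant to close.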
Discussion
We provide evidence that AF2-multimer can predict both canonical and noncanonical AIM/LIR motifs with high accuracy. In 36 proteins that carry at least 1 experimentally validated canonical AIM/LIR motif, AF2-multimer accurately identified 33 functional AIM/LIRs among 310 candidate sequences that match the canonical AIM/LIR consensus (Fig 1). AF2-multimer can also predict more than 1 functional AIM/LIR in a protein; however, it seems to prioritise displaying the N-terminal functional AIM/LIR. Nevertheless, this limitation can be circumvented by re-running AF2-multimer using truncated/mutated sequences of the first AIM/LIR (Fig 2). AF2-multimer accurately predicted 4 out of 5 experimentally validated noncanonical AIM/LIRs (Fig 3), an ability that is lacking in current AIM/LIR prediction algorithms. Remarkably, AF2-multimer accurately predicted a previously uncharacterised noncanonical, functional AIM motif in plant ATG3, which we validated by ATG8 binding and autophagy assays (Fig 3). Furthermore, AF2-multimer performed well in identifying AIM/LIR motifs in 3 out of 4 tested pathogen virulence factors that target ATG8 members in their plant and human hosts, revealing that cross-kingdom ATG8–LIR/AIM associations can also be predicted by AF2-multimer. AF2-multimer is therefore quite effective in identifying true AIM/LIR motifs among multiple other non-functional motifs in a protein of interest with high confidence, which minimises the experimental effort needed to validate these motifs. The confidence of these predictions can be strengthened by combining evolutionary information from phylogenetic analysis of protein sequences to determine AIM/LIR conservation, which should be considered when designing time-consuming validation experiments.
The AF2-guided AIM/LIR prediction workflow consists of structural prediction of ATG8 and candidate proteins, phylogenetic analysis of predicted AIM/LIR motifs, and experimental validation by in vitro/in vivo binding assays (Fig 4E). If present, additional AIM/LIR residues can be identified by re-running the protocol using protein sequences that are mutated (in silico) in the primary AIM/LIR determined after the first run (Fig 4E). A major advantage of AF2-multimer is that it provides spatial resolution of the AIM/LIR-ATG8 interaction, displaying not only structural alignment of the LIR/AIM residues on the LIR docking site (LDS)/AIM docking site (ADS) but also additional associations through LIR/AIM flanking residues. This is important because it helps determine the extent to which AIM flanking residues condition AIM/LIR specificity towards certain ATG8 isoforms, thereby providing unprecedented insights into evolutionary studies focusing on autophagy regulation. This is especially important for discovering the mode of action of noncanonical AIM/LIR motifs, which are often overlooked as there are no known distinctive features by which to predict them. Remarkably, using AF2-multimer combined with experimental approaches and phylogenetic analysis, we were able to identify and validate a previously uncharacterised functional, noncanonical AIM motif in an IDPR of the plant ATG3 (Fig 3), indicating that AF2-multimer can inform new biology. As reported before [14], an emerging unifying pattern of the canonical and noncanonical AIMs is that they are typically located in IDPRs. Our AF2-multimer predictions are also in agreement with this view, as most AIM/LIR sequences that we analysed are in IDPRs (Figs 1–4). According to our assessments, AF2-multimer correctly predicts the ATG8–AIM interface with greater confidence in some models than in others.
However, a high-confidence structural prediction of a complete protein is not necessary to identify an ATG8 interface, as these regions can be ordered or disordered based on their interaction partners. The top-ranked AF2-multimer models correctly predicted the functional AIM/LIRs in all tested proteins except for the M2 protein from Influenza A virus. In the case of M2, the rank 1 model had a lower confidence score than the other models, which accurately showed the LIR–LC3 binding interface (Fig 4D). Therefore, models with the highest confidence scores in regions covering the AIM/LIR residues are more informative when determining AIM/LIR–ATG8-binding interfaces. A current limitation of AF2-multimer is that it does not provide any information about the ATG8-binding affinity of the AIM/LIR motif. For instance, phosphorylation of residues flanking the LIR/AIM has been shown to improve ATG8-binding affinity [16,17], but such effects cannot be determined by AF2-multimer. Nevertheless, since AF2-multimer was still able to identify AIM/LIRs without considering posttranslational modifications, comparative analysis of the side chain interactions of flanking AIM/LIR residues revealed by AF2-multimer could provide further insights into AIM/LIR specificity towards different ATG8 members. Despite its advantages, there is still room to improve AF2-multimer since not all AIM/LIR residues can be predicted. In 3 cases, i.e., the plant autophagy cargo receptor Joka2/NBR1 and the human Calreticulin and DVL2 proteins, AF2-multimer failed to predict the experimentally suggested AIM/LIRs [35,49,53]. We cannot rule out the possibility that ATG8/LC3 binding of such proteins is facilitated by posttranslational modifications [56] and/or interactions with other proteins. Currently, such conditions are not considered by AF2-multimer. Intriguingly, the AF2-models suggest that other binding interfaces exist for these proteins.
This raises the question of whether AF2-multimer can be used to predict additional, novel ATG8-binding interfaces. For instance, AF2-multimer predicted a previously unidentified LC3-binding interface in the malaria virulence factor PbUIS3, which lacks a typical LIR motif (Fig 4D). According to AF2-multimer models, the previously characterised PbUIS3 residues (D173, Y174, D175, E205, and K209) that are required for LC3 interaction are not located in the LC3-binding interface, indicating that they might indirectly affect LC3 binding. Consistent with this notion, these residues are also not conserved in other Plasmodium species [54], which contradicts the observations that AIM/LIR motifs are located in conserved, structurally disordered regions [14]. Thus, apart from identifying validated AIM/LIRs, AF2-multimer could be an important tool for determining new ATG8-binding interfaces, providing insights into cases where ATG8/LC3 interaction remains cryptic and developing new hypotheses that can be tested experimentally. Here, we present an AF2-multimer–based framework for identifying AIM/LIR motifs that could be used to discover novel autophagy receptors, adaptors, or modulators as well as pathogen virulence factors that target ATG8 proteins. This framework, in turn, should help address key questions in plant and human autophagy, such as: How are specific cargoes captured and mobilised through selective autophagy? How does specific cargo selection help organisms withstand cellular and environmental stress? How are certain genetic diseases linked to defects in selective autophagy? What are the determinants of degradative versus secretory autophagy? And how do pathogens manipulate autophagy to promote diseases? The AI-guided quest for the discovery of autophagy modulators will substantially accelerate our understanding of the molecular basis of autophagy in all kingdoms of life.
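The re-run strategy discussed above (mutating the primary AIM/LIR in silico, then predicting the complex again) can be sketched as follows. This is a minimal illustration under stated assumptions: the alanine substitution at motif positions 1 and 4 mirrors the AEVA/AQVA mutants described in Materials and methods, the `:` chain separator is assumed to be the input convention of the ColabFold notebook, and the function names and toy sequences are hypothetical.

```python
def mutate_core_aim(sequence, start, replacement="A"):
    """Replace the aromatic (position 1) and hydrophobic (position 4) residues
    of a four-residue core AIM/LIR beginning at 1-based position `start`."""
    i = start - 1
    if i < 0 or i + 4 > len(sequence):
        raise ValueError("core motif lies outside the sequence")
    residues = list(sequence)
    residues[i] = replacement      # e.g. W -> A
    residues[i + 3] = replacement  # e.g. L -> A
    return "".join(residues)


def paired_query(atg8_sequence, candidate_sequence):
    """Join two chains into one query string.
    Assumption: ':' marks the chain break, as in ColabFold-style input."""
    return f"{atg8_sequence}:{candidate_sequence}"


if __name__ == "__main__":
    # Hypothetical toy sequences, not real proteins.
    candidate = "MSDEWEVLAAFG"
    mutant = mutate_core_aim(candidate, start=5)  # WEVL -> AEVA
    print(paired_query("MKSSFK", mutant))
```

Submitting the mutated query in a second run lets a weaker or more C-terminal motif occupy the docking site if one exists; any motif found this way would still need the phylogenetic and binding validation described in the workflow.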
Materials and methods
Structural analysis using AF2
We analysed a total of 51 proteins, from 11 different species, that interacted with 12 different members of the ATG8-family of various species. Data can be found in S1 Data. Sequences were obtained from UniProt. Homologs of proteins studied were found using BLAST [57] and their sequences were aligned by Clustal Omega [58]. AF2-multimer [26] was used through a subscription to the Google Colab (
https://colab.research.google.com/github/sokrypton/ColabFold/blob/main/AlphaFold2.ipynb#scrollTo=svaADwocVdwl) following guidelines on the document [25]. Superposition of AlphaFold2 predictions on known structures was performed using the align command in PyMOL (The PyMOL Molecular Graphics System, Version 2.3.5, Schrödinger, LLC). Where indicated, predictions of AIM/LIR interactions are coloured according to the AlphaFold2 produced per-residue confidence metric called the predicted local distance difference test (pLDDT), which corresponds to the model’s predicted score on the lDDT-Cα metric [59]. The scale ranges from 0 to 100, where 100 corresponds to values of highest confidence. Models were coloured by this score in PyMOL using a script generated by J. Murray and D. Pretorius (Imperial College London).
Plasmid constructs
RFP- and GFP-ATG8CL used in this study were published previously [49]. The C-terminal GFP-tagged ATI2 construct was generated by Gibson assembly of a PCR fragment amplified from S. tuberosum cDNA into the EcoRV-linearized pKGC3S GFP expression vector. The DNA fragment for the ATI2 AIM1 peptide was custom synthesised (GENEWIZ) and inserted into the pKGN3S vector to generate an N-terminal GFP expression construct using Gibson assembly. ATI2-AIM1_(17)AEVA(20), ATI2-AIM2_(265)AQVA(268), and ATIAIM1+2 mutant fragments were generated using site-directed mutagenesis (SDM) polymerase chain reaction (PCR) amplification from the wild-type construct. Templates were then eliminated by 1-h Dpn-I (New England Biolabs) restriction digestion at 37°C, and the PCR products of mutants were inserted into EcoRV-digested C-terminal GFP expression vectors using Gibson assembly. ATG3 AIM peptide DNA fragments were custom synthesised (GENEWIZ) and inserted into the pKRN3S vector to generate an N-terminal RFP expression construct using Gibson assembly. Expression vectors were transformed via electroporation into Agrobacterium strain GV3101 for plant expression.
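Returning to the structural analysis, the per-residue pLDDT used for colouring can also be read directly from a predicted model to compare confidence over the residues that cover a candidate AIM/LIR. The sketch below is not the PyMOL script credited above; it assumes the model is a PDB-format file in which pLDDT is written to the B-factor column (columns 61 to 66 of ATOM records), and the function names are hypothetical.

```python
def ca_plddt(pdb_text, chain_id):
    """Map residue number -> pLDDT for the C-alpha atoms of one chain.
    Assumption: pLDDT is stored in the B-factor field (PDB columns 61-66)."""
    scores = {}
    for line in pdb_text.splitlines():
        if not line.startswith("ATOM") or len(line) < 66:
            continue
        atom_name = line[12:16].strip()  # columns 13-16
        chain = line[21]                 # column 22
        if atom_name == "CA" and chain == chain_id:
            residue_number = int(line[22:26])          # columns 23-26
            scores[residue_number] = float(line[60:66])
    return scores


def mean_plddt(scores, first, last):
    """Average pLDDT over residues first..last (inclusive, 1-based numbering)."""
    values = [scores[r] for r in range(first, last + 1) if r in scores]
    if not values:
        raise ValueError("no scored residues in the requested range")
    return sum(values) / len(values)
```

Applied to each ranked model, `mean_plddt` over the AIM/LIR window gives one simple way to pick the model that is most confident at the interface rather than over the whole protein.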
Plant material and growth conditions
N. benthamiana plants were grown and maintained in a greenhouse with high light intensity (16 h light/8 h dark photoperiod) at 22 to 24°C.
Co-immunoprecipitation experiments and immunoblot analysis
Proteins were transiently expressed by agroinfiltration of N. benthamiana leaves and harvested 2 days post agroinfiltration. Protein extraction, purification, and western blot analysis steps were performed as described previously [60]. In brief, 2 g of leaf tissue was ground in 4 mL extraction buffer (25 mM Tris-HCl (pH 7.5), 1 mM EDTA, 150 mM NaCl, 10% glycerol (v/v), 10 mM DTT, and 0.1% IGEPAL, in the presence of plant protease inhibitor cocktail (Sigma-Aldrich) and 2% polyvinylpolypyrrolidone). Following 2 rounds of centrifugation at 17,000 × g for 20 min each (with filtration in between), the resultant supernatant was incubated with RFP beads (Chromotek) for 1 h at 4°C. Beads were washed at 800 × g 3 times prior to elution with an elution buffer (4xLaemmli Buffer (BioRad) and DTT). Proteins were eluted at 70°C for 5 min. After gel electrophoresis and transfer to PVDF membrane, polyclonal anti-GFP (Chromotek) produced in rabbit, monoclonal anti-RFP (Chromotek) produced in mouse, and monoclonal anti-GFP (Chromotek) produced in rat were used as primary antibodies. For secondary antibodies, anti-mouse (Sigma-Aldrich), anti-rabbit (Sigma-Aldrich), and anti-rat (Sigma-Aldrich) antibodies were used.
Confocal microscopy and quantitative analysis of GFP-ATG8CL autophagosome puncta
Imaging was performed using a Leica Stellaris 5 inverted confocal microscope (Leica Microsystems) with a 63× water immersion objective. All microscopy analyses were carried out on live leaf tissue 3 days after agroinfiltration. Leaf discs of N. benthamiana were cut and mounted onto Carolina observation gel (Carolina Biological Supply Company) to minimise the damage. Specific excitation wavelengths and filters for emission spectra were set as described previously [61].
GFP and RFP probes were excited using 488 and 561 nm laser diodes and their fluorescent emissions detected at 495 to 550 and 570 to 620 nm, respectively. Maximum intensity projections of Z-stack images were presented in each figure. Image analysis was performed using Fiji and Inkscape. For quantifying GFP-ATG8 puncta, a maximum-intensity projection of each confocal microscopy image was generated using Fiji [62]. The total cell surface area was measured after removing any stomata or pavement cells that were not expressing GFP-ATG8 from the images using the “Freehand selections” tool of Fiji. Since the signal from the ATG8 puncta was higher than that of cytoplasmic ATG8, the “Find maxima” tool was used to measure the number of puncta, with the “prominence” setting between 45 and 60 depending on the signal intensity of each image. The relative number of puncta per 10,000 μm² was then calculated to plot a graph. The statistical significance of any differences in means was assessed using the Wilcoxon test in R, because the Shapiro–Wilk test showed the data did not follow a normal distribution. Three asterisks (***) in the box plot indicate that the p value is smaller than 0.001.
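The normalisation step described above (puncta counts expressed per 10,000 μm² of measured cell area) amounts to a single ratio. The sketch below only illustrates that arithmetic; the function name and the example numbers are hypothetical, and it does not reproduce the Fiji measurements or the statistical tests run in R.

```python
def puncta_per_10000_um2(puncta_count, area_um2):
    """Scale a puncta count to a density per 10,000 square micrometres."""
    if area_um2 <= 0:
        raise ValueError("measured area must be positive")
    return puncta_count * 10_000 / area_um2


if __name__ == "__main__":
    # Hypothetical image: 12 puncta counted in 30,000 square micrometres of cells.
    print(puncta_per_10000_um2(12, 30_000))
```

Normalising by the measured area makes images with different amounts of excluded tissue comparable before the densities are plotted.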
Acknowledgments
Daniella Pretorius (Imperial College London) from Dr. James Murray’s group (Department of Life Sciences, Imperial College London, UK) wrote the script to generate AlphaFold confidence score-based colouring in PyMOL. We thank Prof. Sophien Kamoun (The Sainsbury Laboratory, Norwich, UK) for encouraging us to move forward with our initial findings.
[END]
---
[1] Url:
https://journals.plos.org/plosbiology/article?id=10.1371/journal.pbio.3001962
Content appears here under this condition or license: Creative Commons - Attribution BY 4.0.