(C) PLOS One
This story was originally published by PLOS One and is unaltered.
. . . . . . . . . .
The defense island repertoire of the Escherichia coli pan-genome [1]
Dina Hochhauser (Department of Molecular Genetics, Weizmann Institute of Science, Rehovot), Adi Millman, Rotem Sorek
Date: 2023-04
It has become clear in recent years that anti-phage defense systems cluster non-randomly within bacterial genomes in so-called “defense islands”. Despite serving as a valuable tool for the discovery of novel defense systems, the nature and distribution of defense islands themselves remain poorly understood. In this study, we comprehensively mapped the defense system repertoire of >1,300 strains of Escherichia coli, the most widely studied organism for phage-bacteria interactions. We found that defense systems are usually carried on mobile genetic elements including prophages, integrative conjugative elements and transposons, which preferentially integrate at several dozens of dedicated hotspots in the E. coli genome. Each mobile genetic element type has a preferred integration position but can carry a diverse variety of defensive cargo. On average, an E. coli genome has 4.7 hotspots occupied by defense system-containing mobile elements, with some strains possessing up to eight defensively occupied hotspots. Defense systems frequently co-localize with other systems on the same mobile genetic element, in agreement with the observed defense island phenomenon. Our data show that the overwhelming majority of the E. coli pan-immune system is carried on mobile genetic elements, explaining why the immune repertoire varies substantially between different strains of the same species.
Bacteria are commonly infected by viruses called bacteriophages (or phages, for short). To survive phage infection, bacteria employ multiple anti-phage defense systems, many of which were discovered only in recent years. Intriguingly, multiple studies showed that different strains of the same species can encode completely different sets of defense systems, but the reason for the diversification of defense systems among otherwise nearly identical genomes was unknown. Here, we systematically characterized defense systems in >1,300 genomes of the model organism Escherichia coli. We find that anti-phage defense systems are almost always carried on mobile genetic elements such as prophages, transposons and conjugative elements. These elements integrate at specific locations, or “hotspots”, within the E. coli genome. Different anti-phage defense systems are carried by distinct types of mobile genetic elements that preferentially integrate at specific hotspots, explaining why phage resistance profiles can vary significantly even among closely related E. coli strains. Our findings not only provide a comprehensive view of the distribution of anti-phage defense systems in E. coli genomes, but also shed light on the rapid gain and loss of defense systems over short evolutionary time scales.
Competing interests: I have read the journal’s policy and the authors of this manuscript have the following competing interests: R.S. is a scientific cofounder and advisor of BiomX and Ecophage. The other authors declare that they have no competing interests.
Funding: R.S. was supported, in part, by the European Research Council (grant ERC-AdG GA 101018520), Israel Science Foundation (grant ISF 296/21), the Deutsche Forschungsgemeinschaft (SPP 2330, grant 464312965), the Ernest and Bonnie Beutler Research Program of Excellence in Genomic Medicine, the Minerva Foundation with funding from the Federal German Ministry for Education and Research, and the Knell Family Center for Microbiology. D.H. was supported in part by a fellowship from the Israel Ministry of Absorption. A.M. was supported by a fellowship from the Ariane de Rothschild Women Doctoral Program and, in part, by the Israeli Council for Higher Education via the Weizmann Data Science Research Center.
Copyright: © 2023 Hochhauser et al. This is an open access article distributed under the terms of the Creative Commons Attribution License , which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Bacteria are engaged in a continuous arms race in which they have evolved to defend themselves against the expanding arsenal of weapons at the disposal of phages [1]. To this end, they possess dedicated defense systems that protect against phage infection through a variety of molecular mechanisms [2,3]. Many defense systems used by bacteria were only discovered in the past few years, and it is estimated that many additional anti-phage mechanisms are yet to be discovered [4–8].
Bacterial anti-phage defense systems were shown to be non-randomly distributed in microbial genomes [6,9,10]. Such systems were observed to frequently co-localize in bacterial and archaeal genomes, forming so-called “defense islands”: genomic regions in which multiple defense systems cluster together [6,9,10]. The tendency of defense genes to reside next to one another has enabled the discovery of dozens of novel phage resistance systems based on their genomic presence next to known defense systems [4,6,7,11–17].
Although defense islands have served as a remarkably useful tool for the discovery of new defense systems, the reasons for the genomic co-localization of defense systems and the nature of defense islands themselves remain poorly understood. Recent evidence suggests that defense systems are frequently carried on mobile genetic elements (MGEs). These include integrative and conjugative elements (ICEs) [18], transposons [19], prophages and phage satellites [5,8]. It was shown that these MGEs can possess dedicated hotspots for carrying multiple anti-phage defense systems. Furthermore, several independent studies have demonstrated that MGEs carrying defense systems were directly responsible for differential phage resistance profiles in closely related strains of Vibrio cholerae and V. lentus [18,20]. It has been hypothesized that anti-phage defense systems carried by MGEs participate in inter-MGE warfare and play a role not only in defending the host bacterium against invading phages, but also in protecting resident MGEs against invading MGEs [21,22].
In the current study, we set out to map and investigate the repertoire of mobile defense systems in the Escherichia coli pan-genome. E. coli is the best-characterized model organism for bacteria-phage interactions, but the arsenal of defense systems in its genome and their preferred mode of mobilization have never been studied thoroughly. By analyzing over 1,300 E. coli genomes, we demonstrate that defense systems are almost always carried by MGEs. MGEs carrying defense systems show marked preferences in defensive cargo, as well as preferred integration hotspots within the E. coli genome, explaining the considerable variation in phage resistance observed between closely related E. coli strains. Our analysis forms a repository of defense islands in E. coli strains, a database that may serve as a resource for the discovery of new defense systems in the future.
Results
In order to find hotspots for integration of defense system-containing mobile elements in the E. coli pan-genome, we examined 1,351 E. coli genomes downloaded from the Integrated Microbial Genomes (IMG) database [23]. Each genome was scanned for regions containing genes involved in anti-phage defense, searching for mobile regions present in some genomes but missing from others (Methods, Fig 1A). We then mapped these mobile regions to the reference genome of E. coli strain K-12 MG1655, a commonly used laboratory strain whose genome is well characterized. These defense system-containing mobile islands mapped to 41 discrete hotspots, most of them empty (unoccupied) in the reference E. coli K-12 genome (S1 Fig and S1 Table).
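The flank-based mapping described above can be illustrated with a minimal Python sketch. It is not the authors' pipeline: it assumes each genome is reduced to an ordered list of gene names on a linear scaffold, and the function names, gene names, and status labels are hypothetical choices made for illustration only.

```python
from typing import List, Optional, Set


def genes_between_flanks(gene_order: List[str],
                         left_flank: str,
                         right_flank: str) -> Optional[List[str]]:
    """Return the genes lying between two flanking core genes.

    Returns None when either flank is absent from the scaffold.
    Assumes a linear scaffold; circular wrap-around is ignored.
    """
    if left_flank not in gene_order or right_flank not in gene_order:
        return None
    i = gene_order.index(left_flank)
    j = gene_order.index(right_flank)
    start, end = sorted((i, j))
    return gene_order[start + 1:end]


def classify_hotspot(gene_order: List[str],
                     left_flank: str,
                     right_flank: str,
                     defense_genes: Set[str]) -> str:
    """Label a hotspot in one genome (labels are illustrative)."""
    region = genes_between_flanks(gene_order, left_flank, right_flank)
    if region is None:
        return "hotspot not found"
    if not region:
        return "empty"
    if any(gene in defense_genes for gene in region):
        return "occupied with defense"
    return "occupied without defense"


# Toy scaffolds with hypothetical gene names (not data from the study).
reference = ["coreA", "coreB", "coreC"]
strain_x = ["coreA", "int", "defX1", "defX2", "coreB", "coreC"]
strain_y = ["coreA", "coreC"]  # right flank coreB is missing
known_defense = {"defX1", "defX2"}

status_reference = classify_hotspot(reference, "coreA", "coreB", known_defense)
status_x = classify_hotspot(strain_x, "coreA", "coreB", known_defense)
status_y = classify_hotspot(strain_y, "coreA", "coreB", known_defense)
```

In this toy case the reference has adjacent flanks (an empty hotspot), strain_x carries an inserted island with defense genes, and strain_y lacks one flank, so the hotspot cannot be located there.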
Fig 1. Schematic of the defense island search approach employed in this study. (A) Regions containing defense systems in 1,351 E. coli genomes were mapped to the E. coli K-12 genome based on flanking core genes, identifying hotspots for integration of defense-carrying mobile elements. (B) Each hotspot was then searched for in all other E. coli genomes in order to characterize the hotspot occupancy in the E. coli pan-genome. The accession number of each genomic scaffold in the IMG database [23] is shown. Gray shading indicates conservation of core genes flanking the integration position of mobile islands. Known defense system genes are marked in yellow. GInt, Genomic Island with three Integrases.
https://doi.org/10.1371/journal.pgen.1010694.g001
To understand the occupancy of the 41 hotspots in the E. coli pan-genome, we used the core genes immediately flanking each hotspot in the E. coli K-12 reference genome to map these integration hotspots in the 1,351 downloaded genomes (Fig 1B). With a few exceptions, a given integration hotspot was unoccupied in the majority of genomes in which it was detected, with a median of 8% occupancy per hotspot (Fig 2A and S1 Table). An exception was the type I-E CRISPR-Cas locus at what we defined as hotspot #7 (S1 Fig), which appears to be part of the core genome of E. coli and is not found on a mobile genetic element [24]. This locus was present in ~70% of the genomes that we analyzed, while in the remaining ~30% it was degraded, explaining why it was identified as a variably occupied hotspot in our initial analysis. Another locus that was often occupied was the type I restriction-modification (RM) locus at hotspot #36, which was flanked by a transposable element and occasionally included additional defense systems such as Druantia and type IV RM systems (Fig 2A and S2 Table).
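As a rough illustration of how a per-hotspot occupancy and its median could be computed, the Python sketch below uses a toy table. It assumes that occupancy is measured only over genomes in which the hotspot was detected; the text does not state the exact denominator, so this is an assumption, and the hotspot names and values are invented for the example.

```python
from statistics import median
from typing import Dict, List, Optional


def occupancy_fraction(statuses: List[Optional[bool]]) -> Optional[float]:
    """Fraction of genomes in which a hotspot is occupied.

    Each entry is True (occupied), False (empty), or None (hotspot not
    found in that genome). Genomes with None are excluded from the
    denominator, which is an assumption of this sketch.
    """
    detected = [s for s in statuses if s is not None]
    if not detected:
        return None
    return sum(detected) / len(detected)


# Toy data: three hypothetical hotspots across five genomes.
toy_statuses: Dict[str, List[Optional[bool]]] = {
    "hotspot_a": [True, False, False, False, None],
    "hotspot_b": [False, False, False, False, False],
    "hotspot_c": [True, True, True, False, None],
}

fractions = {name: occupancy_fraction(s) for name, s in toy_statuses.items()}
median_occupancy = median(f for f in fractions.values() if f is not None)
```

With these toy values the per-hotspot fractions are 0.25, 0.0 and 0.75, so the median occupancy is 0.25.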
Fig 2. Occupancy of defense hotspots. (A) Bar graph showing the occupancy of each integration hotspot among 1,351 E. coli genomes analyzed. The number above each bar indicates the number of genomes in which the hotspot was found occupied. “Hotspot not found” (gray) indicates that one or both core flanking genes were not found in the relevant genome. (B) Nature of mobile genetic elements (MGEs) integrated at hotspots identified in this study. Multiple, analysis of genes in the integrated element suggests a combination of multiple types of MGE.
https://doi.org/10.1371/journal.pgen.1010694.g002
As expected from a recent analysis of the E. coli pan-genome [25], the defense system-containing mobile islands that we found mostly consisted of well characterized MGEs including prophages, phage satellites, transposons, integrative conjugative elements (ICEs), and integrative mobilizable elements (IMEs) (Fig 2B and S2 Table). Prophages were the most abundant MGE type carrying defense systems (Fig 2B).
Many of the hotspots that we identified were previously described as integration positions for known MGEs. Specifically, 18 of the hotspots were within tRNA loci in the E. coli genome, which are commonly used by prophages and other MGEs as integration hotspots [26]. Some hotspots were occupied by only a single type of MGE. For example, hotspot #29 was occupied only by phages of the genus Felsduovirus that integrated within the small RNA gene rybB [27]. We found 109 E. coli genomes in which this hotspot was occupied by similar Felsduovirus prophages, each of which carried up to two defense systems (S2 Fig and S2 Table). As another example, the Tn7-like transposon Tn6230 integrated only at hotspot #3 between the genes yhiN and yhiM, as previously documented for this family of transposons [28,29]. On the other hand, some hotspots in our dataset were occupied by a diverse variety of MGEs in different genomes. For example, hotspot #9, which occurs within the tmRNA locus, could contain integrated prophages of multiple taxonomic groups, P4-like phage satellites, integrative mobilizable elements, and transposons (S2 Table). This is explained by the widespread use of the tmRNA gene as an integration position for different MGEs, with multiple integrase subfamilies having independently evolved to integrate at this position [26,30].
Many MGEs that carry defense systems cannot independently mobilize between genomes but are rather “parasites” of autonomously mobilizable MGEs. For example, phage satellites are known to carry anti-phage defense systems that they package into capsids of “helper” phages along with their own DNA [5,31,32]. P4-like phage satellites commonly integrated within tRNA genes at hotspots #1, #9, #33 and #37 (Fig 2B and S2 Table). Integrative mobilizable elements (IMEs) are another form of “parasitic” MGE in which defense systems were commonly found (Fig 2B and S2 Table); these MGEs do not constitute full conjugative elements, but hitchhike on other conjugative elements for transfer between species [33]. We found such IMEs carrying defense systems integrated at multiple hotspots (Fig 2B and S2 Table).
Some mobile elements that carry defense systems did not fall into a specific category of commonly known MGEs. Some of these mobile elements were characterized by genes annotated as “phage integrase”, or multiple genes annotated as integrases or recombinases, but no additional phage genes were detected in the island (Fig 3A–3C). The presence of these islands in only a subset of genomes suggests that they are somehow mobile, but it is not clear how such elements can mobilize between genomes in the absence of additional known mobility genes (Fig 3A–3C). It is possible that these islands constitute yet unidentified transposons. Indeed, the recently described Tn6571-family transposon termed GInt (Genomic Island with three Integrases), which comprises three putative integrase genes and a small helix-loop-helix protein [34–36], was identified in our analyses as a defense-carrying element integrated at multiple hotspots (Fig 3D and S2 Table). Alternatively, integrase-only defense-carrying islands may represent yet uncharacterized types of “satellite” elements that parasitize other MGEs for their mobilization.
Fig 3. Examples of mobile islands carrying defense systems with unclear mechanism of mobilization. These islands typically contain integrase or recombinase genes but lack other known mobility genes. (A) Selected examples of integrase-only mobile elements integrated at hotspot #2. (B) Selected examples of integrase-only mobile elements integrated at hotspot #33. This hotspot is occupied in E. coli K-12. (C) Selected examples of hotspot #23, comprising defense systems associated with multiple integrases. (D) Selected examples of hotspot #39 occupied by GInt, a newly described Tn6571-family transposon [34–36]. Gray shading indicates conservation of core genes flanking the integration position. RM, restriction-modification; Gabija, Hachiman and Zorya are defense systems described in [6]; AVAST was described in [4]; CBASS was described in [13]; Olokun, Menshen, and PsyrTA were described in [7]. Gene symbols of flanking core genes are indicated for each hotspot.
https://doi.org/10.1371/journal.pgen.1010694.g003
Overall, we detected 87 types of known defense systems in the hotspots identified in this study (Fig 4). We found that the same type of MGE can carry different sets of defense systems when integrated in different genomes (Figs 3, S2 and S3 and S2 Table). Felsduovirus prophages integrated at hotspot #29, for example, contained a large diversity of defense systems at a dedicated position in the phage genome (S2 Fig). Similarly, a dedicated position within an ICE that preferentially integrates at hotspots #13 and #14 could contain CBASS, Gabija, Hachiman, Lamassu, retron, and additional systems (S3 Fig and S2 Table). Indeed, it was previously demonstrated that phages and other MGEs carry defense systems at dedicated locations in their genomes [5,8,18].
Some types of defense systems were preferentially carried by a specific type of mobile genetic element, or preferentially integrated at specific hotspots (Fig 4). For instance, the bstA gene, which encodes an abortive infection protein, was found only in prophages integrated at hotspot #32 (Fig 4). This gene is naturally silenced by a cognate anti-BstA (aba) DNA element, providing defense against multiple phages that lack the aba element [37]. Ten instances of BstA-carrying lambda-like prophages were identified within hotspots, all of which integrated within the tRNA-Arg gene at hotspot #32. Similarly, the BREX system [11], which appears in 49 islands in our set, was present only at hotspots #1 and #37; the abortive infection system AbiEii [38] was found only at hotspots #5, #33, and #37; and mobile genetic elements carrying the Wadjet and Zorya defense systems showed preference for integration at hotspot #37 (Figs 4 and 5).
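One simple way to surface this kind of hotspot preference is to tabulate, for each defense system type, the set of hotspots at which it was observed. The Python sketch below assumes the detections are available as (system, hotspot) pairs. The record list is illustrative only: it holds one entry per system-hotspot pairing named in the paragraph above and is not the study's dataset, and the function names are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def hotspots_per_system(records: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Map each defense system type to the sorted hotspots where it was seen."""
    table = defaultdict(set)
    for system, hotspot in records:
        table[system].add(hotspot)
    return {system: sorted(hotspots) for system, hotspots in table.items()}


def restricted_systems(table: Dict[str, List[int]], max_hotspots: int) -> List[str]:
    """List systems observed at no more than max_hotspots distinct hotspots."""
    return sorted(s for s, h in table.items() if len(h) <= max_hotspots)


# Illustrative records only: one entry per pairing mentioned in the text.
example_records = [
    ("BstA", 32),
    ("BREX", 1),
    ("BREX", 37),
    ("AbiEii", 5),
    ("AbiEii", 33),
    ("AbiEii", 37),
]

preference_table = hotspots_per_system(example_records)
single_hotspot_systems = restricted_systems(preference_table, max_hotspots=1)
```

On a real table, the same grouping could be extended with counts per hotspot to distinguish strict restriction (as for BstA) from a looser preference.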
Hotspot #37 was found to contain an exceptionally high diversity of defense systems (Figs 4–6). This hotspot was occupied in nearly all (97.3%) E. coli genomes, and when occupied, it typically (97.4%) contained at least one defense system, with a total of 31 defense system types identified across different genomes (Figs 4–6). This suggests that hotspot #37 represents a genomic position dedicated to defense systems in E. coli. However, the mode of mobilization of these systems between genomes could not be readily determined. While in some cases hotspot #37 contained prophages, P4-type phage satellites or IMEs, the majority (68.0%) of mobile islands carrying defense systems at this hotspot did not have any detectable MGEs, although many contained integrase or recombinase genes (Fig 6 and S2 Table). Notably, a recent study showed that Pseudomonas aeruginosa genomes encode two highly diverse hotspots that seem to be similarly dedicated to carrying defense systems, with some cases showing no identifiable modes of mobilization [39].
To understand the contribution of MGEs to the defense repertoire of E. coli, we next examined 190 E. coli genomes defined as “finished” in the IMG database [23], i.e., their genomes are completely assembled with no gaps. A given finished genome had, on average, 10.2 of the 41 hotspots occupied with an integrated element, but only a subset of these (between one and eight, 4.7 on average) contained known defense systems (S3 Table). Analyzing the defense system content of the main chromosome of each of these genomes using DefenseFinder [40] revealed a total of 1,577 defense systems. Of these, 1,429 (90.6%) were found at the 41 hotspots mapped in the current study (S3 Table). Defense systems frequently (58.9% of cases) co-localized with at least one other system on the same island (S3 Table), consistent with the previously observed tendency of defense systems to genomically co-localize [6,9,10], but also showing that defense systems frequently appear alone [8,40]. Together, these data suggest that the overwhelming majority of the chromosomal defense system repertoire of the E. coli pan-genome is carried on mobile genetic elements that preferentially integrate at a discrete set of defined genomic positions.
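The proportions in this paragraph are simple ratios, and the short Python sketch below reproduces the reported 90.6% from the two counts given in the text. It also shows one possible way to compute a co-localization fraction, assuming it is counted per defense system rather than per island (the text does not specify this). The island contents are toy values and the function names are hypothetical.

```python
from typing import List


def percent(part: int, whole: int) -> float:
    """Percentage of `part` in `whole`, rounded to one decimal place."""
    return round(100 * part / whole, 1)


def colocalized_percent(islands: List[List[str]]) -> float:
    """Percentage of systems that share an island with at least one other system.

    Counting per system (not per island) is an assumption of this sketch.
    """
    total = sum(len(island) for island in islands)
    shared = sum(len(island) for island in islands if len(island) > 1)
    return percent(shared, total)


# Counts taken from the text: 1,429 of 1,577 systems were found at hotspots.
share_at_hotspots = percent(1429, 1577)

# Toy islands (hypothetical system labels), not data from the study.
toy_islands = [["sysA", "sysB"], ["sysC"], ["sysD", "sysE", "sysF"], ["sysG"]]
toy_colocalized = colocalized_percent(toy_islands)
```

Here share_at_hotspots evaluates to 90.6, matching the value in the text, and the toy islands give 71.4 (five of seven systems share an island).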
[END]
---
[1] Url:
https://journals.plos.org/plosgenetics/article?id=10.1371/journal.pgen.1010694
Published and (C) by PLOS One
Content appears here under this condition or license: Creative Commons - Attribution BY 4.0.