1. Introduction
The Salmonella-phage P22 packages its genome from concatemeric dsDNA using a powerful molecular motor that consists of large (gp2, 499 amino acids, 57.6 kDa [1]) and small (gp3, 162 amino acids, 18.6 kDa [2]) terminase subunits. In vitro studies have shown that gp2 and gp3 form an oligomeric complex [3]. The recognition subunit, gp3, binds to packaging initiation (pac) sites [4] in the P22 genome and presents viral dsDNA to the large terminase subunit gp2, which uses ATP hydrolysis to translocate a single genome copy into empty procapsids. This reaction is very efficient and in similar phages proceeds at rates as high as 2,000 bp/sec [5]. The nuclease domain of gp2 also cleaves the concatemeric dsDNA once the head is filled [4]. Upon cleavage, the gp3:gp2 complex quickly dissociates from the capsid, enabling binding of the tail factors gp4, gp10, and gp26 [6,7], which seal the portal protein and stabilize the genome inside the capsid. This is followed by the attachment of six copies of the trimeric tailspike gp9 [8].
Gp3 spontaneously assembles into a nonamer of radially positioned protomers [9]. Initially, an 18 Å resolution structure obtained by negative stain electron microscopy (EM) revealed a C-terminally cleaved gp3 nonamer [10]. Subsequently, a 1.75 Å resolution crystal structure, together with an 18 Å resolution EM structure of the full-length protein, revealed an outer diameter of 95 Å and an inner diameter of 23 Å, wide enough to accommodate hydrated dsDNA [11].
Gp2 consists of an N-terminal ATPase fold (residues 38-284), a flexible linker, a C-terminal nuclease domain (residues 313-482), and a C-terminal basic tail (residues 483-499) [12,13,14]. The ATPase domain is conserved in phages and other viruses that encode a large terminase. It contains a typical nucleotide-binding fold [15] with two subdomains containing the ATP/GTP-binding Walker A and B motifs. The structure of the gp2 nuclease domain has been solved by X-ray crystallography [16]. It has an RNase H fold and is a member of the ribonuclease H/resolvase/integrase superfamily [17]. The domain is globular, with a mixed α/β fold. The catalytic site has two Mg²⁺ ions. One Mg²⁺ is octahedrally coordinated with four Asp and four waters, while the other Mg²⁺ is tetrahedrally coordinated with Asn, His, and two waters. The catalytic-site residue Asp321 is conserved in related phages and is critical to nuclease activity [16].
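The residue ranges quoted above leave two stretches of the 499-residue gp2 sequence without a named domain. The short Python sketch below makes that bookkeeping explicit. The domain boundaries are taken from the text; the function name `unannotated_gaps` is hypothetical, and treating the whole gap between the ATPase fold and the nuclease domain as the flexible linker is an assumption made only for illustration.

```python
# Domain boundaries for P22 gp2 as quoted in the text (1-based, inclusive).
GP2_LENGTH = 499
DOMAINS = [
    ("ATPase fold", 38, 284),
    ("nuclease domain", 313, 482),
    ("basic tail", 483, 499),
]

def unannotated_gaps(domains, length):
    """Return (start, end) residue ranges not covered by any listed domain."""
    gaps = []
    next_free = 1
    for _, start, end in sorted(domains, key=lambda d: d[1]):
        if start > next_free:
            gaps.append((next_free, start - 1))
        next_free = max(next_free, end + 1)
    if next_free <= length:
        gaps.append((next_free, length))
    return gaps

gaps = unannotated_gaps(DOMAINS, GP2_LENGTH)
for start, end in gaps:
    print(f"residues {start}-{end}: {end - start + 1} residues without a named domain")
```

Under these assumptions the gaps are residues 1-37, which precede the ATPase fold, and residues 285-312, a 28-residue span that would correspond to the flexible linker if the linker fills the entire gap.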
The structure of the gp2 α-helical hairpin (residues 1-33) bound to Fab4 has been determined by crystallography [18]; however, the 3D structure of the P22 gp2 ATPase domain has not been determined to date. Since the fold is conserved, the 3D structure likely resembles those of its Sf6 [19] and T4 [20] counterparts. Monomeric gp2 assembles into a pentamer bound to the dodecameric portal (gp1) [21,22]. Bacteriophage P22 has a symmetry mismatch at one of its 5-fold vertices: this vertex contains the 12-fold portal protein through which genome packaging and ejection occur.
A full-length crystal structure is available for the Sf6 large terminase. The sequence identity between P22 and Sf6 is 15% and 12% for the large and small terminases, respectively [23]. Although sequence identity is low for large terminases, structural similarity remains relatively high. We therefore expect the 3D structure of monomeric gp2 to resemble those of other published large terminases.
A negative stain transmission electron microscopy (TEM) reconstruction has been determined for the first P22 terminase holoenzyme complex [24], which indicates a stoichiometry of 1(gp3)₉:2(gp2). Since that reconstruction was reported, no high-resolution cryo-EM structure has been determined, and the stoichiometry of the large terminase complex that associates with the dodecameric portal also remains undetermined. The large terminase from T4 has been shown to form a pentamer and associate with the portal [25]. In this paper, we combine electron microscopy with mass spectrometry to show that the terminase complex is capable of adopting variable stoichiometric assemblies, including assemblies with 5 copies of gp2. These results provide new insights into the biological relevance of the heterogeneity observed in this dynamic molecular machine.
4. Discussion
Terminases are transient complexes that package viral genomes into preformed procapsids. Nearly a decade after the first moderate-resolution P22 terminase holoenzyme structure was determined, a high-resolution cryo-EM structure is still lacking. This is likely due to the biological heterogeneity of the complex illustrated herein. Unlike bacteriophages, whose terminase complex is made of 2 proteins (L-terminase and S-terminase), herpesviruses have a terminase complex composed of 3 proteins (pUL15, pUL28, pUL33). Recently, the structure of the herpesvirus terminase complex was determined, and the complex was shown to exist in monomeric, hexameric, or dodecameric form [41]. Thus, the heterogeneity observed for herpesvirus terminases, which occur in 3 main stoichiometries, is consistent with that observed herein for P22 terminase complexes.
We found that the P22 terminase complex could assemble into 3 main classes: 1(gp3)₉:2(gp2), 2(gp3)₉:5(gp2), and 3(gp3)₉:7(gp2). The variability in subunit assembly is supported by native mass spectrometry, indicating that the observed assemblies are not an artifact of negative stain sample preparation. We propose that the differential assembly of these complexes is due to S-terminase C-terminal helices that are free to associate with additional L-terminase ATPase subunits. Subsequent binding of an additional S-terminase nonamer enables additional L-terminase subunits to bind. The C-terminal helices of S-terminase fit the canonical definition of an intrinsically disordered region (IDR) according to several disorder-prediction programs, including PONDR (Figure 6). Moreover, L-terminase also contains several IDRs that would contribute to the conformational and structural heterogeneity observed herein. The distribution of intrinsic disorder within the small and large terminase sequences is similar across species (Figure 6). Previous structural studies show that L-terminase is monomeric in solution but forms a pentamer at the portal vertex [21,22]. Based on this, it is likely that the 2(gp3)₉:5(gp2) holoenzyme can directly bind concatemeric genomic DNA and package it into P22 procapsids by the headful mechanism. The data show that one or more of the complexes are biologically relevant because they bind procapsids in the presence of gentle glutaraldehyde crosslinking.
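To make the three stoichiometric classes concrete, the following Python sketch computes the sum-of-subunits mass implied by each one. It uses only the subunit masses quoted in the Introduction (57.6 kDa for gp2, 18.6 kDa for gp3) and the nonameric state of gp3. The function name `complex_mass_kda` is hypothetical, and the values are simple sums that ignore bound nucleotides, ions, and any proteolytic trimming, so they are not the masses measured by native mass spectrometry.

```python
# Subunit masses in kDa, as quoted in the Introduction.
GP2_KDA = 57.6      # large terminase (L-terminase)
GP3_KDA = 18.6      # small terminase (S-terminase)
GP3_PER_RING = 9    # gp3 assembles as a nonamer

def complex_mass_kda(n_gp3_nonamers, n_gp2):
    """Sum-of-subunits mass for n(gp3)9:m(gp2); ligands and adducts are ignored."""
    return n_gp3_nonamers * GP3_PER_RING * GP3_KDA + n_gp2 * GP2_KDA

# (number of gp3 nonamers, number of gp2 copies) for the three classes.
CLASSES = [(1, 2), (2, 5), (3, 7)]

masses = {
    f"{rings}(gp3)9:{copies}(gp2)": round(complex_mass_kda(rings, copies), 1)
    for rings, copies in CLASSES
}

for name, mass in masses.items():
    print(f"{name}: ~{mass} kDa")
```

On these assumptions the three classes correspond to roughly 283, 623, and 905 kDa, a spacing wide enough that native mass spectrometry could in principle distinguish them.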
A nearly complete structural characterization exists for bacteriophage P22, ranging from mature viral assembly [7,42] to formation of the genome injection conduit [43], with most protein structures and their locations in the virus known. Future efforts to determine the cryo-EM structure of the P22 terminase complex bound to procapsid may be facilitated by supplementing with P22 genomic DNA or by gentle crosslinking, as suggested herein.