# Assembly of a patchy protein into variable 2D lattices via tunable multiscale interactions

Jul 28, 2020

### Protein purification and mutagenesis

C98RhuA and F88/C98RhuA were purified following established protocols (see refs. 7,23 for alternative descriptions). Briefly, for each protein, a pJ414 plasmid containing the RhuA coding sequence was transformed into BL21 (DE3) E. coli cells (New England Biolabs; Catalog #C2527I) via heat-shock, grown to high density in LB + 100 mg∙mL−1 ampicillin, overexpressed by overnight 1 mM IPTG induction, pelleted, resuspended in 20 mM Tris-HCl (pH 7.5) + 10 mM β-mercaptoethanol (βME), and lysed by sonication. The resulting solution was clarified by centrifugation (5000 rpm, 15 min), treated with 1.5% Polymin-P, reclarified, and purified via NaCl step gradient on a DEAE gravity column at 4 °C. Peak fractions were pooled and RhuA was precipitated using 1.7 M (NH4)2SO4, gently stirred for 30 min, then separated by centrifugation. The precipitate was dialyzed into 20 mM sodium acetate (pH 5) + 10 mM βME, exchanging 3–4 times over 3 days. The dialysate was sterile filtered, loaded onto S columns via FPLC and purified via NaCl gradient. RhuA elutes at ~200 mM NaCl for both columns. Peak fractions (>90% purity) were pooled prior to concentration and storage.

Overexpression and purification of S98RhuA was carried out analogously to C98RhuA except for the omission of βME in the purification buffers. All purified proteins were dialyzed into 20 mM Tris-HCl (pH 7.5) and 10 mM reduced L-glutathione (GSH), concentrated to 100–150 μM, flash-frozen in liquid nitrogen, and stored at ≤ −60 °C. The plasmid for S98RhuA was generated from the C98RhuA parent plasmid via site-directed mutagenesis using the following primers:

RhuA S98 Forward: GTTAAGGTGGATAGCAGCGGTGCAGGTTACCACATCC.

RhuA S98 Reverse: GGATGTGGTAACCTGCACCGCTGCTATCCACCTTAAC.

### Solution self-assembly of C98RhuA

Crystallization of C98RhuA was induced via hand-thawing of frozen RhuA aliquots, which were then placed on a shaking platform at 4 °C and allowed to mature. Nucleation typically occurred within 3–7 days, and crystals fully matured over 2–3 weeks, consistent with previous reports7,23. Crystal suspensions were clarified 2–3x by low-speed (ca. 3000 rpm) centrifugation in a benchtop centrifuge followed by replacement of the supernatant with fresh buffer to remove unincorporated proteins. This procedure facilitated the binding of p4212 crystals onto the mica substrate, but also increased the population of stacked 2D crystals. The suspension of p4212 crystals was diluted to 25 μM using 10 mM Tris-HCl (Sigma-Aldrich) pH 7.0 prior to imaging by AFM. 10 μl of diluted crystal suspension was incubated on polylysine-treated mica for 10 min, then rinsed with 10 mM Tris-HCl buffer prior to imaging. Polylysine-treated mica was obtained by incubating 10 μl polylysine solution (0.01%; Sigma-Aldrich) on freshly cleaved mica for 1 min, rinsing with water, and drying under nitrogen prior to use.

### Mica-templated self-assembly of RhuA

Frozen stock solutions of C98RhuA were thawed at room temperature and diluted to the final concentration with incubation buffer (10 mM Tris-HCl pH 7.0, 1–10 μM GSH, and desired concentration of KCl, RbCl, MgCl2, or ZnCl2). 100 μl of diluted protein solution was deposited onto freshly cleaved mica and incubated for 24 h in a sealed petri dish at room temperature. The mica surface was rinsed with fresh incubation buffer prior to imaging by AFM. GSH, KCl, RbCl, MgCl2, and ZnCl2 were purchased from Sigma-Aldrich, nuclease-free water from Ambion, and muscovite mica from Ted Pella.

### Atomic force microscopy

All AFM images of mica-templated and solution-grown C98RhuA crystals were captured in PeakForce TappingTM mode on a MultiModeTM VIII AFM (Bruker, CA) using HYDRA4V-100NG or HYDRA6V-100NG (AppNANO, CA) probes in the incubation buffer used for each specific sample. The peak-force set-point was continuously adjusted to minimize any possible manipulation or damage from probes. The effective imaging force ranged from 100 to 200 pN, within the typical force range for AFM imaging of biomolecules51,52. All offline data processing was done using the SPIPTM software package (Image Metrology, Denmark).

### Simulated AFM topographs

AFM topographs were calculated from atomic models with different monomer configurations (extracted from equilibrated 2 × 2 RhuA crystals; see below) using a custom Tcl script executed within VMD53, taking the mica surface to be the plane z = 0. Accordingly, crystal models were recentered at (x = 0, y = 0) and moved along the +z axis until the minimum z position of the protein Cα atoms was at 0. The probe tip was modeled as a sphere with radius 10.0 Å and center ztip at the end of a cone with half-angle 20° and truncated at zcutoff (see Supplementary Fig. 16). Parametric equations which define the volume enclosed by this tip shape were determined and used to check for overlaps with any protein heavy atoms within the conical or spherical volumes for a given (x, y, ztip). This was critical to capture tip convolution artifacts arising from the finite size of the tip, not simply the tip radius. Tapping was simulated by moving the probe to (x, y, radius) and increasing ztip until the number of overlaps was 0 (or below a given threshold to account for the flexibility of disordered protein loops). The height for that xy position was then recorded as (zmin = ztip – radius). This calculation was repeated row-by-row, with the resolution determined by Δx = Δy = 0.50 Å, to generate a 2D array of height values analogous to a true AFM topograph. Output was directly visualized and processed using Gwyddion. This code (and associated utilities/example files) can be accessed at https://doi.org/10.5281/zenodo.3924893.

### Evaluation of the C98RhuA dimerization free-energy landscape

The C98RhuA dimer structure was prepared from the A88FRhuA crystal structure (PDB ID: 2UYU) by performing the corresponding D98C and F88A mutations using PyMOL54. PSFGEN53 was used to add missing hydrogens and assign atom types. The final dimer structure was centered at the origin with the axis of symmetry aligned along the z-axis, then each monomer was translated ±10 Å along the z-axis to yield a starting distance between each protein’s center of mass (COM separation) of 65.5 Å. The separated dimer was then solvated with 45,082 water molecules (CHARMM TIP3P) and either neutralized with 32 K+ ions (no KCl starting structure) or 2302 K+ ions and 2270 Cl ions (3 M KCl starting structure). The CHARMM36 force field55 was used for protein atoms and Joung-Cheatham parameters56 were used for monovalent ions. Both systems were minimized for 5000 steps with all protein atoms held fixed, followed by a 5000 step minimization without restraints. Protein Cα atoms were constrained to their starting positions by a 10 kcal∙mol−1 ∙ Å−2 restraint for equilibration. The systems were equilibrated for 1 ns in the isobaric-isothermal (NPT) ensemble (1 atm, 300 K) with the cross-sectional xy proportions held constant, yielding final box dimensions of 104.7 × 104.7 × 145.6 Å. Monomers were then linearly pulled towards each other to a final COM-COM distance of 45.5 Å over 5 ns using a 100 kcal∙mol−1 ∙ Å−2 moving restraint, with the xy coordinates of the Cα remaining constrained to prevent rotation of the monomers. Initial coordinates for umbrella sampling windows were extracted from this pulling simulation and maintained with weaker force constants (see Supplementary Table 1 for details). All windows were equilibrated for 25 ns, of which the last 10 ns were used for calculation of the PMF using the WHAM algorithm57. The 100 kcal∙mol−1 ∙ deg−2 harmonic restraints were employed during sampling to prevent rotation of each monomer about their axis of symmetry in order to preserve their relative orientations from the F88RhuA crystal structure, which simplifies the dimerization coordinate to 1D (COM separation along the z-axis). Errors in the free energy estimates were calculated using the block averaging method58. PMF simulations were carried out using NAMD 2.1259.

### Modelling and simulation of protein binding to m-mica

Parameters and initial coordinates for muscovite mica were taken from the INTERFACE force field package60. A 5 × 3 m-mica supercell with Al substitutions in agreement with 29Si NMR data was then tiled to form a 5 × 5 array of dimensions 129.795 × 135.230 × 19.9452 Å. Periodic bonds were reorganized using TopoTools61 within VMD to yield a “full-size” single layer of m-mica corresponding to a 25 × 15 × 1 supercell of the crystallographic m-mica unit cell. Copies of this full-size layer were then translated along multiples of the vector (2.005, 0.000, 19.9452 Å) to yield a “5stack” bulk-like m-mica structure of final dimensions 129.795 × 135.230 × 99.726 Å, which was then merged into a single structure using TopoTools. This system was equilibrated at constant pressure without minimization in a fully-flexible orthorhombic periodic cell for 100 ps using the particle mesh Ewald method for full-system electrostatics. The two outermost layers were then removed from the equilibrated structure to yield the final starting m-mica coordinates (appx. 60 Å in height), with all surface vacancies on both sides of the mica occupied by K+ ions. These ions were free to exchange with the solution, and at equilibrium 80–85% of sites were occupied by K+ ions. This state was also attained if the system was initiated with a bare (K+ occupancy = 0) m-mica surface and the same numbers of ions in solution, confirming that it reflects thermodynamic equilibrium.

A pre-equilibrated C98RhuA structure was recentered at the origin and solvated in a pre-equilibrated 128.8 × 134.0 × 129.9 Å box (68,667 waters) prior to neutralization with 16 K+ ions. An additional 3463 K+ ions and 3463 Cl ions were added, bringing the [KCl] to ca. 2.67 M (in 61,725 waters). This system was minimized for 1000 steps with all protein atoms fixed and subsequently equilibrated in the NPT ensemble (1 atm, 300 K) for 1 ns keeping the xy area constant. The final box dimensions were 128.8 × 134.0 × 125.1 Å.

The equilibrated protein system was merged with the equilibrated mica layers into a single structure using TopoTools such that the lowest position of protein Cα atoms on the C-terminal face was 25 Å above the average z coordinate of the bridging oxygen atoms on the mica surface (henceforth treated as position z = 0). Overlapping waters and ions were translated along the +z direction by the periodic c vector from the last frame of the protein simulation to minimize bad contacts. Finally, 750 Cl ions were added to the solution to neutralize the system, bringing the solution ion concentrations to ~2.67 M K+ and 3.33 M Cl, close to experimental conditions. Solution-exposed K+ ions on both faces of the mica slab were converted to the Joung-Cheatham atom type to reflect their occupancy by solution-phase ions, while K+ ions on the interior retained their typing as prescribed by INTERFACE. The resulting system was minimized for 100 steps to remove bad contacts then equilibrated for 1 ns in the NPT ensemble, keeping the periodic a and b vectors constant at the equilibrated mica dimensions of 129.25 Å and 134.45 Å, respectively. All mica atoms were lightly constrained (1 kcal∙mol−1 ∙ Å−2) to their initial positions to prevent drift of the slab. The protein was restrained (by its Cα atoms) to its initial COM height (z = 50.0 Å relative to the mica surface oxygens), xy position, and angle of rotation using 100 kcal∙mol−1 restraints during initial equilibration. The average c vector over the last 50 ps of equilibration (185.5 Å) was then used to define the periodic cell dimensions for all subsequent runs. Mica hydrogen atoms were then released from constraints and the system was equilibrated for a further 2 ns at constant volume. In order to qualitatively determine whether C98RhuA adsorption occurs directly at the surface or above the ionic slipping plane, the former configuration was obtained by pulling the protein 20 Å towards the mica surface over 5 ns using a 100 kcal∙mol−1 ∙ Å−2 moving restraint, such that the minimum Cα position was at z ≈ 5 Å.

The protein was allowed to adopt a preferred binding configuration (height, rotation) by releasing the COM restraints on the protein z position and rotation angle. Instead, a flat protein orientation was maintained in an unbiased fashion by restraining the difference in Cα COM z positions of diagonal protein chains (A-C and B-D) at 0 Å using two 100 kcal∙mol−1 restraints; this allowed the full protein to move freely while preserving the relative positions of its individual subunits. Existing constraints on the protein xy position and mica heavy-atom coordinates were preserved to simplify the adsorption/desorption pathway. Sampling was carried out over 10 ns of equilibration, with the protein initially relaxing away from the surface by ca. 3 Å, indicating that water and ions interact more strongly with the surface and prevent direct adsorption of the protein.

### APBS calculations

Continuum electrostatics calculations of the C98RhuA protein (appropriately protonated using PROPKA 3.162) were carried out using APBS 1.563 to reflect experimental solution self-assembly conditions (pH 7.5, [ions] = 20 mM, εsolvent = 78, εprotein = 2).

### Equilibration of 3D periodic crystals

Open-state 2 × 2 C98RhuA crystals were constructed as previously described23 for both p4 and p4212 symmetries. These crystals were then crosslinked to their periodic images via disulfide bonds, solvated in a 200.0 × 200.0 × 114.0 Å box (120,609 and 120,531 waters), and neutralized with Na+ ions (Joung-Cheatham). Initial coordinates were first minimized for 2000 steps with all protein atoms held fixed, followed by another 1000 step unrestrained minimization. The systems were then equilibrated for 5 ns (followed by 2 ns of production sampling) at constant pressure, keeping the xy-dimensions of the box held constant (final equilibrated periodic c dimension ≈ 104.5 Å). 100 kcal∙mol−1 · protein−1 restraints were employed to maintain the difference in Cα COM xy and z positions of diagonally opposed proteins at 0 Å, preserving the structure of the crystal in a minimally biased fashion. The coordinates for the open-state 2×2 p4 crystal (extracted after 1 ns of equilibration) were manipulated to generate all other p4 lattice conformations for the piezoelectricity simulations via rigid-body rotations/translations of the individual protein units to their idealized dimensions (while preserving the periodic bonding topology). Each conformation was solvated in its own periodic water box of dimensions a × b × 114.0 Å (where a = b = the xy unit cell dimensions of that conformation) to maintain the periodic disulfide bonding within the crystal plane and equilibrated at constant pressure as described above. Analogous simulations were carried out for F88/C98RhuA (final periodic box size: 182.5 × 182.5 × 124.8 Å). The average protein coordinates over the last 1 ns of simulation served as the model structures for all simulated tapping calculations.

### Calculation of the C98RhuA macrodipole moment

The dipole moment of C98RhuA was calculated from equilibrated protein structures using the VMD “measure dipole” command and visualized using the dipole watcher plugin, both of which yielded values of ca. 1200 D. Hydrogens and Zn ions were excluded. We verified the magnitude of the vector using the Protein Dipole Moments Server10 using PDB IDs: 1OJR, which yielded a value of 310 D per protein subunit (1240 D total) and 1GT7 (1134 D, full tetramer). We used 1200 D for all dipole potential energy calculations as a representative (and reflective of the equilibrated protein structure) estimate.

### Electret/piezoelectric simulations

The last frame of the equilibrated p4 and p4212 3D periodic crystal simulations were used as starting coordinates for slab-geometry electret simulations. The protein crystals were recentered such that their disulfides lay at z = 0, and solvent/ions were translated across periodic c vector until there were equal numbers of water molecules on each side (within tolerance of <0.1% difference). Ions were added to each side of the disulfides to concentrations of 200 mM NaCl such that there were equal numbers of each type of ion on both sides. Systems with neutral net charge and without neutralizing ions (−64 e) were constructed to test the effect of system neutrality on membrane potential (none observed). Protein Cα atoms were constrained to their starting z position using a 100 kcal∙mol−1 restraint to maintain an equal number of waters on each side over the course of the simulations. Equilibration was carried out for 10 ns using constant-area 2D periodic boundaries (a = b = 200.00 Å for p4212 lattices, and spanning 151.19–200.00 Å for different p4 lattice conformations; see Supplementary Table 2 for details), with full-system electrostatics calculated using the Multiscale Summation Method64 in NAMD 2.13, which is compatible with both 2D periodic boundaries as well as non-neutral systems. This permitted the calculation of membrane properties using a single crystal by avoiding the problem of ion diffusion back across the periodic boundary, which otherwise would require a double membrane system consisting of >900,000 atoms. Gentle boundary conditions (1 kcal∙mol−1 potentials) were enforced above and below the liquid phase to prevent the evaporation of water molecules using the TclBC module of NAMD. Simulations were carried out for 10 ns to ensure full equilibration.

### Disulfide dihedral energy calculations

Disulfide dihedral angles (as depicted in Supplementary Fig. 2) for the nonperiodic disulfide bonds of p4 and p4212 crystals were extracted from the 2 ns (2000 coordinates) of production sampling from the 3D periodic crystal simulations. The potential energy associated with each disulfide bond conformation was calculated according to the empirical formula described in ref. 65.

### Numerical C98RhuA electrochemical potential calculations

To directly calculate the steady-state voltage drop arising from ion segregation across RhuA crystals, volumetric electrostatic potential maps (exclusively due to ions) were computed for all 2D electret simulations using the PMEpot66 plugin in VMD for all 10 ns of sampling (10,000 configurations), split into 100-frame blocks. The maps were calculated over a × b × 200 Å volumes (a = b = 151.19–200.00 Å depending on lattice conformation) at 1 Å intervals, incorporating the influence of periodic images in the xy dimensions, while remaining pseudo-non-periodic in the z dimension by providing an appx. 100 Å air gap between periodic images. An Ewald factor of 0.25 Å−1 was used to smooth the charge potentials. The linear voltage drop across the crystal was quantified by integrating over the xy voxels within each 3D volume and projecting the average potential onto the z dimension using a Python script. The difference in electrostatic potential due to the ion distribution across the membrane was calculated, conservatively scaled by the dielectric of pure water (ε = 78), and multiplied by the factor 25.87 mV ∙ (kBT)−1 for a 300 K simulation. The 5–10 ns time-averaged membrane potentials for p4 and p4212 crystals were −56.09 ± 5.71 mV and −3.09 ± 5.85 mV, respectively. This potential for p4 crystals is in very close agreement with the potential calculated analytically (−57.96 mV) for open-state p4 crystals based purely on the density of dipoles within the lattice (see Supplementary Discussion), strongly suggesting that the macrodipole moment of RhuA crystals can indeed account for their predicted bulk properties.

### Pairwise C98RhuA nanoparticle electrostatic potentials

The electrostatic potential between individual C98RhuA proteins was computed using the pairwise potential between two charged nanoparticles (Uij) proposed by Phillies29 for dilute solutions of proteins and charged colloids (Eq. 1), as employed previously for modelling nanoparticle and protein interactions26,28,67.

$$U_{ij}left( {r_{ij},theta _{i,j},varphi _{i,j}} right) = , k_efrac{{q_iq_j}}{{r_{ij}}}e^{ – kappa r_{ij}}C_0^2 + k_efrac{{q_imu _jcos theta _j + q_jmu _icos theta _i}}{{r_{ij}^2}}e^{ – kappa r_{ij}}C_0C_1 \ + k_efrac{{mu _imu _j}}{{r_{ij}^3}}left{ {cos theta _jcos theta _jleft[ {2 + kappa r_{ij} + (kappa r_{ij})^2} right] \ + sin theta _isin theta _jcos (varphi _i – varphi _j)left[ {1 + kappa r_{ij}} right]} right}e^{ – kappa r_{ij}}C_1^2,$$

(1)

where C0 and C1 are given by:

$$C_0 = frac{{e^{kappa a}}}{{1 + kappa a}},$$

(2)

$$C_1 = frac{{3e^{kappa a}}}{{2 + 2kappa a + left( {kappa a} right)^2 + frac{{left( {1 + kappa a} right)}}{varepsilon }}}.$$

(3)

Here, qi and qj are the net charge of each C98RhuA protein (−16e), μi and μj are the dipole moments of each C98RhuA protein (1200 D), a the particle radius (chosen to be 4.0 nm, the radius for a circle of equivalent area), and rij is the center-center distance between particles. The angles θi,j are the dipole-rij angle for each protein, while (φiφj) is the dihedral angle between the two dipoles about the rij vector (Δφij). ke is Coulomb’s constant and given as (4πε0ε)−1; ε0 the permittivity of vacuum; ε the relative solvent permittivity (78); and κ−1 is the Debye screening length (2.1508 nm, corresponding to a solution of 20 mM ions).

The requirement of lateral disulfide bonding within the 2D plane of C98RhuA crystals enables Eq. 1 to be simplified considerably. It geometrically restricts the dipole moment of each particle pair to be oriented normal to the lattice (so θ = 90°), and Δφij = 0 (p4 symmetry) or 180 (p4212 symmetry). We can then reduce all sin(θ) terms to unity, and nullify all cos(θ) terms (comprising the entire charge-dipole term and first half of the dipole-dipole term). This gives us the reduced form of Uij (Eq. 4):

$$U_{ij}left( {r_{ij},Delta varphi _{ij}} right) = k_efrac{{q_iq_j}}{{r_{ij}}}e^{ – kappa r_{ij}}C_0^2 + k_efrac{{mu _imu _j}}{{r_{ij}^3}}cos (Delta varphi _{ij})left[ {1 + kappa r_{ij}} right]e^{ – kappa r_{ij}}C_1^2.$$

(4)

We can see that the first term is purely repulsive due to the identical net charge on both C98RhuA proteins, while the dipolar interaction is dependent on the pair dihedral angle, making it repulsive for p4 crystals (cos(0) = 1) but attractive for p4212 crystals (cos(180) = −1). The dihedral angle can also be allowed to vary continuously (i.e., accessible to a newly-attached monomer to the growing lattice edge), and results in an energy funnel towards the minimum energy configuration of antiparallel packing (Supplementary Fig. 3). This illustrates the origin of the thermodynamic selectivity for p4212 lattices in solution (Supplementary Fig. 4).

### Reporting summary

Further information on research design is available in the Nature Research Reporting Summary linked to this article.