Yeast strains and plasmids
The S. cerevisiae strain used in this study was BY4741 (MATa his3 leu2 met15 ura3). A standard cloning procedure (ref. 1) was used to tag the C terminus of the target protein with spytag and 3xFLAG. The strains and plasmids are available upon request.
Cell culture, fixation, and permeabilization
Fresh colonies of the yeast strain were grown in YPD to an OD600 of ~0.5 (10 mL culture). Cells were then fixed with 1% w/v formaldehyde (Thermo Scientific, 28908) at 30 °C for 30 min with gentle shaking. Cells were harvested and washed three times with buffer B (1.2 M sorbitol/0.1 M sodium phosphate, pH 7.4). The cells were spheroplasted using 100 µg Zymolyase (Zymo Research, E1006) and 10 µL fresh beta-mercaptoethanol in 1 mL of cell suspension in buffer B for 10 min at 37 °C with gentle shaking. After spheroplasting, the cells were gently washed three times with buffer B. Cells were post-fixed in 1% w/v formaldehyde in 1X PBS/0.6 M KCl for 30 min at RT and then washed three more times with buffer B.
Spycatcher-DNA oligo conjugate synthesis
The strategy for synthesizing the spycatcher-DNA oligo conjugate is shown in Supplementary Fig. 1. Spycatcher with a 6xHis-tag and a cysteine-containing sequence at the C terminus was expressed in a BL21-derived strain (NEB, C2566H, T7 Express competent E. coli) and purified using a standard Ni-NTA purification method. To prepare spycatcher-methyltetrazine, spycatcher was reduced with TCEP (Thermo Scientific, 77720) to remove any intermolecular disulfide bonds. Excess TCEP was then removed with a PD-10 desalting column (GE Healthcare). The spycatcher was reacted with maleimide-(PEG)4-methyltetrazine (Click Chemistry Tools, 1068-10) via the free thiol group of the reduced cysteine residue, and the reaction product (spycatcher-methyltetrazine) was separated from unreacted maleimide-(PEG)4-methyltetrazine on a PD-10 column. To prepare trans-cyclooctyne (TCO)-oligo, a 5′-amine-modified oligonucleotide (IDT DNA) was reacted with TCO-(PEG)4-NHS ester (Click Chemistry Tools, A137-2), and the reaction mixture was purified by HPLC using a C8 column. Finally, to prepare the spycatcher-DNA oligo conjugate, spycatcher-methyltetrazine was reacted with an equimolar amount of TCO-oligo via the click reaction between methyltetrazine and TCO (Supplementary Fig. 1a). The spycatcher-oligo conjugate was purified from unreacted spycatcher and TCO-oligo by ion-exchange chromatography (Supplementary Fig. 1b) and stored in 50% glycerol in 1X PBS at −20 °C until further use.
In situ DNA oligo tagging
The fixed cells were reacted with 10 µM spycatcher-DNA oligo conjugate in 1X PBS/0.6 M KCl solution containing protease inhibitor cocktail (Sigma Aldrich, SRE0055). The formaldehyde fixation and spheroplasting steps permeabilized the cells so that the spycatcher-oligo conjugate could enter the cells and react with the spytag-fused protein. The reaction was incubated for 2 h at RT with gentle shaking. After the spycatcher-oligo reaction, the cells were washed three times with buffer B.
Pool-split combinatorial barcoding with T7 ligation
After in situ DNA tagging, cells were distributed into a 96-well plate, with ~10^6 cells per well. T7 ligation reaction buffer containing T7 ligase (NEB, M0318S), 1st round ligation adapter (5 µM), and 1st round barcoding oligos (5 µM) was added to each well. The plate was incubated for 2 h at RT with gentle shaking. After the 1st round of barcode ligation, cells were pooled, washed three times with buffer B, and distributed into another 96-well plate. T7 ligation reaction buffer containing T7 ligase, 2nd round ligation adapter (5 µM), and 2nd round barcoding oligos (5 µM) was added to each well. The plate was incubated for 2 h at RT with gentle shaking. All barcode sequences used in this work were taken from the NEBNext 96 single index kit (NEB, E6609) and are listed in Supplementary Data 2. After the 2nd round of barcode ligation, cells were pooled and washed three times with buffer B. Cell morphology was checked under the microscope after spycatcher-oligo conjugation, 1st ligation, and 2nd ligation to confirm that the cells remained intact throughout the procedure (Supplementary Fig. 3). The cell density was measured using a hemocytometer, and a cell suspension containing 900 cells was aliquoted using flow cytometry.
For “dummy” sample preparation, we first synthesized a spycatcher-DNA oligo conjugate carrying the dummy sequence using the method described above. Cells were then reacted with the spycatcher-dummy oligo conjugate and ligated sequentially with 1st round and 2nd round barcode oligos as described above, but without pool-splitting. The dummy sample has a different sequence in the PCR handle regions, so it is not amplified by the primers used for Illumina sequencing library preparation (Supplementary Fig. 4a). In addition, the 3′ end of the 2nd ligation oligo is modified with the rhodamine dye TAMRA to enable visualization of the ligation bands in gel analysis with a Typhoon scanner (Supplementary Fig. 4b, c). The dummy sample was mixed with the aliquot of real barcoded sample (~900 cells) for further analysis.
Gel electrophoresis and protein–DNA complex recovery
2X Laemmli buffer (Bio-Rad, 1610737) was added to the cells (containing both dummy cells and barcoded cells), and the mixture was heated at 95 °C for 10 min. This heating step reversed the formaldehyde crosslinks. The sample was then loaded onto a 10% dissolvable polyacrylamide gel. The dissolvable PAGE gel was made with a labile crosslinker, ethylene-glycol-diacrylate (EDA) (Sigma Aldrich, 41608), which allows a high recovery yield from the gel (ref. 2). The target protein-oligo conjugate bands were visualized by imaging TAMRA fluorescence with a Typhoon scanner. The bands were excised from the gel, and the protein–oligo complexes were recovered. We also excised and extracted a blank gel piece (Supplementary Fig. 5a) to estimate the background introduced during gel electrophoresis.
Library preparation and sequencing
Two rounds of PCR amplification were carried out for next-generation sequencing library preparation. 10% of the material recovered from the gel was used for PCR amplification. First, the DNA part of the protein–DNA conjugate was amplified using its PCR handle. Then, in the second-round PCR, sequencing adapters were appended using NEBNext Multiplex Oligos for Illumina (NEB). The amplification conditions for the first-round PCR were as follows: 95 °C for 1 min, then 10–15 cycles of 95 °C for 10 s, 62 °C for 15 s, and 65 °C for 30 s, and a final extension at 65 °C for 3 min. The number of cycles required for the first-round PCR was determined by analyzing a small aliquot of the sample on a qPCR machine and was set to the cycle at which exponential-phase amplification began. The amplification conditions for the second-round PCR were as follows: 95 °C for 1 min, then 4 cycles of 95 °C for 10 s, 62 °C for 15 s, and 65 °C for 30 s, and a final extension at 65 °C for 3 min. After each round of PCR, the amplicons were separated on a 3% agarose gel and purified using a gel extraction kit (Thermo Scientific, K210012) without heating. The PCR-amplified library was quantified using a Qubit High-sensitivity DNA kit (Invitrogen). The final purified amplicons were sequenced on a HiSeq 2500 (Illumina) with a targeted read depth of 5–25 million reads per gel band.
To estimate the “collision” rate (the fraction of barcodes representing two or more cells), we simulated the sampling process (Supplementary Data 2) using the procedure described in previous work (ref. 3). We found that, with 9216 possible barcode combinations (96 × 96), sampling 900 cells results in an expected collision rate lower than 5%. Therefore, we aliquoted 900 cells in the experiment for the following analysis.
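The sampling argument can be illustrated with a short sketch. It is not the simulation from Supplementary Data 2 or from the cited previous work. It assumes that each cell receives one of the 96 × 96 barcode combinations uniformly and independently, and that the collision rate is the fraction of observed barcodes shared by two or more cells. The function names, trial count, and seed are illustrative choices.

```python
import random
from collections import Counter


def expected_collision_rate(n_cells, n_barcodes):
    """Approximate rate as E[collided barcodes] / E[occupied barcodes].

    Assumes uniform, independent barcode assignment (an assumption of this sketch).
    """
    p_empty = (1 - 1 / n_barcodes) ** n_cells
    occupied = n_barcodes * (1 - p_empty)                          # barcodes seen at least once
    singles = n_cells * (1 - 1 / n_barcodes) ** (n_cells - 1)      # barcodes seen exactly once
    return (occupied - singles) / occupied


def simulate_collision_rate(n_cells, n_barcodes, n_trials=1000, seed=0):
    """Monte Carlo estimate of the same quantity, averaged over n_trials."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_trials):
        # Draw one barcode per cell and count how many cells share each barcode.
        counts = Counter(rng.randrange(n_barcodes) for _ in range(n_cells))
        collided = sum(1 for c in counts.values() if c >= 2)
        total += collided / len(counts)
    return total / n_trials


N_BARCODES = 96 * 96  # two rounds of 96 barcodes each
analytic = expected_collision_rate(900, N_BARCODES)
simulated = simulate_collision_rate(900, N_BARCODES)
print(f"analytic: {analytic:.4f}, simulated: {simulated:.4f}")
```

Under these assumptions both estimates fall just below 5% for 900 cells. The result depends on how the collision rate is defined: counting the fraction of cells that share a barcode, rather than the fraction of barcodes, would give a larger number.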
The sequencing reads were first filtered based on the constant region of the oligo (the PCR handle, the first T7 ligation site, and the second T7 ligation site). Reads with more than one mismatch against the constant region were discarded. The 1st round and 2nd round cell barcodes were then concatenated to generate the full cell barcode. Reads whose cell barcode did not match the set of barcode combinations (9216 in total) were discarded. The number of reads for each barcode was then calculated, and real-cell barcodes were distinguished from spurious cell barcodes because the former have a much higher number of reads than the latter (Supplementary Fig. 5b). Real barcodes were found in both the H2B sample and the H2Bub sample but not in the background sample (Supplementary Fig. 5c). In addition, the number of unique UMIs was significantly lower in the background band than in the targeted protein band, indicating that the gel background is low.
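The filtering steps above can be sketched as follows. The read layout, the constant sequences, and the 6-nt barcode length are placeholders chosen for illustration and are not the published oligo design; only the 12-nt UMI length, the one-mismatch tolerance, and the whitelist of round-1 × round-2 combinations come from the text.

```python
from collections import Counter

# Toy constants (hypothetical; the real sequences are not given here).
HANDLE, LIG1, LIG2 = "ACGTAC", "GGTT", "CCAA"
UMI_LEN, BC_LEN = 12, 6
# Assumed read layout: handle, UMI, barcode 1, ligation site 1, barcode 2, ligation site 2.
LAYOUT = [("handle", len(HANDLE)), ("umi", UMI_LEN), ("bc1", BC_LEN),
          ("lig1", len(LIG1)), ("bc2", BC_LEN), ("lig2", len(LIG2))]
CONSTANT = HANDLE + LIG1 + LIG2


def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def parse_read(read):
    """Split a read into named segments; return None if the read is too short."""
    parts, pos = {}, 0
    for name, length in LAYOUT:
        segment = read[pos:pos + length]
        if len(segment) < length:
            return None
        parts[name] = segment
        pos += length
    return parts


def count_reads_per_cell(reads, round1, round2, max_mismatch=1):
    """Apply the constant-region and whitelist filters, then count reads per cell barcode."""
    whitelist = {a + b for a in round1 for b in round2}
    counts = Counter()
    for read in reads:
        parts = parse_read(read)
        if parts is None:
            continue
        observed = parts["handle"] + parts["lig1"] + parts["lig2"]
        if hamming(observed, CONSTANT) > max_mismatch:
            continue  # more than one mismatch in the constant region
        cell_barcode = parts["bc1"] + parts["bc2"]
        if cell_barcode in whitelist:
            counts[cell_barcode] += 1
    return counts


round1, round2 = ["AAAAAA", "CCCCCC"], ["GGGGGG", "TTTTTT"]
umi = "ACGTACGTACGT"
reads = [
    HANDLE + umi + "AAAAAA" + LIG1 + "GGGGGG" + LIG2,        # perfect read
    "ACGTAA" + umi + "AAAAAA" + LIG1 + "GGGGGG" + LIG2,      # one mismatch: kept
    "ACGTAA" + umi + "CCCCCC" + "GGTA" + "TTTTTT" + LIG2,    # two mismatches: dropped
    HANDLE + umi + "AAAAAC" + LIG1 + "GGGGGG" + LIG2,        # barcode not in whitelist
    HANDLE + umi + "CCCCCC" + LIG1 + "TTTTTT" + LIG2[:2],    # truncated read
]
counts = count_reads_per_cell(reads, round1, round2)
print(counts)
```

The final separation of real-cell barcodes from spurious ones by read count is not shown, because no numeric threshold is stated in the text.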
To verify that the UMIs had enough coding space to encode all the proteins in single cells, we counted how many unique UMIs we could identify from the sequencing results when we computationally shortened the UMIs (Supplementary Fig. 6a, b). The number of unique UMIs increased with UMI length and reached a plateau at around 10 nt, indicating that the UMI length used (12 nt) provides enough coding space to encode all proteins in single cells. To verify that the sequencing depth was high enough to sample all the UMIs, we computationally subsampled the sequencing reads and calculated how many of the observed UMIs were associated with single-cell barcodes (Supplementary Fig. 6c). As sequencing depth increased, the number of uniquely identified UMIs increased and reached a plateau at full sequencing depth (1.0), indicating that the UMIs were sufficiently sampled. It should be noted that different sequencing depths were needed to saturate the UMIs of different proteins. For example, 25 million reads were needed for the H2B sample, whereas only 5 million reads were required to saturate the H2Bub library. This reflects the different complexity of these two libraries, which agrees with the different copy numbers of these two proteins inside the cells.
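Both saturation checks amount to counting distinct UMIs under a computational perturbation, which the following minimal sketch illustrates on toy data. It assumes that UMIs are shortened by keeping their first k nucleotides and that reads are subsampled without replacement. It also ignores the grouping by cell barcode for brevity. The function names and the simulated reads are illustrative and do not reproduce the authors' pipeline.

```python
import random


def unique_umis_by_length(umis, lengths):
    """Count distinct UMIs after truncating each UMI to its first k nt (assumed scheme)."""
    return {k: len({u[:k] for u in umis}) for k in lengths}


def unique_umis_by_depth(read_umis, fractions, seed=0):
    """Count distinct UMIs in nested random subsamples of the reads."""
    rng = random.Random(seed)
    shuffled = list(read_umis)
    rng.shuffle(shuffled)
    # Taking prefixes of one shuffle gives nested subsamples drawn without replacement.
    return {f: len(set(shuffled[:int(round(f * len(shuffled)))])) for f in fractions}


# Toy data: 200 molecules with random 12-nt UMIs, sequenced to ~20 reads each.
rng = random.Random(1)
molecules = ["".join(rng.choice("ACGT") for _ in range(12)) for _ in range(200)]
reads = [rng.choice(molecules) for _ in range(4000)]

by_length = unique_umis_by_length(reads, range(1, 13))
by_depth = unique_umis_by_depth(reads, [0.1, 0.25, 0.5, 1.0])
print(by_length)
print(by_depth)
```

In both outputs, a curve that flattens before the last point (12 nt, or a depth fraction of 1.0) is the signature of saturation described above.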
Statistics and reproducibility
ImageJ software was used to analyze western blot images. Single-cell protein copy number data were processed using Microsoft Excel. Results are shown as mean ± S.D. All statistical analyses are described in the figure legends. Statistical significance was assessed using Welch’s t-test. p-values of 0.05 or less were considered statistically significant, and exact p-values are presented in the figures.
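For reference, Welch’s t-test compares two group means without assuming equal variances. The sketch below computes the test statistic and the Welch–Satterthwaite degrees of freedom using only the Python standard library. The input values are made-up numbers for illustration and are not data from this study, and the function name is a hypothetical choice.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Return Welch's t statistic and the Welch-Satterthwaite degrees of freedom."""
    va = variance(a) / len(a)  # squared standard error of the mean, group a
    vb = variance(b) / len(b)  # squared standard error of the mean, group b
    t = (mean(a) - mean(b)) / sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


# Illustrative values only.
group_a = [1, 2, 3, 4, 5]
group_b = [2, 4, 6, 8, 10]
t_stat, dof = welch_t(group_a, group_b)
print(f"t = {t_stat:.4f}, df = {dof:.4f}")
```

Converting t and the degrees of freedom to a p-value requires the t-distribution; if SciPy is available, `scipy.stats.ttest_ind(a, b, equal_var=False)` performs the full test.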
Further information on research design is available in the Nature Research Reporting Summary linked to this article.