# Measurement of SARS-CoV-2 RNA in wastewater tracks community infection dynamics

Sep 18, 2020

### Sample collection

Primary sewage sludge (40 ml) was collected from the East Shore Water Pollution Abatement Facility (ESWPAF) in New Haven, Connecticut, USA. A total of 73 samples were taken daily from March 19, 2020, to June 1, 2020, between 8:00 and 10:00 EDT, and stored at −80 °C before analysis (samples were not available on May 3 and 6). The first sampling dates preceded widespread testing in the region and the March 23, 2020, stay-at-home restrictions implemented throughout the State of Connecticut. Between the sampling start and end dates, the number of confirmed COVID-19 cases (by testing) in the cities served by the ESWPAF increased from seven to 3,978 (ref. 15). The plant serves an estimated population of 200,000 people with average treated flows of 1.75 m³ s⁻¹. Sludge collected from ESWPAF is primary sludge, sampled at the outlet of a gravity thickener, with a solids content ranging from 2.6% to 5%. The solids residence time in the gravity thickener is 4 h.
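
The sample count can be checked against the sampling window with a short calculation. The sketch below is illustrative only; the variable names are ours, and it assumes one sample per calendar day with both endpoints included.

```python
from datetime import date, timedelta

# Sampling window stated in the text (both endpoints assumed inclusive).
start, end = date(2020, 3, 19), date(2020, 6, 1)
all_days = [start + timedelta(days=i) for i in range((end - start).days + 1)]

# Days on which no sample was available, as stated in the text.
missing = {date(2020, 5, 3), date(2020, 5, 6)}
sampled_days = [d for d in all_days if d not in missing]

print(len(all_days), len(sampled_days))
```

Under these assumptions, the window spans 75 calendar days, and removing the two missing days leaves the 73 samples reported.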

### Viral RNA quantitative testing

To quantify SARS-CoV-2 RNA concentrations in primary sludge, 2.5 ml of well-mixed sludge was added directly to a commercial kit optimized for isolation of total RNA from soil (RNeasy PowerSoil Total RNA Kit, Qiagen). Two replicate RNA extractions and analyses were performed for each daily primary sludge sample. Isolated RNA pellets were dissolved in 50 μl of ribonuclease-free water, and total RNA was measured by spectrophotometry (NanoDrop, Thermo Fisher Scientific). SARS-CoV-2 RNA was quantified by one-step qRT–PCR using the U.S. Centers for Disease Control and Prevention (CDC) N1 and N2 primer sets (refs. 16,17). As a control, and in accordance with the CDC protocol, analysis was also conducted for the human RP gene (ref. 17), and SARS-CoV-2 results were reported only if RP detection was positive. Samples were analyzed using the Bio-Rad iTaq Universal Probes One-Step Kit in 20-µl reactions run at 50 °C for 10 min and 95 °C for 1 min, followed by 40 cycles of 95 °C for 10 s and 60 °C for 30 s, per the manufacturer’s recommendations.

SARS-CoV-2 RNA concentrations were determined using a standard curve as previously described (ref. 17) and are presented as virus RNA copies. For the standard curve, complementary DNA synthesized from full-length SARS-CoV-2 RNA (WA1-USA strain) was used as a template to generate SARS-CoV-2 N gene transcripts as previously described (ref. 17). The N gene was amplified, and the PCR amplicon was purified and used as the template in the MEGAscript T7 Kit (Thermo Fisher Scientific) to generate single-stranded RNA transcripts. RNA was quantified on a Qubit fluorimeter (Thermo Fisher Scientific), and its integrity was verified on a Bioanalyzer 2100 (Agilent). Viral RNA copies were calculated, and serial ten-fold dilutions were made. To validate our N1 and N2 primer sets, standard curves were analyzed using the ten-fold dilution series (5 × 10¹ to 5 × 10⁸ copies per reaction) of the N gene transcripts. The N1 primer set generated a standard curve with an R² value of 0.98 and an efficiency of 94.1% (slope = −3.473; y intercept = 42.266). The N2 primer set generated a standard curve with an R² value of 0.99 and an efficiency of 88.5% (slope = −3.632; y intercept = 42.528).
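
The reported efficiencies follow from the standard-curve slopes under the conventional log-linear qPCR model, Ct = slope × log10(copies) + intercept, with efficiency = 10^(−1/slope) − 1. The sketch below assumes that conventional model, which the text does not spell out. The function and dictionary names are illustrative and are not taken from the original analysis.

```python
# Slopes and y intercepts of the N1 and N2 standard curves reported in the text.
STANDARD_CURVES = {"N1": (-3.473, 42.266), "N2": (-3.632, 42.528)}


def amplification_efficiency(slope):
    """Efficiency implied by a standard-curve slope (assumed log10 model)."""
    return 10 ** (-1.0 / slope) - 1.0


def copies_from_ct(ct, primer):
    """Invert Ct = slope * log10(copies) + intercept for copies per reaction."""
    slope, intercept = STANDARD_CURVES[primer]
    return 10 ** ((ct - intercept) / slope)


for name, (slope, _) in STANDARD_CURVES.items():
    print(name, round(100 * amplification_efficiency(slope), 1))
```

With the reported slopes, this reproduces efficiencies of about 94.1% for N1 and 88.5% for N2. Under the same model, the y intercept is the Ct expected for a single copy per reaction.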

The SARS-CoV-2 concentration results were adjusted to the total RNA extracted by multiplying sample concentrations by the ratio of the maximum RNA concentration to the sample RNA concentration. This adjustment accounts for day-to-day variations in sludge solids content and RNA extraction efficiency. To determine whether sludge RNA extracts caused PCR inhibition, target RNA was spiked into three separate sterile, ribonuclease-free water samples (no inhibition) and five different sludge RNA extracts from samples collected at a time in the outbreak when cases were low and viral RNA was not detected with N1 primers. Spiked samples were then diluted 5× and 25×, and sludge RNA Ct values were compared to water RNA Ct values using N1 primers. No differences were observed between average water Ct values and sludge extract Ct values at no dilution (P = 0.14), 5× dilution (P = 0.51) or 25× dilution (P = 0.23) (two-tailed t-test), suggesting no PCR inhibition in the RNA extracts. All samples were nevertheless diluted 5× for use as a template to ensure that no qRT–PCR inhibition occurred. Sewage sludge from March 2018 was used as a control, and no SARS-CoV-2 was detected with either N1 or N2 primers. These control sludges were stored at −80 °C and were consistently positive for the human RP gene. Positive RNA controls and no-template controls were included in all qRT–PCR runs. Appropriate Ct values were observed for all positive controls, and no amplification was observed in no-template controls.
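
The total-RNA adjustment described above can be sketched as follows. This is a minimal illustration, assuming that the "maximum RNA concentration" is the maximum across the samples being adjusted. The function name and the input values are hypothetical.

```python
def normalize_to_total_rna(virus_copies, total_rna):
    """Scale each virus concentration by (max total RNA) / (sample total RNA)."""
    max_rna = max(total_rna)  # assumption: maximum over the supplied samples
    return [c * max_rna / r for c, r in zip(virus_copies, total_rna)]


# Hypothetical values: virus RNA copies per ml and total RNA (ng per µl).
virus_copies = [100.0, 200.0, 150.0]
total_rna = [50.0, 100.0, 75.0]
adjusted = normalize_to_total_rna(virus_copies, total_rna)
print(adjusted)
```

A sample with half the maximum total RNA yield has its virus concentration doubled, and the sample with the maximum yield is left unchanged.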

### Epidemiological data

Daily COVID-19 admissions to the Yale New Haven Hospital were compiled from hospital records, restricted to residents of New Haven, East Haven, Hamden and Woodbridge, Connecticut, and confirmed by laboratory testing. Hospital data were obtained from the Joint Data Analytics Team for the New Haven Health System. The total number and percentage of positive COVID-19 tests among residents of the four cities, reported by date of specimen collection and by date of reporting to the State of Connecticut, were supplied through a data request to the Connecticut Department of Public Health. Numbers of laboratory-confirmed positive COVID-19 tests by report date in the towns served by the ESWPAF (New Haven, East Haven, Hamden and Woodbridge, Connecticut) were compiled from daily reports published by the Connecticut Department of Public Health (ref. 15).

### Statistics

Linear regressions were used to estimate the relationship between the SARS-CoV-2 RNA copies per ml measured in replicate RNA extractions of each daily sample (n = 73 for each PCR primer set). Two-tailed t-tests (α = 0.05) were used in the PCR inhibition experiments to determine whether spiked sludge RNA extracts resulted in the same Ct values as spiked water samples at no dilution, 5× dilution and 25× dilution (n = 6 for water spiked samples and n = 15 for sludge spiked samples at each dilution).
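
As an illustration of the replicate comparison, the sketch below fits an ordinary least-squares line to paired replicate measurements and reports the slope, intercept and R². The function name and the data are hypothetical, and the original analysis may have used different software or settings.

```python
def linear_fit(x, y):
    """Ordinary least-squares fit of y on x; returns (slope, intercept, r_squared)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = sxy ** 2 / (sxx * syy)
    return slope, intercept, r_squared


# Hypothetical replicate pairs (virus RNA copies per ml), not measured data.
replicate_1 = [1.0, 2.0, 3.0, 4.0]
replicate_2 = [3.0, 5.0, 7.0, 9.0]
slope, intercept, r_squared = linear_fit(replicate_1, replicate_2)
print(slope, intercept, r_squared)
```

For well-behaved replicate extractions, one would expect a slope near 1, an intercept near 0 and a high R².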

Evaluation of primary sludge as a potential leading indicator was performed using a distributed lag measurement error time series model. This analysis was carried out in the Bayesian framework, allowing us to correctly characterize multiple sources of uncertainty when estimating the lagged associations of interest. In the analyses, we assumed that the observed sludge testing data represent unbiased estimates of an underlying, unobserved trajectory of viral concentration in the sludge. We then evaluated the association between this underlying trajectory at multiple lagged periods and each of four outcomes: the number of positive tests (based on date of specimen collection), the percentage of positive tests (based on date of specimen collection), hospitalizations and the number of positive tests (based on date of report). These associations were estimated using distributed lag Poisson regression models that included a random effect to account for overdispersion and autocorrelation in the outcome. Sample sizes were n = 73 for sludge virus RNA samples, n = 75 for daily positive tests by date of specimen collection, n = 75 for daily percentage of positive tests by date of specimen collection, n = 75 for daily reported hospitalizations and n = 75 for daily reported positive tests. The distributed lag regression parameters were modeled using a random walk process. The models were fit using the rjags package in R (ref. 18). Mathematical models include the following:

Model for case counts:

$$Y_t \mid \lambda_t \sim \mathrm{Poisson}\left(\lambda_t\right),\quad t = d + 1, \ldots, n - 1$$

$$\ln\left(\lambda_t\right) = O_t + \beta_0 + \sum_{j=1}^{d} \beta_j x_{t-j} + \phi_t$$

$$\phi_t \mid \alpha, \phi_{t-1}, \sigma_\phi^2 \sim \mathrm{Normal}\left(\alpha \phi_{t-1}, \sigma_\phi^2\right);\quad \phi_0 \equiv 0$$

Model for primary sludge:

$$W_{tj} \sim \mathrm{Normal}\left(x_t, \sigma_\epsilon^2\right),\quad j = 1, \ldots, m$$

$$x_t \sim \mathrm{Normal}\left(0, 100^2\right)$$

Prior distributions:

$$\beta_0, \mu \sim \mathrm{Normal}\left(0, 100^2\right)$$

$$\beta_j \mid \beta_{j-1}, \sigma_\beta^2 \sim \mathrm{Normal}\left(\beta_{j-1}, \sigma_\beta^2\right),\quad j = 2, \ldots, d$$

$$\beta_1 \sim \mathrm{Normal}\left(0, 100^2\right)$$

$$\sigma_\phi, \sigma_\epsilon, \sigma_\beta \sim \mathrm{Uniform}\left(0, 100\right)$$

$$\alpha \sim \mathrm{Uniform}\left(0, 1\right)$$

where $d$ is the number of past lags included in the model, $n$ is the total number of days of available data, $m$ is the number of primary sludge replicates on a given day, $Y_t$ is the case count on day $t$, $O_t$ is the offset on day $t$ (number of tests performed on each day for the analysis of cases by test date or 0 otherwise), $W_{tj}$ is the measured concentration of virus in sludge from sample replicate $j$ on day $t$, and $x_t$ is the unobserved true concentration of virus in sludge on day $t$.
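
To make the structure of the case-count model concrete, the sketch below evaluates the distributed lag linear predictor and the corresponding Poisson log-likelihood for fixed parameter values. It is not the rjags implementation used in the analysis. It assumes lags j = 1, …, d (matching the priors on the lag coefficients), uses 0-based day indices, and evaluates every day that has d earlier sludge values. All function names and inputs are hypothetical.

```python
import math


def log_rate(t, x, beta0, beta, offset, phi):
    """ln(lambda_t) = O_t + beta_0 + sum_{j=1..d} beta_j * x_{t-j} + phi_t.

    beta[j - 1] holds the coefficient for lag j; t is a 0-based day index.
    """
    d = len(beta)
    lagged = sum(beta[j - 1] * x[t - j] for j in range(1, d + 1))
    return offset[t] + beta0 + lagged + phi[t]


def poisson_loglik(y, x, beta0, beta, offset, phi):
    """Poisson log-likelihood summed over days with d earlier sludge values."""
    d = len(beta)
    total = 0.0
    for t in range(d, len(y)):
        eta = log_rate(t, x, beta0, beta, offset, phi)
        total += y[t] * eta - math.exp(eta) - math.lgamma(y[t] + 1)
    return total


# Hypothetical latent sludge concentrations and parameters (d = 2 lags).
x = [1.0, 2.0, 3.0, 4.0]
zeros = [0.0] * len(x)
eta_day_2 = log_rate(2, x, beta0=0.1, beta=[0.5, 0.25], offset=zeros, phi=zeros)
print(eta_day_2)
```

In the full Bayesian model, the latent concentrations, lag coefficients and random effects are all unknown and are sampled jointly. This sketch only shows how one set of values maps to an expected case count.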

### Reporting Summary

Further information on research design is available in the Nature Research Reporting Summary linked to this article.