The female urinary microbiota in relation to the reproductive tract microbiota

Human urine is traditionally considered to be sterile, and whether the urine harbours distinct microbial communities has been a matter of debate. Potential links between female urine and reproductive tract microbial communities are currently unclear. Here, we collected urine samples from 147 Chinese women of reproductive age and explored the nature of colonization by 16S rRNA gene amplicon sequencing, quantitative real-time PCR, and live bacteria culture. To demonstrate the utility of this approach, the intra-individual Spearman’s correlation was used to explore the relationship between the microbiota of the urine and those of multiple sites of the female reproductive tract. PERMANOVA was also performed to explore potential correlations between lifestyle and clinical factors and the urinary bacterial communities. Our data demonstrated distinct bacterial communities in urine, indicative of a non-sterile environment. The Streptococcus-dominated, Lactobacillus-dominated, and diverse types were the three most common urinary bacterial community types in the cohort. Detailed comparison of the urinary microbiota with the microbiota at multiple sites of the female reproductive tract demonstrated that the urinary microbiota were more similar to the microbiota in the cervix and uterine cavity than to those of the vagina in the same women. Our data demonstrate the potential connectivity among microbiota in the female urogenital system and provide insight and resources for exploring diseases of the urethra and genital tract.

Figure: Protocol collection for sequencing and analysing female urinary microbiota. https://www.protocols.io/widgets/doi?uri=dx.doi.org/10.17504/protocols.io.bp3wmqpe

Urine samples from an additional 10 women were collected for validation purposes by a doctor during surgery in July 2017. For each operation, a urine catheter was inserted into the disinfected urethra to collect mid-stream urine. For each urine sample collected through a catheter, an identical volume of saline solution was used as the control sample. The samples were then placed at 4 °C, transported to BGI-Shenzhen, and processed within 6 hours. A portion of each sample was used for culturing live bacteria and the rest was used for sequencing.

DNA extraction and 16S rRNA amplicon sequencing
Genomic DNA was extracted following the protocol described in [9]. Primers 515F and 907R were used for PCR amplification of the V4-V5 hypervariable regions of the bacterial 16S rRNA gene. The 907R primer includes a unique barcoded fusion. The primer sequences were 515F: 5′-GTGCCAGCMGCCGCGGTAA-3′ and 907R: 5′-CCGTCAATTCMTTTRAGT-3′, where M denotes A or C and R denotes a purine (A or G). The PCR conditions were: 3 min of denaturation at 94 °C, followed by 25 cycles of 45 s at 94 °C (denaturing), 60 s at 50 °C (annealing), and 90 s at 72 °C (elongation), and a final elongation for 10 min at 72 °C. The amplification products were purified with the AxyPrep™ Mag PCR Clean-Up Kit (Axygen, USA). The amplicon libraries were constructed with an Ion Plus Fragment Library Kit (Thermo Fisher Scientific Inc.) [10], then sequenced on the Ion PGM™ Sequencer with the Ion 318™ Chip v2 at a read length of 400 bp (Thermo Fisher Scientific Inc., Ion PGM™ Hi-Q™ OT2 Kit, Cat.No: A27739; Ion PGM™ Hi-Q™ Sequencing Kit, Cat.No: A25592) [11]. All experiments were performed in the laboratory of BGI-Shenzhen.
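Because both primers contain degenerate positions, each one stands for a small set of concrete oligonucleotides. The following minimal Python sketch enumerates them; the function name `expand_primer` and the lookup table are illustrative only and are not part of the published workflow.

```python
from itertools import product

# IUPAC codes needed for the two primers: M = A or C, R = purine (A or G).
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "M": "AC", "R": "AG"}

PRIMER_515F = "GTGCCAGCMGCCGCGGTAA"
PRIMER_907R = "CCGTCAATTCMTTTRAGT"


def expand_primer(primer):
    """Return every concrete sequence encoded by a degenerate primer."""
    choices = [IUPAC[code] for code in primer]
    return ["".join(combo) for combo in product(*choices)]


variants_515f = expand_primer(PRIMER_515F)  # one degenerate position (M)
variants_907r = expand_primer(PRIMER_907R)  # two degenerate positions (M, R)
```

The 515F primer expands to two sequences and the 907R primer to four, which is the set a primer-matching step has to tolerate.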

Processing of sequencing reads
The raw sequencing reads were first processed with Mothur (Mothur, RRID:SCR_011947; V1.33.3) [12] to filter out low-quality reads meeting any of the following criteria: (1) reads shorter than 200 bp; (2) reads not matching the degenerate PCR primers with up to two errors allowed; (3) reads with an average quality score below 25. A total of 8,812,607 reads were obtained, with an average of 57,225 reads per sample (minimum 1,113 reads; maximum 194,564 reads). Subsequently, sequences with identity greater than 97% were clustered into operational taxonomic units (OTUs) using the uclust programme in QIIME (QIIME, RRID:SCR_008249; V1.8.0) [13], with each cluster taken to represent a species. The seed sequence of each OTU was aligned against the Greengenes reference sequences (gg_13_8_otus) for annotation using Mothur. The detailed analysis workflow was deposited in protocols.io [14].
We also calculated UniFrac distances using QIIME based on abundance profiles at the OTU level [11].
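The three read-filtering criteria above can be illustrated with a short Python sketch. This is not the Mothur implementation: it assumes the primer sits at the very start of the read and counts substitutions only, and the names `primer_mismatches` and `keep_read` are hypothetical.

```python
# IUPAC codes needed for the primers used in this study (M = A/C, R = A/G).
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "M": "AC", "R": "AG"}


def primer_mismatches(read, primer):
    """Count substitutions between the read's 5' end and a degenerate primer."""
    if len(read) < len(primer):
        return len(primer)  # read too short to contain the primer
    return sum(base not in IUPAC[code] for base, code in zip(read, primer))


def keep_read(seq, quals, primer, min_len=200, max_primer_errors=2, min_mean_q=25):
    """Return True if a read survives all three filters described in the text."""
    if len(seq) < min_len:                                # criterion (1)
        return False
    if primer_mismatches(seq, primer) > max_primer_errors:  # criterion (2)
        return False
    if sum(quals) / len(quals) < min_mean_q:              # criterion (3)
        return False
    return True
```

A read is discarded as soon as it fails one criterion, so the cheap length check runs before the primer comparison.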

PERMANOVA on the influence of phenotypes
Permutational multivariate analysis of variance (PERMANOVA) was used to assess the effect of different covariates on the relative abundances of OTUs in the samples [15,16], using Bray-Curtis and UniFrac distances and 9999 permutations, as implemented in the vegan package (vegan, RRID:SCR_011950) in R [16,17].
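To make the test concrete, the sketch below implements a one-factor PERMANOVA on Bray-Curtis distances in plain Python. It is a simplified stand-in for the vegan routine used in the study, not a reproduction of it: it handles a single categorical covariate only, and the function names are illustrative.

```python
import random
from itertools import combinations


def bray_curtis(x, y):
    """Bray-Curtis dissimilarity between two abundance profiles."""
    num = sum(abs(a - b) for a, b in zip(x, y))
    den = sum(a + b for a, b in zip(x, y))
    return num / den if den else 0.0


def pseudo_f(dist, labels):
    """Pseudo-F statistic from a distance matrix and group labels."""
    n = len(labels)
    groups = {}
    for idx, lab in enumerate(labels):
        groups.setdefault(lab, []).append(idx)
    ss_total = sum(dist[i][j] ** 2 for i, j in combinations(range(n), 2)) / n
    ss_within = sum(
        sum(dist[i][j] ** 2 for i, j in combinations(members, 2)) / len(members)
        for members in groups.values()
    )
    a = len(groups)
    return ((ss_total - ss_within) / (a - 1)) / (ss_within / (n - a))


def permanova(profiles, labels, n_perm=9999, seed=0):
    """Return (observed pseudo-F, permutation p-value)."""
    n = len(profiles)
    dist = [[bray_curtis(profiles[i], profiles[j]) for j in range(n)] for i in range(n)]
    f_obs = pseudo_f(dist, labels)
    rng = random.Random(seed)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # break the link between samples and groups
        if pseudo_f(dist, shuffled) >= f_obs:
            hits += 1
    return f_obs, (hits + 1) / (n_perm + 1)
```

The p-value counts the observed arrangement as one of the permutations, which is why 1 is added to both numerator and denominator.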

Quantitative real-time PCR
We quantified four Lactobacillus species, L. iners, L. jensenii, L. crispatus and L. gasseri, using a modified qPCR protocol [18]. SYBR Premix Ex Taq GC (TAKARA) was used and the reactions were run on a StepOnePlus Real-time PCR System (Life Technologies). Each PCR reaction mixture contained 10 μl of 2× SYBR Premix Ex Taq GC, 0.2 μM forward primer, 0.2 μM reverse primer, 1.6 μl of DNA sample, and 8.2 μl of ultrapure water to make up the final reaction volume of 20 μl. Each run included a standard curve and all samples were amplified in triplicate. Ultrapure water was used as the blank control template.
To construct the standard curves, sequencing-confirmed plasmids of the four species were quantified with a Qubit Fluorometer and serially diluted 10-fold. The amplification efficiency was (100 ± 10)% and linearity values were all ≥0.99. The detailed procedure was deposited in protocols.io [19].
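The efficiency and linearity figures above come from a linear fit of threshold cycle (Ct) against log10 copy number. The Python sketch below shows the standard calculation, efficiency = 10^(-1/slope) - 1. It assumes that the reported linearity value is the coefficient of determination of this fit, which the text does not state explicitly, and the function names are illustrative.

```python
import math


def fit_standard_curve(copies, ct_values):
    """Least-squares fit of Ct against log10(copy number).

    Returns (slope, intercept, r_squared, efficiency).
    """
    xs = [math.log10(c) for c in copies]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ct_values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ct_values)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ct_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = sxy ** 2 / (sxx * syy)
    efficiency = 10 ** (-1 / slope) - 1  # 1.0 corresponds to 100%
    return slope, intercept, r_squared, efficiency


def copies_from_ct(ct, slope, intercept):
    """Invert the standard curve to estimate copy number from a sample Ct."""
    return 10 ** ((ct - intercept) / slope)
```

A slope close to -3.32 corresponds to perfect doubling per cycle, so an accepted range of 90% to 110% efficiency translates to slopes of roughly -3.6 to -3.1.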

Bacterial culturing
The urine samples and controls from 10 additional subjects were cultivated in the laboratory by spreading 100 μl of sample on different agars containing 5% horse blood, namely PYG agar (DSMZ 104 medium), BHI agar, and EG agar. The plates were incubated under both aerobic and anaerobic conditions at 37 °C for 72 hours. To keep the medium anaerobic during culture, cysteine-HCl was added as a reducing agent together with resazurin, which serves as a redox indicator. Genomic DNA of the isolates was extracted with the Bacterial DNA Kit (OMEGA) and the 16S rRNA gene was amplified using the universal primers 27F/1492R [20]. The amplicons were purified and sent for Sanger sequencing. The resulting sequences were then submitted to BLAST on EzBioCloud [21] for identification.

PRELIMINARY ANALYSIS AND VALIDATION

Microbiota composition of the urine
To explore the urinary microbiota in this dataset, morning midstream urine (UR) was self-collected prior to surgery from an exploratory cohort of 137 Chinese women recruited for the study (median age 31.6, range 22-48). As with our previous vagino-uterine microbiota study [2], all volunteers had conditions that were not known to involve infections [8]. In 95 women of the cohort, six locations within the female reproductive tract were also sampled: the lower third of the vagina (CL), the posterior fornix (CU), cervical mucus (CV), endometrium (ET), left and right fallopian tubes (FLL and FRL), and peritoneal fluid (PF). Their vagino-uterine microbiota information has been published previously [2]. After 16S rRNA gene amplicon sequencing, the sequencing reads were pre-processed for quality control and filtering, then clustered into OTUs (Methods, Table 1 and OTU_table_urine.biom.hdf5 [8]).
Because of female anatomy, voided urine samples are considered to be easily contaminated by microbiota from the surrounding vulvovaginal region [22]. Most vaginal communities (88%) in this cohort were dominated by a single genus with >50% relative abundance within an individual. In contrast, the urinary microbiota in this study showed more heterogeneity. In 56.93% of the cohort, the urine harboured a diverse type represented by genera including Streptococcus, Lactobacillus, Pseudomonas, Staphylococcus, Acinetobacter, and Vagococcus, none of which was dominant, i.e. reached >50% relative abundance (Figure 2). In addition, 22.63% of the women harboured >50% Streptococcus, and 13.87% of the women harboured >50% Lactobacillus (Figure 2A, B). Rare types such as Enterococcus (2.19%), Bifidobacteriaceae, and Veillonella (0.73%) were also detected in this cohort (Figure 2A, B). Notably, the median relative abundances of Lactobacillus, Pseudomonas, and Acinetobacter in the urine samples were more similar to those in the uterus samples (Figure 2C) [2]. At the phylum level, the urinary microbiota were dominated by Firmicutes and Proteobacteria (Figure 2C).

Figure 2. The ratio of different urinary microbiota types. The genus whose relative abundance accounted for >50% in an individual was selected as the identified type. Individuals in whom every genus accounted for <50% of the microbiota were identified as the diverse type. (C) Pie chart for the urinary microbial genera according to their median relative abundance. Genera that took up less than 1% of the microbiota are labelled together as 'others'. The outer ring indicates the distribution of microbiota at the phylum level.
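The typing rule described above (a genus defines the type only if it exceeds 50% relative abundance; otherwise the sample is of the diverse type) can be written as a short Python sketch. The function names and the dictionary-based input format are assumptions made for illustration.

```python
from collections import Counter


def community_type(genus_abundance, threshold=0.5):
    """Assign a community type from a {genus: relative abundance} mapping."""
    genus, share = max(genus_abundance.items(), key=lambda item: item[1])
    # The rule is strict: the top genus must exceed the threshold.
    return f"{genus}-dominated" if share > threshold else "diverse"


def type_fractions(samples):
    """Fraction of samples falling into each community type."""
    counts = Counter(community_type(profile) for profile in samples)
    total = len(samples)
    return {name: count / total for name, count in counts.items()}
```

Applied to a genus-level abundance table, `type_fractions` yields the kind of per-type proportions reported for the cohort.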

Cultivation of live bacteria from transurethral catheterized urine
Whether the bacterial DNA signals in urine samples originate from live bacteria or from fragments has been a subject of much debate [22]. To demonstrate the utility of the data for addressing this question, we performed a validation study using live bacteria cultures from urine samples provided by an additional cohort of 10 women.
Gigabyte, 2020, DOI: 10.46471/gigabyte.9
We tried to culture and isolate bacterial colonies from freshly collected urine samples.

Urine samples were serially diluted, spread on three different kinds of agar plates, and incubated under both aerobic and anaerobic conditions. Six different positive isolates belonging to 5 genera (Lactobacillus, Staphylococcus, Clostridium, Enterococcus, and Propionibacterium) were obtained from 3 out of 10 subjects (Table 2). These 5 genera were also among the dominant genera in our 16S rRNA gene amplicon sequencing data and were consistent with previously published cultivation results [23-26] (Table 2). Reassuringly, no isolates were detected from the negative controls (sterile saline and ultrapure water).
Isolates obtained by conventional culturing methods therefore verified the existence of live bacteria in the urine.

Considerable bacterial biomass revealed by qPCR
To provide additional evidence of the bacterial communities in the urine, a species-specific quantitative real-time PCR method was used to quantify the four common vaginal Lactobacillus species, i.e. L. crispatus, L. iners, L. jensenii and L. gasseri (QPCR Lactobacillus.csv [8]). The Lactobacillus species we examined presented a similar distribution and abundance along the female reproductive tract, and the corresponding urinary Lactobacillus abundance fell between those of the upper and lower reproductive tracts (Figure 3A).
Among them, L. iners occurred most frequently (59%) in the urine samples, while L. crispatus occurred in only 26% of the women sampled (Figure 3B). L. iners has been reported to be far less protective against bacterial and viral infections than L. crispatus [28]. At least one of these four Lactobacillus species was detected in 80% of the cohort (Figure 3B).
The occurrence rate of Lactobacillus at the genus level in the 16S rRNA gene amplicon sequencing data was 94% (Figure 2A). The total bacterial biomass was approximated as the ratio of the qPCR copy number to the relative abundance obtained by 16S rRNA gene sequencing of the same sample (QPCR bacterial_biomass.csv [8]).
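The biomass approximation above is a single division, sketched here in Python. The sketch assumes that the qPCR copy number and the 16S relative abundance refer to the same taxon in the same sample, and that the relative abundance is expressed as a fraction between 0 and 1; the function name is hypothetical, and the units follow whatever units the qPCR copy number carries.

```python
def estimate_total_biomass(qpcr_copies, relative_abundance):
    """Approximate total bacterial load as qPCR copies / relative abundance.

    Returns None when the taxon was not detected by sequencing, because
    the ratio is undefined for a relative abundance of zero.
    """
    if relative_abundance <= 0:
        return None
    return qpcr_copies / relative_abundance
```

For example, a taxon measured at 2.0e4 copies that makes up a quarter of the sequenced community implies a total load four times as large. The estimate becomes unstable when the relative abundance is very small, since sequencing noise is then amplified by the division.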

Intra-individual similarity in the urine-reproductive tract microbiota
To further assess the relationship between the microbiota of the urine and those of the six positions of the female reproductive tract, we computed the intra-individual correlation between the microbial profile of the urine and the profiles found at different sites of the reproductive tract, and then clustered the individuals into 4 groups (Spearman's correlation coefficient, Figure 4A, relative_abundance_correlation.csv [8]). Interestingly, group 3, which accounted for 41% of the cohort, showed a significant correlation between the urine samples and the female reproductive tract samples, with the coefficient increasing gradually along the anatomical sites from CL to CV, ET, and PF (Figure 4B). In contrast, group 1 (9% of the women) presented the reverse trend. In group 2 (22%) and group 4 (27%), there appeared to be only a weak relationship between the microbiota of the urine and those of the female reproductive tract.
Taken together, the distribution of microbiota in urine was most similar to that in CV/ET (Figure 4A). The principal coordinate analyses (PCoA) of the weighted and unweighted intra-individual UniFrac distances further corroborated our conclusion that there is an intra-individual similarity of the microbiota between the urine and the upper sites of the female reproductive tract, especially the junction sites (CV and ET) (Figure 5).
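The intra-individual correlation described in this section can be sketched in plain Python as follows. This is a minimal illustration: it assumes that each profile is a list of relative abundances over the same ordered set of taxa, it does not perform the subsequent clustering into groups, and the function names are not taken from the published workflow.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        shared = (pos + end) / 2 + 1  # mean of 1-based ranks pos+1 .. end+1
        for k in range(pos, end + 1):
            ranks[order[k]] = shared
        pos = end + 1
    return ranks


def spearman(x, y):
    """Spearman's rank correlation (Pearson correlation of the ranks)."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        return float("nan")  # undefined when one profile is constant
    return cov / (var_x * var_y) ** 0.5


def urine_site_correlations(urine_profile, site_profiles):
    """Correlate one woman's urine profile with each of her sampled sites."""
    return {site: spearman(urine_profile, profile)
            for site, profile in site_profiles.items()}
```

Calling `urine_site_correlations` once per individual gives a vector of coefficients across sites (for example CL, CU, CV, ET, PF), which is the kind of per-woman profile that can then be clustered.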

Lifestyle and clinical factors influencing the urinary microbiota
The human microbiome is dynamic and highly affected by its host environment. Age, menstrual cycle, benign conditions such as adenomyosis, and infertility due to endometriosis have previously been reported to shape the vagino-uterine microbiota [2].
With our comprehensive collection of demographic and baseline clinical characteristics from women of reproductive age (sample_metadata.csv [8]), such variations in the urinary microbiota can be explored in this dataset. Urinary microbial composition was significantly associated with several of these factors, including age, surgical history, abortion, vaginal deliveries, having given birth (multipara vs. nullipara), infertility due to endometriosis, and hysteromyoma (PERMANOVA, P < 0.05, q < 0.05, Table 3). Although the urinary microbiota also correlated with some other factors, such as menstrual phase, […] (Table 3). The initial results here indicate a close link between the urinary microbiota and general and diseased physiological conditions, and this link could be further understood by exploring these data more deeply.

Potential uses
As a large-scale cohort for studying the female urinary microbiota, our data provide a useful baseline and reference dataset for women of reproductive age. We also explored the association between the composition of the urinary microbiota and that of the female reproductive tract microbiota. It is worth noting that a higher intra-individual compositional similarity was observed between the microbiota of the urine and those of the cervical canal/uterus than between the microbiota of the urine and those of the vagina. This finding indicates that sampling of midstream urine (the least invasive and easiest approach) could potentially be used to survey the micro-environment of the cervical canal and uterus in the general population. This is relevant to the demonstrated associations between the urinary microbiota and various uterine-related diseases, such as hysteromyoma and infertility due to endometriosis. Our data provide a reference for clinical diagnosis and warrant further detailed exploration.

There are three limitations to this study. Firstly, as it was not possible to directly sample the upper reproductive tract of perfectly healthy women, we included women who underwent minimally invasive laparoscopy or laparotomy for conditions that are not known to involve infection. This was the best proxy for sampling the upper reproductive tract in healthy women. Nevertheless, how well the urinary microbiota of women in our cohort represent those of healthy women requires further comparison. Secondly, given the low bacterial biomass of urine samples, a more comprehensive sampling process should be considered in subsequent studies, such as disinfecting the urethra and vulvovaginal region with 75% alcohol before urine self-collection, including a sample of sterile saline with the self-collection kit as a negative control, and asking participants to fill another vial with it immediately following urine collection. Thirdly, a comparison of the microbial composition between catheter-collected and self-collected specimens from the same individual requires further inspection. Together, we hope that this dataset helps promote a new round of accelerated discoveries, including a novel scientific explanation for uterine-related diseases via longitudinal studies on the microbiota of the urinary and reproductive tracts.

DECLARATIONS

ETHICS APPROVAL AND CONSENT TO PARTICIPATE
The study was approved by the Institutional Review Board of BGI-Shenzhen (No. BGI-IRB 17219) and Peking University Shenzhen Hospital (Version 1.0.20140301). All participants gave written informed consent prior to their recruitment into the study.

DATA AVAILABILITY
The sequence reads generated by 16S rRNA gene amplicon sequencing have been deposited in both the European Nucleotide Archive with the accession number PRJEB29341 and the CNSA (https://db.cngb.org/cnsa/) of the CNGB database with accession code CNP0000166. Additional data, results, and a STORMS (Strengthening The Organizing and Reporting of Microbiome Studies) checklist are available from the GigaScience GigaDB repository [8]. The sequences of bacterial isolates have been deposited in the European Nucleotide Archive with the accession number PRJEB36743.