Published online : 27 November 2020
Data Release
The female urinary microbiota in relation to the reproductive tract microbiota
Cite this article as... 

Chen Chen, Lilan Hao, Weixia Wei, Fei Li, Liju Song, Xiaowei Zhang, Juanjuan Dai, Zhuye Jie, Jiandong Li, Xiaolei Song, Zirong Wang, Zhe Zhang, Liping Zeng, Hui Du, Huiru Tang, Tao Zhang, Huanming Yang, Jian Wang, Susanne Brix, Karsten Kristiansen, Xun Xu, Ruifang Wu, Huijue Jia, The female urinary microbiota in relation to the reproductive tract microbiota, Gigabyte, 1, 2020


Human urine is traditionally considered to be sterile, and whether the urine harbours distinct microbial communities has been a matter of debate. Potential links between female urine and reproductive tract microbial communities are currently unclear. Here, we collected urine samples from 147 Chinese women of reproductive age and explored the nature of colonization by 16S rRNA gene amplicon sequencing, quantitative real-time PCR, and live bacteria culture. To demonstrate the utility of this approach, the intra-individual Spearman’s correlation was used to explore the relationship between urine and multiple sites of the female reproductive tract. PERMANOVA was also performed to explore potential correlations between lifestyle and various clinical factors and urinary bacterial communities. Our data demonstrated distinct bacterial communities in urine, indicative of a non-sterile environment. Streptococcus-dominated, Lactobacillus-dominated, and diverse types were the three most common urinary bacterial community types in the cohort. Detailed comparison of the urinary microbiota with the microbiota at multiple sites of the female reproductive tract demonstrated that the urinary microbiota were more similar to the microbiota in the cervix and uterine cavity than to those of the vagina in the same women. Our data demonstrate the potential connectivity among microbiota in the female urogenital system and provide insight and resources for exploring diseases of the urethra and genital tract.

GigaScience Press
Sha Tin, New Territories, Hong Kong SAR
Data Description
Purpose of data acquisition
The role of microbiota in the vaginal environment has received a lot of attention over the past decade, while the female upper reproductive tract was traditionally believed to be sterile and mostly studied in the context of infections or incontinence [1]. Despite continued controversy, the presence of microorganisms beyond the cervix (i.e. the female upper reproductive tract) is increasingly recognized even in non-infectious conditions [2]. As with the female upper reproductive tract, the hypothesis that urine is sterile has been overturned by emerging evidence from culturing and sequencing approaches that indicates the existence of microorganisms in the urinary tract [3, 4]. A recent study using an expanded quantitative urine culture in combination with whole-genome sequencing isolated and sequenced the genomes of 149 bacterial strains from catheterized urine of both symptomatic and asymptomatic peri-menopausal women [5]. It also showed highly similar strains of commensal bacteria in both the bladder and vagina of the same individual [5]. Another study analysed the urinary microbiota of 189 individuals using 16S rRNA gene amplicon sequencing and suggested that the urethra and bladder can harbour microbial communities distinct from the vagina [6]. However, the relationship between the female urine microbiota and the upper reproductive tract microbiota has so far not been studied.
Here, we present a dataset of the urinary microbiota for a relatively large cohort of 147 women of reproductive age. Together with our recently published study of peritoneal fluid, uterine, and vaginal samples from the same individuals [2], these data show that although urinary microbiota contain larger populations of Lactobacillus and Streptococcus, they are more similar to the microbiota of the cervix and uterine cavity, in accordance with the anatomical opening of the bladder. Using the accompanying wealth of metadata, we demonstrate that these data are useful for exploring the potential of the urinary microbiota for clinical diagnosis.
A protocol collection including methods for DNA extraction, bioinformatics analysis and quantitative real-time PCR is available (Figure 1) [7].
Sample collection
In this study, a total of 147 women of reproductive age (22–48 years) were recruited by Peking University Shenzhen Hospital [8]. All participants underwent hysteroscopy and/or laparoscopy for conditions without infections, such as hysteromyoma, adenomyosis, endometriosis, or salpingemphraxis. Subjects with other related diseases, such as vaginal inflammation, severe pelvic adhesion, endocrine or autoimmune disorders, were excluded. Women who were pregnant, breastfeeding, or menstruating at the time of sampling were also excluded. None of the subjects received any antibiotic treatments or vaginal medications within two weeks of sampling. In addition, no cervical treatment was performed within the previous 7 days, no vaginal douching was performed within 5 days, and no sexual activity took place within at least 2 days.
A total of 137 morning mid-stream urine samples were self-collected between December 2013 and July 2014 prior to surgery (sample_metadata.csv [8]) and stored at −80 °C until they were transported on dry ice to BGI-Shenzhen for sequencing. Samples from an additional 10 women were collected by a doctor during surgery in July 2017 for validation purposes. For each operation, a urine catheter was inserted into the disinfected urethra to collect mid-stream urine. For each urine sample collected through a catheter, an identical volume of saline solution was used as the control sample. The samples were then placed at 4 °C, transported to BGI-Shenzhen, and processed within 6 hours. A portion of each sample was used for culturing live bacteria and the rest was used for sequencing.
Figure 1.
Protocol collection for sequencing and analysing female urinary microbiota.
DNA extraction and 16S rRNA amplicon sequencing
Genomic DNA extraction was carried out following a published protocol [9]. The primers 515F and 907R were used for PCR amplification of the hypervariable regions V4-V5 of the bacterial 16S rRNA gene. The 907R primer includes a unique barcoded fusion. The primer sequences were: 515F: 5′-GTGCCAGCMGCCGCGGTAA-3′ and 907R: 5′-CCGTCAATTCMTTTRAGT-3′, where M denotes A or C and R denotes a purine (A or G). The conditions for PCR amplification were: 3 min of denaturation at 94 °C, followed by 25 cycles of 45 s at 94 °C (denaturing), 60 s at 50 °C (annealing), and 90 s at 72 °C (elongation), followed by a final elongation for 10 min at 72 °C. The amplification products were purified with the AxyPrep™ Mag PCR Clean-Up Kit (Axygen, USA). The amplicon libraries were constructed with an Ion Plus Fragment Library Kit (Thermo Fisher Scientific Inc.) [10], then sequenced on the Ion PGM™ Sequencer with the Ion 318™ Chip v2 with a read length of 400 bp (Thermo Fisher Scientific Inc., Ion PGM™ Hi-Q™ OT2 Kit, Cat.No: A27739; Ion PGM™ Hi-Q™ Sequencing Kit, Cat.No: A25592) [11]. All experiments were performed in the laboratory of BGI-Shenzhen.
Processing of sequencing reads
The raw sequencing reads were first processed with Mothur (Mothur, RRID:SCR_011947; V1.33.3) [12] to filter out low-quality reads meeting any of the following criteria: (1) reads shorter than 200 bp; (2) reads that did not match the degenerate PCR primers when allowing up to two errors; (3) reads with an average quality score less than 25. A total of 8,812,607 reads, with an average of 57,225 reads per sample (a minimum of 1113 reads and a maximum of 194,564 reads) were obtained. Subsequently, sequences with identity greater than 97% were clustered into Operational Taxonomic Units (OTUs) using the QIIME (QIIME, RRID:SCR_008249; V1.8.0) uclust programme [13], and each cluster was treated as representing a species. The seed sequences of each OTU were aligned against the Greengenes reference sequences (gg_13_8_otus) for annotation using Mothur. The detailed analysis workflow was deposited in [14].
We also calculated the UniFrac distance using QIIME based on taxonomic abundance profiles at the OTU level [11].
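The three read-filtering criteria above can be sketched in a few lines of Python. This is only an illustration, not the Mothur implementation used in the study: it assumes that each read starts with the 515F primer (barcode already removed), that "errors" means substitutions only, and that a read is discarded when it has more than two primer mismatches. The function names `primer_mismatches` and `passes_filter` are hypothetical.

```python
# IUPAC codes needed for the 515F/907R primers quoted in the text.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "M": "AC", "R": "AG"}

FORWARD_PRIMER = "GTGCCAGCMGCCGCGGTAA"  # 515F, as given in the text


def primer_mismatches(read, primer):
    """Count substitutions between the start of the read and a degenerate primer."""
    if len(read) < len(primer):
        return len(primer)  # too short to contain the primer at all
    return sum(base not in IUPAC.get(code, code)
               for base, code in zip(read, primer))


def passes_filter(read, quals, primer=FORWARD_PRIMER,
                  min_len=200, max_primer_errors=2, min_mean_q=25.0):
    """Apply the three criteria: length, primer match, mean quality score."""
    if len(read) < min_len:
        return False  # criterion 1: shorter than 200 bp
    if primer_mismatches(read, primer) > max_primer_errors:
        return False  # criterion 2: primer not matched within two errors
    return sum(quals) / len(quals) >= min_mean_q  # criterion 3


if __name__ == "__main__":
    read = FORWARD_PRIMER.replace("M", "A") + "ACGT" * 50
    print(passes_filter(read, [30] * len(read)))
```

Indels within the primer, reverse-primer matching, and barcode handling are deliberately left out of this sketch.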
PERMANOVA on the influence of phenotypes
Permutational multivariate analysis of variance (PERMANOVA) was used to assess the effect of different covariates on the relative abundances of OTUs in the samples [15, 16]. The analysis used Bray-Curtis and UniFrac distances with 9999 permutations, as implemented in the vegan package (vegan, RRID:SCR_011950) in R [16, 17].
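To make the PERMANOVA procedure concrete, the following is a minimal pure-Python sketch of a one-factor test on Bray-Curtis distances. It is not the vegan code used in the study: it handles a single grouping factor, omits UniFrac (which needs a phylogenetic tree), and assumes every grouping leaves a non-zero within-group sum of squares. The function names and the toy profiles in the usage example are illustrative.

```python
import random
from itertools import combinations


def bray_curtis(x, y):
    """Bray-Curtis dissimilarity between two abundance vectors."""
    num = sum(abs(a - b) for a, b in zip(x, y))
    den = sum(a + b for a, b in zip(x, y))
    return num / den if den else 0.0


def pseudo_f(dist, labels):
    """Return (pseudo-F, R2) for a distance matrix and one grouping factor."""
    n = len(labels)
    groups = {}
    for idx, lab in enumerate(labels):
        groups.setdefault(lab, []).append(idx)
    ss_total = sum(dist[i][j] ** 2 for i, j in combinations(range(n), 2)) / n
    ss_within = sum(
        sum(dist[i][j] ** 2 for i, j in combinations(members, 2)) / len(members)
        for members in groups.values()
    )
    ss_among = ss_total - ss_within
    a = len(groups)
    f_stat = (ss_among / (a - 1)) / (ss_within / (n - a))
    return f_stat, ss_among / ss_total


def permanova(profiles, labels, permutations=9999, seed=0):
    """One-factor PERMANOVA; returns (R2, permutation p-value)."""
    n = len(profiles)
    dist = [[bray_curtis(profiles[i], profiles[j]) for j in range(n)]
            for i in range(n)]
    f_obs, r2 = pseudo_f(dist, labels)
    rng = random.Random(seed)
    shuffled = list(labels)
    hits = 0
    for _ in range(permutations):
        rng.shuffle(shuffled)  # permute group labels, keep distances fixed
        if pseudo_f(dist, shuffled)[0] >= f_obs - 1e-12:
            hits += 1
    return r2, (hits + 1) / (permutations + 1)


if __name__ == "__main__":
    toy = [[8, 2, 0], [7, 3, 0], [9, 1, 0], [1, 1, 8], [0, 2, 8], [1, 2, 7]]
    print(permanova(toy, ["a", "a", "a", "b", "b", "b"], permutations=999))
```

The p-value counts the observed labelling as one of the permutations, hence the `+ 1` in numerator and denominator.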
Quantitative real-time PCR
We quantified four Lactobacillus species, L. iners, L. jensenii, L. crispatus and L. gasseri, using a modified qPCR protocol [18]. SYBR Premix Ex Taq GC (TAKARA) was used and the reactions were run on a StepOnePlus Real-time PCR System (Life Technologies). Each PCR reaction mixture contained 10 μl of 2×SYBR Premix Ex Taq GC, 0.2 μM forward primer, 0.2 μM reverse primer, 1.6 μl of DNA sample, and 8.2 μl of ultrapure water to make up the final reaction volume of 20 μl. Each run included a standard curve and all samples were amplified in triplicate. Ultrapure water was used as the blank control template.
To construct the standard curves, the sequencing-confirmed plasmids of the four species were used after quantification with a Qubit Fluorometer and serial 10-fold dilutions. The amplification efficiency was (100 ± 10)% and linearity values were all ≥0.99. The detailed procedure was deposited in [19].
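As a worked illustration of how a standard curve yields the efficiency and linearity values quoted above, the sketch below fits quantification cycle (Cq) against log10 copy number by least squares and applies the common relation efficiency = 10^(−1/slope) − 1. The study does not state which formula or software was used, so this relation, the function names, and the example dilution series are assumptions.

```python
import math


def standard_curve(copies, cq):
    """Fit Cq = slope * log10(copies) + intercept.

    Returns (slope, intercept, r_squared, efficiency_percent).
    """
    x = [math.log10(c) for c in copies]
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(cq) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in cq)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, cq))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = sxy ** 2 / (sxx * syy)
    efficiency = (10 ** (-1 / slope) - 1) * 100
    return slope, intercept, r_squared, efficiency


def copies_from_cq(cq_value, slope, intercept):
    """Invert the standard curve to estimate copies in an unknown sample."""
    return 10 ** ((cq_value - intercept) / slope)


if __name__ == "__main__":
    # Hypothetical 10-fold dilution series with perfect doubling per cycle.
    dilutions = [1e7, 1e6, 1e5, 1e4, 1e3]
    ideal_slope = -1 / math.log10(2)
    cqs = [38.0 + ideal_slope * math.log10(c) for c in dilutions]
    slope, intercept, r2, eff = standard_curve(dilutions, cqs)
    print(round(eff, 1), round(r2, 3))
```

Under these assumptions, a run would meet the stated acceptance limits when the efficiency falls between 90% and 110% and the R² is at least 0.99.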
Bacterial culturing
The urine samples and controls from the 10 additional subjects were cultivated in the laboratory by spreading 100 μl of sample on different agars containing 5% horse blood, such as PYG agar (DSMZ 104 medium), BHI agar, and EG agar. The plates were incubated in both aerobic and anaerobic conditions at 37 °C for 72 hours. To keep the medium anaerobic during culture, resazurin and cysteine-HCl were added as reducing agents. The genomic DNA of the isolates was extracted with the Bacterial DNA Kit (OMEGA) and then underwent 16S rRNA gene amplification using the universal primers 27F/1492R [20]. The amplicons were purified and sent for Sanger sequencing. The generated sequences were then submitted to BLAST on EzBioCloud [21] for identification.
Preliminary analysis and validation
Microbiota composition of the urine
Table 1
Sequencing and annotation of the 137 samples from the exploratory cohort.
Columns: Sample name; Sequencing amount (# raw reads, # clean reads, # filtered reads); % of reads annotated to taxa (genus, species); Archive accession number
C001UR52506113269347100.00%75.51% SAMEA5042945
C002UR55955145295930100.00%65.36% SAMEA5042987
C003UR613672118116831100.00%91.06% SAMEA5043040
C004UR54585185063325100.00%41.59% SAMEA5042979
C005UR521772285620683100.00%92.62% SAMEA5043003
C007UR50766154688737100.00%71.10% SAMEA5043001
C008UR53748140626169100.00%64.05% SAMEA5043004
C009UR533831224711327100.00%96.97% SAMEA5043046
C011UR478141305811292100.00%66.76% SAMEA5042941
C012UR55279169237484100.00%55.46% SAMEA5042938
C014UR55713151758818100.00%64.31% SAMEA5043060
C016UR733722205417669100.00%57.04% SAMEA5043009
C018UR691422650523581100.00%12.12% SAMEA5043006
C019UR722491786814440100.00%44.63% SAMEA5043054
C020UR54574193915452100.00%63.83% SAMEA5042942
C021UR581181729412123100.00%55.33% SAMEA5042998
C023UR474761845216795100.00%16.09% SAMEA5042947
C026UR46583167413267100.00%83.96% SAMEA5042969
C028UR882452695519268100.00%19.93% SAMEA5042984
C033UR904313199826496100.00%66.99% SAMEA5043062
C035UR637732704424115100.00%97.31% SAMEA5043037
C038UR55562101659208100.00%84.56% SAMEA5042972
C039UR779571889115748100.00%33.38% SAMEA5042963
C040UR58940125555438100.00%69.49% SAMEA5043021
C041UR60028153619366100.00%78.38% SAMEA5043000
C042UR740861440211088100.00%67.53% SAMEA5042955
C043UR741462369118730100.00%60.19% SAMEA5043032
C045UR612491780110367100.00%2.96% SAMEA5043048
C047UR47742119403506100.00%54.25% SAMEA5043024
C048UR355501100816100.00%62.75% SAMEA5042931
C050UR5156518902290100.00%72.76% SAMEA5042936
C051UR58783104038234100.00%50.77% SAMEA5042983
C053UR32311165326100.00%73.08% SAMEA5043035
C055UR45054133266184100.00%56.40% SAMEA5043016
C056UR69173246528282100.00%86.78% SAMEA5043023
C057UR644172703324444100.00%98.11% SAMEA5043059
C058UR420891415912100.00%4.28% SAMEA5042935
C059UR5364212618577100.00%74.70% SAMEA5042930
C060UR739302211019192100.00%20.17% SAMEA5043008
C062UR632201993217112100.00%79.58% SAMEA5043012
C063UR449741201790100.00%53.42% SAMEA5043039
C064UR63505150517134100.00%82.38% SAMEA5042981
C065UR538841509413794100.00%52.97% SAMEA5043027
C066UR632691615712090100.00%45.86% SAMEA5042985
C067UR55812190472481100.00%86.86% SAMEA5042986
C068UR543961745615352100.00%86.72% SAMEA5042937
T000UR57607119959166100.00%37.51% SAMEA5043045
T001UR47924134742849100.00%48.58% SAMEA5043014
T002UR63839183813623100.00%49.71% SAMEA5042975
T003UR70242191665255100.00%51.67% SAMEA5042988
T004UR67280205782947100.00%57.24% SAMEA5042943
T005UR52820128684931100.00%57.92% SAMEA5043019
T006UR794091947213710100.00%19.58% SAMEA5043017
T007UR341731403797100.00%50.06% SAMEA5042999
T008UR3007413461044100.00%84.77% SAMEA5043031
T009UR58440103867936100.00%81.92% SAMEA5042950
T010UR65382171919801100.00%47.54% SAMEA5042967
T011UR385501163464100.00%65.52% SAMEA5043061
T012UR75848219564605100.00%52.62% SAMEA5043049
T013UR268721383203100.00%74.38% SAMEA5042996
T014UR232981741518100.00%93.44% SAMEA5043042
T015UR4065320521470100.00%29.25% SAMEA5042953
T016UR58448162613993100.00%58.75% SAMEA5043053
T017UR58703192701929100.00%54.12% SAMEA5042990
T018UR54726126687152100.00%17.18% SAMEA5042970
T019UR67711141531606100.00%58.90% SAMEA5042954
T020UR899362257917919100.00%68.74% SAMEA5043052
T021UR66094147615276100.00%29.66% SAMEA5042951
T022UR28712803422100.00%74.88% SAMEA5042940
T023UR277381338385100.00%30.65% SAMEA5042961
T024UR1934582455100.00%90.91% SAMEA5042948
T025UR2973915781214100.00%76.85% SAMEA5042995
T026UR79923176066269100.00%37.09% SAMEA5042959
T027UR61145100936169100.00%59.51% SAMEA5042949
T028UR717551952916118100.00%7.61% SAMEA5042965
T029UR58776115058300100.00%6.04% SAMEA5043018
T030UR5709870005119100.00%70.23% SAMEA5042966
T031UR49283161131636100.00%38.63% SAMEA5043063
T032UR46822150411187100.00%53.24% SAMEA5042927
T033UR630441427210097100.00%80.89% SAMEA5043043
T035UR50618124031122100.00%76.20% SAMEA5043002
T036UR787812249217075100.00%84.36% SAMEA5042982
T038UR737521523711784100.00%1.14% SAMEA5043022
T039UR589042228619836100.00%95.82% SAMEA5043015
T040UR77039153328059100.00%40.76% SAMEA5043028
T041UR583821373511893100.00%97.26% SAMEA5043056
T042UR53948171121392100.00%29.74% SAMEA5043026
T043UR726621544610584100.00%16.18% SAMEA5042956
T044UR5981819779724100.00%73.48% SAMEA5042993
T045UR636272143819282100.00%39.34% SAMEA5043033
T046UR5814220606912100.00%56.69% SAMEA5042991
T047UR2419073626100.00%73.08% SAMEA5042978
T048UR1025544153100.00%79.25% SAMEA5043036
T049UR636402252020236100.00%98.73% SAMEA5043010
T051UR223221066101100.00%77.23% SAMEA5042997
T052UR57909117576419100.00%67.85% SAMEA5043058
T053UR636372457421178100.00%99.17% SAMEA5043034
T054UR591941898617148100.00%15.87% SAMEA5042929
T055UR70744149832724100.00%84.07% SAMEA5042962
T056UR5887619486846100.00%65.84% SAMEA5043051
T057UR535981388912456100.00%96.76% SAMEA5042939
T058UR1830273945100.00%64.44% SAMEA5043044
T059UR599741554212378100.00%23.53% SAMEA5042964
T060UR21736500203100.00%89.16% SAMEA5042946
T061UR330021153503100.00%56.46% SAMEA5043011
T062UR64983105196302100.00%52.19% SAMEA5042933
T063UR53347127934023100.00%40.00% SAMEA5043041
T064UR681222366521094100.00%79.99% SAMEA5042934
T065UR51210170701242100.00%65.30% SAMEA5042977
T066UR64589265321911100.00%56.04% SAMEA5042968
T067UR67938162484974100.00%50.12% SAMEA5043005
T068UR70192288903698100.00%40.37% SAMEA5043013
T069UR605642123617683100.00%43.97% SAMEA5042957
T070UR83453207555034100.00%44.18% SAMEA5043038
T071UR800773677029224100.00%97.68% SAMEA5043007
T072UR734692967119787100.00%87.68% SAMEA5042992
T073UR73167175773914100.00%54.09% SAMEA5042989
T074UR590842390621347100.00%89.11% SAMEA5042973
T075UR602631772615250100.00%70.37% SAMEA5042976
T076UR37428809514100.00%63.42% SAMEA5042960
T078UR76834170344220100.00%66.75% SAMEA5042958
T080UR1217260961100.00%52.46% SAMEA5042932
T081UR63432148416915100.00%86.49% SAMEA5043029
T082UR26941693609100.00%2.96% SAMEA5043047
T083UR691493430730270100.00%95.45% SAMEA5043055
T084UR593042586322866100.00%89.34% SAMEA5042928
T085UR65565204261344100.00%78.57% SAMEA5043030
T086UR666052382821243100.00%95.16% SAMEA5043025
T087UR62480166566414100.00%76.55% SAMEA5043057
T088UR82733325383794100.00%71.09% SAMEA5042994
T089UR1102272722311761100.00%24.49% SAMEA5042944
T090UR70526292961917100.00%71.62% SAMEA5043020
T091UR27973913739100.00%27.74% SAMEA5042952
T092UR69694108257894100.00%55.26% SAMEA5043050
T093UR58492146567272100.00%84.27% SAMEA5042980
T094UR1945645926835224100.00%37.26% SAMEA5042971
T095UR426811009560100.00%45.00% SAMEA5042974
To explore the urinary microbiota in this dataset, morning midstream urine (UR) was self-collected prior to surgery from an exploratory cohort of 137 Chinese women recruited for the study (median age 31.6, range 22–48). As in our previous vagino-uterine microbiota study [2], all volunteers had conditions that were not known to involve infections [8]. From 95 women in the cohort, six locations within the female reproductive tract were also sampled: the lower third of the vagina (CL), the posterior fornix (CU), cervical mucus (CV), endometrium (ET), the left and right fallopian tubes (FLL and FRL), and peritoneal fluid (PF). Their vagino-uterine microbiota data have been published previously [2]. After 16S rRNA gene amplicon sequencing, the sequencing reads were pre-processed for quality control and filtering, then clustered into OTUs (Methods, Table 1 and OTU_table_urine.biom.hdf5 [8]).
Because of the anatomy involved, voided urine samples from women are considered to be easily contaminated by microbiota from the surrounding vulvovaginal region [22]. Most vaginal communities (88%) in this cohort were dominated by a single genus with >50% relative abundance within an individual. In contrast, the urinary microbiota in this study showed more heterogeneity. 56.93% of the cohort harboured a diverse type with substantial representation of genera including Streptococcus, Lactobacillus, Pseudomonas, Staphylococcus, Acinetobacter, and Vagococcus, though none of these genera was dominant, i.e. reached >50% relative abundance (Figure 2). In addition, 22.63% of the women harboured >50% Streptococcus, and 13.87% of the women harboured >50% Lactobacillus (Figure 2A, B). Rare subtypes such as Enterococcus (2.19%), Bifidobacteriaceae (1.46%), Prevotella (0.73%), Enterobacteriaceae (0.73%), Coriobacteriaceae (0.73%), and Veillonella (0.73%) were also detected in this cohort (Figure 2A, B). Notably, the median relative abundances of Lactobacillus, Pseudomonas, and Acinetobacter in the urine samples were more similar to those in the uterus samples (Figure 2C) [2]. At the phylum level, urinary microbiota were dominated by Firmicutes and Proteobacteria (Figure 2C).
Figure 2.
Urinary microbiota of the initial cohort of 137 Chinese reproductive-age women. (A) The relative abundances of genera detected in each individual are shown in the bar chart. The dendrogram is a result of a centroid linkage hierarchical clustering based on Euclidean distances between the microbial composition proportion of urinary bacterial communities. (B) The ratio of different urinary microbiota types. The genus whose relative abundance accounted for >50% in an individual was selected as an identified type. The genera that accounted for <50% of the microbiota in an individual were identified as diverse type. (C) Pie chart for the urinary microbial genera according to their median relative abundance. Genera that took up less than 1% of the microbiota are labelled together as ‘others’. The outer ring indicates the distribution of microbiota at the phylum level.
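The typing rule described in the Figure 2 legend (a genus above 50% relative abundance defines the type, otherwise the sample is "diverse") can be written as a short function. This is a sketch with a hypothetical function name; treating a genus at exactly 50% as "diverse" is an assumption, since the legend only specifies >50% and <50%.

```python
def community_type(genus_abundance, threshold=0.5):
    """Return the dominant genus if it exceeds the threshold, else 'diverse'.

    genus_abundance maps genus name -> read count or relative abundance.
    """
    total = sum(genus_abundance.values())
    if total <= 0:
        raise ValueError("sample has no reads")
    top_genus, top_value = max(genus_abundance.items(), key=lambda kv: kv[1])
    return top_genus if top_value / total > threshold else "diverse"


if __name__ == "__main__":
    # Illustrative counts, not data from the study.
    print(community_type({"Streptococcus": 620, "Lactobacillus": 300, "Pseudomonas": 80}))
    print(community_type({"Streptococcus": 400, "Lactobacillus": 350, "Pseudomonas": 250}))
```

Because the function normalises by the sample total, it accepts either raw counts or relative abundances.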
Cultivation of live bacteria from transurethral catheterized urine
The question of whether bacterial DNA signals have originated from live bacteria or fragments in the urine samples has been a subject of much debate [22]. To demonstrate the utility of the data for addressing this question, we performed a validation study using live bacteria cultures from urine samples provided by an additional cohort of 10 women.
We attempted to culture and isolate bacterial colonies from freshly collected urine samples. Urine samples were serially diluted, spread on three different kinds of agar plates, and incubated under both aerobic and anaerobic conditions. Six different isolates belonging to five genera (Lactobacillus, Streptococcus, Clostridium, Enterococcus, and Propionibacterium) were obtained from 3 of the 10 subjects (Table 2). These five genera were also among the dominant genera in our 16S rRNA gene amplicon sequencing data and were consistent with previously published cultivation results [23–26] (Table 2). Reassuringly, no isolates were obtained from the negative controls (sterile saline and ultrapure water). These data therefore confirm the presence of live bacteria in urine, as isolates could be obtained using conventional culturing methods.
Table 2
Identification of cultured microbial isolates from urine of the 10 additional women by sequencing of partial 16S rRNA gene.
Sample ID | Condition | Medium | 16S rRNA gene-PCR identification | Accessions | Identity (%) | Supported by previous cultivation
S001U | Anaerobic, 37 °C | EG | Clostridium cochlearium | LR761333.1 | 99.26 | Meijer-Severs et al. [24]
S001U | Anaerobic, 37 °C | 104 | Streptococcus sp. (S. tigurinus/S. mitis) | LR761334.1 | 99.72 | Hilt et al. [23]
S003U | Anaerobic, 37 °C | BHI | Enterococcus faecalis | LR761335.1 | 99.91 | Hilt et al. [23], Guzmàn et al. [25], Fraimow et al. [26]
S003U | Anaerobic, 37 °C | 104 | Lactobacillus crispatus | LR761337.1 | 99.82 | Hilt et al. [23]
S003U | Anaerobic, 37 °C | 104 | Propionibacterium granulosum | LR761336.1 | 99.02 | Ormerod et al. [27]
S008U | Anaerobic, 37 °C | 104, BHI, EG | Streptococcus agalactiae | LR761340.1, LR761339.1, LR761338.1 | 99.65, 99.35, 99.52 | Hilt et al. [23]
Considerable bacterial biomass revealed by qPCR
To provide additional evidence of the bacterial communities in the urine, a species-specific quantitative real-time PCR method was used to quantify the four common vaginal Lactobacillus species, i.e. L. crispatus, L. iners, L. jensenii and L. gasseri (QPCR Lactobacillus.csv [8]). The Lactobacillus species we examined presented a similar distribution and abundance along the female reproductive tract, and the corresponding urinary Lactobacillus abundance fell between those of the upper and lower reproductive tracts (Figure 3A). Among them, L. iners occurred most frequently (59%) in the urine samples, while L. crispatus occurred in only 26% of the women sampled (Figure 3B). L. iners has been reported to be far less protective against bacterial and viral infections than L. crispatus [28]. At least one of these four Lactobacillus species was detected in 80% of the cohort (Figure 3B). The occurrence rate of Lactobacillus at the genus level in the 16S rRNA gene amplicon sequencing data was 94% (Figure 2A). The total bacterial biomass was approximated by the ratio of the qPCR copy number to the relative abundance obtained from 16S rRNA gene sequencing of the same sample (QPCR bacterial_biomass.csv [8]). This gave an estimate of 10⁷ copies/sample, placing the urinary bacterial biomass between the vaginal-cervical sites (10¹⁰–10¹¹ copies/sample) and the endometrium (ET) samples (10⁶–10⁷ copies/sample) [2] (Figure 3A), all of which were orders of magnitude above potential background noise [29]. Interestingly, these results were consistent with the weakly acidic pH of urine, in comparison to pH < 4.5 in the vagina and pH ∼ 8 in the peritoneal fluid [30].
Figure 3.
The concentrations of the dominant Lactobacillus species in urine and the reproductive tract. Samples derive from the initial cohort of 137 Chinese reproductive-age women. (A) The abundance of L. iners, L. jensenii, L. crispatus and L. gasseri in different samples, calculated from the qPCR results. Boxes denote the interquartile range (IQR) between the first and third quartiles (25th and 75th percentiles, respectively), and the lines inside the boxes denote the median. The whiskers denote the lowest and highest values within 1.5 times the IQR from the first and third quartiles, respectively. (B) The frequency with which each Lactobacillus species was detected across all urine samples.
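The biomass approximation described in this section (qPCR copy number divided by the 16S-derived relative abundance of the same taxon in the same sample) amounts to a one-line calculation. The sketch below uses a hypothetical function name and illustrative numbers that are not taken from the dataset.

```python
def estimate_total_copies(taxon_copies, taxon_relative_abundance):
    """Approximate total bacterial copies in a sample.

    taxon_copies: qPCR copy number of one taxon in the sample.
    taxon_relative_abundance: fraction (0-1] of 16S reads assigned to that taxon.
    """
    if not 0 < taxon_relative_abundance <= 1:
        raise ValueError("relative abundance must be in (0, 1]")
    return taxon_copies / taxon_relative_abundance


if __name__ == "__main__":
    # Illustrative only: 2e6 Lactobacillus copies making up 20% of reads.
    print(f"{estimate_total_copies(2e6, 0.20):.1e}")
```

The estimate is undefined for samples in which the quantified taxon was not detected by sequencing, which is why a zero relative abundance raises an error here.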
Intra-individual similarity in the urine-reproductive tract microbiota
To further assess the relationship between the microbiota of the urine and the six positions of the female reproductive tract, we computed the intra-individual correlation between the microbial profiles in the urine and those found at different sites of the reproductive tract, and then clustered the individuals into 4 groups (Spearman’s correlation coefficient, Figure 4A, relative_abundance_correlation.csv [8]). Interestingly, the microbiota of group 3, which accounted for 41% of the cohort, showed significant correlation between the urine samples and the female reproductive tract samples, with the coefficient increasing gradually along the anatomical sites from CL to CV, ET, and PF (Figure 4B). In contrast, the 9% of women in group 1 presented the reverse trend. In group 2 (22%) and group 4 (27%), there appeared to be only a weak relationship between the microbiota of the urine and the female reproductive tract. Taken together, we observed the most similar distribution of microbiota between urine and CV/ET (Figure 4A). The principal coordinate analyses (PCoA) of the weighted and unweighted intra-individual UniFrac distances further corroborated our conclusion that there is an intra-individual similarity of the microbiota between the urine and the upper sites of the female reproductive tract, especially the junction sites (CV and ET) (Figure 5).
Figure 4.
Similarity of the urine-reproductive tract microbiota within individuals. (A) Heatmap for the intra-individual Spearman’s correlation coefficient between microbiota identified in the urine and at different sites in the reproductive tract (relative_abundance_correlation.csv [8]). Samples were derived from the 95 Chinese reproductive-age women in the initial cohort who provided both urine and reproductive tract samples. As the number of samples from the fallopian tubes (FLL, FRL) is too small, the correlations between microbiota in the urine and those in the fallopian tubes are not shown. The dendrogram is the result of a centroid linkage hierarchical clustering based on Euclidean distances between the intra-individual Spearman’s correlation coefficients of different body sites. The colored squares illustrate the subtypes found within the urinary microbiome. (B) Spearman’s correlation coefficient between microbiota found in the urine and those from different sites of the reproductive tract. The Wilcoxon rank-sum test was used to assess the differences. Boxes denote the interquartile range (IQR) between the first and third quartiles (25th and 75th percentiles, respectively), and the lines inside the boxes denote the median. The whiskers denote the lowest and highest values within 1.5 times the IQR from the first and third quartiles, respectively. An asterisk denotes p <0.05, two asterisks denote p <0.01, three asterisks denote p <0.001.
Figure 5.
PCoA on the samples based on Unweighted-UniFrac (A) and Weighted-UniFrac (B) distances. Samples were taken from UR, CL, CU, and CV before operation, and from ET and PF during operation. Samples were derived from the initial cohort of 137 Chinese reproductive-age women. Each dot represents one sample (n =94 CL, 95 CU, 95 CV, 80 ET, 93 PF, 9 FLL, 10 FRL, and 137 UR).
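The intra-individual Spearman’s correlation used in this section can be illustrated with a small pure-Python sketch: for one woman, the urine profile is correlated with the profile at each reproductive tract site over the same ordered list of taxa. The function names, the tie handling by average ranks, and the toy profiles are assumptions made for illustration and are not taken from the study’s scripts.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        mean_rank = (pos + end) / 2 + 1
        for k in range(pos, end + 1):
            ranks[order[k]] = mean_rank
        pos = end + 1
    return ranks


def spearman(x, y):
    """Spearman's rho: Pearson correlation of the rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        return float("nan")  # undefined when one profile is constant
    return cov / math.sqrt(var_x * var_y)


def urine_vs_sites(profiles):
    """profiles: {site: abundances in a shared taxon order} for one woman."""
    urine = profiles["UR"]
    return {site: spearman(urine, vec)
            for site, vec in profiles.items() if site != "UR"}


if __name__ == "__main__":
    # Toy relative abundances over three taxa, for illustration only.
    one_woman = {"UR": [0.5, 0.3, 0.2], "CV": [0.6, 0.3, 0.1], "CL": [0.1, 0.2, 0.7]}
    print(urine_vs_sites(one_woman))
```

Stacking these per-woman dictionaries row by row gives the kind of individual-by-site matrix that a heatmap and hierarchical clustering, as in Figure 4A, would operate on.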
Lifestyle and clinical factors influencing the urinary microbiota
The human microbiome is dynamic and highly affected by its host environment. Age, menstrual cycle, benign conditions such as adenomyosis, and infertility due to endometriosis have previously been reported to shape the vagino-uterine microbiota [2]. With our comprehensive collection of demographic and baseline clinical characteristics from women of reproductive age (sample_metadata.csv [8]), such variations in the urinary microbiota can be explored in this dataset. Urinary microbial composition was significantly associated with several factors, including age, surgical history, abortion, vaginal deliveries, having given birth (multipara vs. nullipara), infertility due to endometriosis, and hysteromyoma (PERMANOVA, P <0.05, q <0.05, Table 3). Although the urinary microbiota also correlated with some other factors, such as menstrual phase, contraception, endometriosis, pelvic adhesiolysis, and anemia, statistical significance was not achieved after controlling for multiple testing (PERMANOVA, P <0.05 but q >0.05, Table 3). These initial results indicate a close link between the urinary microbiota and both general and diseased physiological conditions, a link that could be better understood by exploring these data more deeply.
Table 3
PERMANOVA for the influence of phenotypes on the urinary microbiota.
Phenotype | R² | P value | FDR | R² | P value | FDR | R² | P value | FDR
Age-2 groups | 0.013 | 0.050 | 0.236 | 0.011 | 0.116 | 0.429 | 0.016 | 0.042 | 0.452
Age-3 groups | 0.032 | 0.026 | 0.150 | 0.025 | 0.228 | 0.577 | 0.030 | 0.135 | 0.539
Frequent colds | 0.011 | 0.080 | 0.270 | 0.011 | 0.108 | 0.429 | 0.021 | 0.017 | 0.419
Surgical history | 0.018 | 0.005 | 0.049 | 0.018 | 0.006 | 0.172 | 0.034 | 0.001 | 0.091
Abdominal surgical history | 0.010 | 0.187 | 0.418 | 0.007 | 0.466 | 0.755 | 0.019 | 0.030 | 0.419
Menstrual cycle | 0.009 | 0.200 | 0.421 | 0.018 | 0.005 | 0.172 | 0.015 | 0.065 | 0.455
Menstrual phase (lower) | 0.018 | 0.260 | 0.468 | 0.024 | 0.048 | 0.408 | 0.020 | 0.207 | 0.623
Menstrual phase (upper) | 0.018 | 0.006 | 0.056 | 0.018 | 0.009 | 0.172 | 0.014 | 0.096 | 0.456
Vaginal deliveries | 0.018 | 0.003 | 0.049 | 0.016 | 0.014 | 0.172 | 0.016 | 0.051 | 0.455
Multipara / nullipara | 0.019 | 0.003 | 0.049 | 0.017 | 0.014 | 0.172 | 0.013 | 0.091 | 0.456
Infertility due to endometriosis | 0.045 | 0.000 | 0.008 | 0.029 | 0.013 | 0.172 | 0.019 | 0.181 | 0.599
Pelvic adhesiolysis | 0.008 | 0.346 | 0.572 | 0.013 | 0.042 | 0.400 | 0.006 | 0.489 | 0.782
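The FDR columns in Table 3 reflect a correction for multiple testing. The source does not state which procedure was used, so the sketch below shows the Benjamini-Hochberg adjustment purely as an assumed, commonly used example of how q-values can be derived from a list of PERMANOVA P values. The function name and the input P values are illustrative.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values (q-values), in input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    q_values = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotone q-values.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        q_values[idx] = running_min
    return q_values


if __name__ == "__main__":
    # Illustrative P values, not the full set tested in the study.
    print(benjamini_hochberg([0.01, 0.04, 0.03, 0.005]))
```

Because q-values depend on the total number of tests, the values in Table 3 cannot be reproduced from the listed rows alone unless the complete set of tested phenotypes is used.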
Potential uses
As a large-scale cohort for studying the female urinary microbiota, our data provide a useful baseline and reference dataset in women of reproductive age. We also explored the association between the composition of the urinary microbiota and that of the female reproductive tract microbiota. Notably, a higher intra-individual compositional similarity was observed between the microbiota of the urine and those of the cervical canal/uterus than between the microbiota of the urine and those of the vagina. This finding indicates that sampling of midstream urine (the least invasive and easiest approach) could potentially be used to survey the micro-environment of the cervical canal and uterus in the general population. This is relevant to the demonstrated associations between the urinary microbiota and various uterine-related diseases, such as hysteromyoma and infertility due to endometriosis. Our data provide a reference for clinical diagnosis and warrant further detailed exploration.
There are three limitations to this study. Firstly, as it was not possible to directly sample the upper reproductive tract of perfectly healthy women, we included women who underwent minimally invasive laparoscopy or laparotomy for conditions that are not known to involve infection. This was the best proxy for sampling the upper reproductive tract in healthy women. Nevertheless, how well the urinary microbiota of women in our cohort represent those of healthy women requires further comparison. Secondly, given the low bacterial biomass of urine samples, a more comprehensive sampling process should be considered in subsequent studies, such as disinfecting the urethra and vulvovaginal region with 75% alcohol before urine self-collection, including a sample of sterile saline with the self-collection kit as a negative control, and asking participants to fill another vial with it immediately following urine collection. Thirdly, a comparison of the microbial composition between catheter-collected and self-collected specimens from the same individual still needs to be made. Together, we hope that this dataset helps promote a new round of accelerated discoveries, including a novel scientific explanation for uterine-related diseases via longitudinal studies on the microbiota of the urinary and reproductive tracts.
Ethics approval and consent to participate
The study was approved by the Institutional Review Board of BGI-Shenzhen (No. BGI-IRB 17219) and Peking University Shenzhen Hospital (Version 1.0.20140301). All participants gave written informed consent prior to their recruitment into the study.
Availability of supporting data
The sequence reads generated by 16S rRNA gene amplicon sequencing have been deposited in both the European Nucleotide Archive with the accession number PRJEB29341 and the CNSA of the CNGB database with the accession code CNP0000166. Additional data, results, and a STORMS (Strengthening The Organizing and Reporting of Microbiome Studies) checklist are available from the GigaScience GigaDB repository [8]. The sequences of bacterial isolates have been deposited in the European Nucleotide Archive with the accession number PRJEB36743.
Author contributions
H.J. and R.W. organized this study. W.W., J.D., H.D., L.Z., H.T., T.W., and R.W. performed the sample collection and phenotypic information collection. F.L., L.S., C.C., and J.L. performed the molecular biology experiments. C.C., L.H., and F.L. performed the bioinformatic analyses. C.C., X.Z., F.L., and H.J. wrote the manuscript.
Competing interests
There were no competing financial interests.
The study was supported by the Shenzhen Municipal Government (No. SZXK027 and No. SZSM202011016), the Shenzhen Peacock Plan (No. KQTD20150330171505310), and the Medical Scientific Research Foundation of Guangdong (No. A2019035). The authors thank colleagues at BGI-Shenzhen for DNA extraction, library construction, and sequencing.
1. Whiteside SA, Razvi H, Dave S, Reid G, Burton JP. The microbiome of the urinary tract — A role beyond infection. Nat. Rev. Urol., 2015; 81–90.
2. Chen C, Song X, Wei W, Zhong H, Dai J, Lan Z, et al. The microbiota continuum along the female reproductive tract and its relation to uterine-related diseases. Nat. Commun., 2017; 8: 875. doi:10.1038/s41467-017-00901-0.
3. Siddiqui H, Nederbragt AJ, Lagesen K, Jeansson SL, Jakobsen KS. Assessing diversity of the female urine microbiota by high throughput sequencing of 16S rDNA amplicons. BMC Microbiol., 2011; 11: 244.
4. Wolfe AJ, Toh E, Shibata N, Rong R, Kenton K, FitzGerald MP, et al. Evidence of uncultivated bacteria in the adult female bladder. J. Clin. Microbiol., 2012; 50: 1376–1383.
5. Thomas-White K, Forster SC, Kumar N, Van Kuiken M, Putonti C, Stares MD, et al. Culturing of female bladder bacteria reveals an interconnected urogenital microbiota. Nat. Commun., 2018; 9: 1557.
6. Gottschick C, Deng ZL, Vital M, Masur C, Abels C, Pieper DH, et al. The urinary microbiota of men and women and its changes in women during bacterial vaginosis and antibiotic treatment. Microbiome, 2017; 5: 99.
7. Hao L. Protocols for “The female urinary microbiota in relation to the reproductive tract microbiota”. 2020;
8. Chen C, Hao L, Wei W, Li F, Song X, Zhang X, et al. Data from the female urinary microbiota. 2020, GigaScience Database;
9. Hao L. DNA extraction for human microbe samples. 2020;
10. Prepare Amplicon Libraries without Fragmentation Using the Ion Plus Fragment Library Kit.
11. Ion Personal Genome Machine™ (PGM™) System REFERENCE GUIDE.
12. Schloss PD, Westcott SL, Ryabin T, Hall JR, Hartmann M, Hollister EB, et al. Introducing mothur: Open-source, platform-independent, community-supported software for describing and comparing microbial communities. Appl. Environ. Microbiol., 2009; 75: 7537–7541.
13. Caporaso JG, Kuczynski J, Stombaugh J, Bittinger K, Bushman FD, Costello EK, et al. QIIME allows analysis of high-throughput community sequencing data. Nat. Methods, 2010; 7: 335–336.
14. Hao L. A Bioinformatics Analysis workflow for 16S rRNA Amplicon Sequencing data. 2020;
15. Anderson MJ. A new method for non-parametric multivariate analysis of variance. Aust. Ecol., 2001; 26: 32–46.
16. Feng Q, Liang S, Jia H, Stadlmayr A, Tang L, Lan Z, et al. Gut microbiome development along the colorectal adenoma-carcinoma sequence. Nat. Commun., 2015; 6: 6528.
17. Zapala MA, Schork NJ. Multivariate regression analysis of distance matrices for testing associations between gene expression patterns and related variables. Proc. Natl Acad. Sci. USA, 2006; 103: 19430–19435.
18. De Backer E, Verhelst R, Verstraelen H, Alqumber MA, Burton JP, Tagg JR, et al. Quantitative determination by real-time PCR of four vaginal Lactobacillus species, Gardnerella vaginalis and Atopobium vaginae indicates an inverse relationship between L. gasseri and L. iners. BMC Microbiol., 2007; 7: 115.
19. Chen C. Quantitative real-time PCR for the four Lactobacillus species. 2020;
20. Augustinos AA, Kyritsis GA, Papadopoulos NT, Abd-Alla AMM, Cáceres C, Bourtzis K. Exploitation of the medfly gut microbiota for the enhancement of sterile insect technique: Use of Enterobacter sp. in larval diet-based probiotic applications. PLoS One, 2015; 10: 1–17.
22. Karstens L, Asquith M, Caruso V, Rosenbaum JT, Fair DA, Braun J, et al. Community profiling of the urinary microbiota: considerations for low-biomass samples. Nat. Rev. Urol., 2018; 15: 735–749.
23. Hilt EE, McKinley K, Pearce MM, Rosenfeld AB, Zilliox MJ, Mueller ER, et al. Urine is not sterile: Use of enhanced urine culture techniques to detect resident bacterial flora in the adult female bladder. J. Clin. Microbiol., 2014; 52: 871–876.
24. Meijer-Severs GJ, Aarnoudse JG, Mensink WFA, Dankert J. The presence of antibody-coated anaerobic bacteria in asymptomatic bacteriuria during pregnancy. J. Infect. Dis., 1979; 140: 653–658.
25. Guzmàn CA, Pruzzo C, LiPira G, Calegari L. Role of adherence in pathogenesis of Enterococcus faecalis urinary tract infection and endocarditis. Infect. Immun., 1989; 57: 1834–1838.
26. Fraimow HS, Jungkind DL, Lander DW, Delso DR, Dean JL. Urinary tract infection with an Enterococcus faecalis isolate that requires vancomycin for growth. Ann. Intern. Med., 1994; 121: 22–26.
27. Ormerod AD, Petersen J, Hussey JK. Immune complex glomerulonephritis and chronic anaerobic urinary infection–complications of filariasis. Postgrad. Med. J., 1983; 59: 730–733.
28. Petricevic L, Domig KJ, Nierscher FJ, Sandhofer MJ, Fidesser M, Krondorfer I, et al. Characterisation of the vaginal Lactobacillus microbiota associated with preterm delivery. 2014;1–6.
29. Salter SJ, Cox MJ, Turek EM, Calus ST, Cookson WO, Moffatt MF, et al. Reagent and laboratory contamination can critically impact sequence-based microbiome analyses. BMC Biol., 2014; 12: 87.
30. O’Hanlon DE, Come RA, Moench TR. Vaginal pH measured in vivo: lactobacilli determine pH and lactic acid concentration. BMC Microbiol., 2019; 19: 13.