HomeScienceDrivers and determinants of strain dynamics following fecal microbiota transplantation

# Drivers and determinants of strain dynamics following fecal microbiota transplantation

### Data overview

The study dataset comprised 22 independent cohorts recruited in centers in the United States, the Netherlands and Australia, with a total of 316 FMTs conducted in 311 patients suffering from rCDI (n = 62 FMTs26,27,28,32,36), infection with ESBL (n = 59 (refs. 37,38,39)), MetS (n = 50 (refs. 18,25,40)), UC (n = 42 (refs. 29,41,42,43)), anti-PD1 therapy resistance in patients with melanoma (n = 37 (refs. 9,10)), IBS (n = 30 (ref. 44)), Crohn’s disease (n = 18 (ref. 45)), chemotherapy-induced diarrhea in patients with renal carcinoma (n = 10 (ref. 46)), Tourette’s syndrome (n = 5 (ref. 47) and in healthy volunteers (n = 3 (ref. 48)). On average, 4.11 recipient stool samples were available per FMT time series, including baseline samples taken before the intervention (pre-FMT). Overall, 7.9 Terabases (Tb) of sequencing data were analyzed across 1,492 fecal metagenomes, of which 269 (for 76 time series) were generated as part of the present study (for cohorts UC_NL, ESBL_NL, MetS_NL_1 and div_AU).

Three cohorts (UC_NL, MetS_NL_1 and MetS_NL_Koopen) were randomized controlled trials during which a subset of patients received autologous FMTs (transplantation of the recipient’s own stool, n = 33 FMTs). All other FMTs (n = 283) were allogenic, using stool donors. For 228 FMT time series, a full complement of donor baseline, recipient baseline and at least one recipient post-FMT sample were available after filtering.

A full description of all cohorts is provided in Supplementary Table 1, detailed information per FMT time series in Supplementary Table 2 and per-sample information in Supplementary Table 3.

### Sample collection, processing and metagenomic sequencing

Study design and fecal sample collection for cohorts MetS_NL_1 (refs. 18,25), UC_NL41,61 and ESBL_NL37 were described previously. rCDI_AU and UC_AU samples were obtained from a single-center, proof-of-concept, parallel and controlled study in collaboration with the Centre for Digestive Diseases (Sydney, Australia), which aimed to assess donor microbiota implantation in two patients with CDI and three with UC up to 28 days following a 2-day fecal microbiota transplantation infusion via transcolonoscopy and rectal enema. The study is registered with the Australian New Zealand Clinical Trials Registry under ACTRN12614000503628 (Universal Trial no, U1111-1156-5909). Written, informed participant consent and ethical approval were obtained via the Centre for Digestive Diseases Human Research Ethics Committee. Deidentified participant data relevant to the study are provided in Supplementary Tables 2 and 3.

For cohorts MetS_NL_1 and UC_NL, fecal DNA extraction was described in the original studies. DNA from ESBL_NL samples was extracted using the GNOME DNA Isolation Kit (MP Biomedicals) with the following minor modifications: cell lysis/denaturation was performed (30 min, 55 °C) before protease digestion was carried out overnight (55 °C), and RNAse digestion (50 μl, 30 min, 55 °C) was performed after mechanical lysis. After final precipitation, DNA was resuspended in TE buffer and stored at −20 °C for further analysis.

Metagenomic sequencing libraries for MetS_NL_1, UC_NL, ESBL_NL and div_AU samples were prepared to a target insert size of 350–400 base pairs (bp) on a Biomek FXp Dual Hybrid with high-density layout adapters, orbital shaker, static peltier and shaking peltier (Beckman Coulter) and a robotic PCR cycler (Biometra), using SPRIworks HT kits (Beckman Coulter) according to the supplier’s recommendation, with the following modifications: 500 ng of DNA initially, adapter dilution 1:25, kit chemical dilution 1:1 in process. For samples with low-input DNA concentrations, libraries were instead prepared manually using NEBNext Ultra II DNA Library Prep kits with NEBNext Singleplex primers. Libraries were sequenced on an Illumina HiSeq 4000 platform with 2 × 150-bp paired-end reads.

### Public datasets

Based on a literature search, 18 datasets on FMT cohorts that met the following criteria were included in the study: (1) public availability of metagenomic sequencing data in January 2022; (2) sufficient available description to unambiguously match donors and recipients per FMT time series; and (3) no restrictions on data reuse. They were included in this study as RCDI_US_Smillie (n = 22 FMT time series26), RCDI_US_Aggarwala (n = 14 (ref. 28)), RCDI_US_Watson (n = 10 (ref. 32)), RCDI_US_Podlesny (n = 8 (ref. 27)), RCDI_US_Moss (n = 6 (ref. 36)), MetS_NL_Koopen (n = 24 (ref. 40)), UC_US_Damman (n = 6 (ref.43)), UC_US_Nusbaum (n = 4 (ref. 42)), UC_US_Lee (n = 2 (ref. 29)), CD_US_Vaughn (n = 18 (ref. 45)), ABXR_div_Leo (n = 26 (ref. 39)), ABXR_IS_BarYoseph (n = 14 (ref. 38)), IBS_NO_Goll (n = 30 (ref. 44)), MEL_US_Davar (n = 27 (ref. 10)), MEL_US_Baruch (n = 109), REN_IT_Ianiro (n = 10 (ref. 46)), TOU_CN_Zhao (n = 5 (ref. 47)) and CTR_RU_Goloshchapov (n = 3 (ref. 48)). Contextual data, including donor–recipient matchings and information about clinical response, were curated from the study publications and, in some cases, kindly amended by the studies’ original authors on request (Supplementary Tables 13).

### Metagenomic data processing and taxonomic and functional profiling

Metagenomic reads were quality trimmed to remove base calls with a Phred score of <25. Reads were then discarded if they were <45 nucleotides or if they mapped to the human genome (GRCh38.p10) with at least 90% identity over 45 nucleotides. This processing was performed using NGLess62. Taxonomic profiles per sample were obtained using mOTUs v.2 (ref. 63). For functional profiling, reads were mapped against the Global Microbial Gene Catalog v.1 gut subcatalogue (gmgc.embl.de64) with a minimum match length of 45 nucleotides with at least 97% identity, and summarized based on antimicrobial resistance gene (ARG) annotations and Kyoto Encyclopedia of Genes and Genomes orthologs (KOs) via eggNOG annotations65. Based on the resulting KO profiles, GMMs66 were quantified in each sample using omixer-rpmR (v.0.3.2)67. Taxonomic and GMM profiles per sample, normalized by read depth, are available in Supplementary Tables 7 and 8.

### MAGs

We demarcated MAGs from samples of studies MetS_NL_1, UC_NL, ABXR_NL, div_AU, RCDI_US_Smillie, RCDI_US_Moss, UC_US_Damman, UC_US_Nusbaum, UC_US_Lee and CD_US_Vaughn using several complementary strategies to obtain both high resolution from sample-specific assemblies and deep coverage of lowly abundant species from coassemblies of multiple samples. Unless otherwise indicated, all tools in the following were run with default parameters.

To generate single-sample MAGs, fecal metagenomes were assembled individually using metaSPAdes v.3.12.0 (ref. 68), reads were mapped back to contigs using bwa-mem v.0.7.17 (ref. 69) and contigs were binned using metaBAT v.2.12.1 (ref. 70). Multisample MAGs were built for each cohort separately. Reads were first coassembled using megahit v.1.1.3 (ref. 71) and mapped back to contigs using bwa-mem v.0.7.17. Coassembled contigs were then binned using both CONCOCT v.0.5.0 (ref. 72) and metaBAT v.2.12.1. The resulting coassembled MAG sets were further refined using DAS TOOL73 and metaWRAP74. In total, 47,548 MAGs were demarcated using these five approaches (single-sample MAGs, multisample coassembled CONCOCT, metaBAT2, DAS TOOL and metaWRAP MAGs). In addition, we included 25,037 high-quality reference genomes from the proGenomes database75,76 in downstream analyses.

Genome quality was estimated using CheckM77 and GUNC v.0.1 (ref. 78), and all genomes were taxonomically classified using GTDB-tk79. Open reading frames (ORFs) were predicted using prodigal80 and annotated via prokka workflow v.1.14.6 (ref. 81). Orthologs to known gene families were detected using eggNOG-mapper v.1 (ref. 82). ARGs were annotated using a workflow combining information from databases CARD v.3.0.0 (via rgi v.4.2.4 (ref. 83) and ResFams v.1.2.2 (ref. 84), as described previously76. The ‘specI’ set of 40 near-universal single-copy marker genes were detected in each genome using fetchMG85.

The full set of generated MAGs and contextual data are available via Zenodo (DOI 10.5281/zenodo.5534163 (ref. 86)).

### Genome clustering, species metapangenomes and phylogeny

Genomes were clustered into species-level groups using an ‘open-reference’ approach in multiple steps. Initial prefiltering using lenient quality criteria (CheckM-estimated completeness ≥70%, contamination ≤25%; additional criteria were applied downstream) removed 57.7% of MAGs. The remaining 20,093 MAGs were mapped to the clustered proGenomes v.1 (ref. 75) and mOTUs v.2 (ref. 63) taxonomic marker gene databases using MAPseq v.1.2.3 (ref. 87). A total of 17,720 MAGs were confidently assigned to a ref-mOTU (specI cluster) or meta-mOTU based on the following criteria: (1) detection of at least 20% of the screened taxonomic marker genes and (2) a majority of markers assigning to the same mOTU at a conservative MAPseq confidence threshold of ≥0.9.

In an independent approach, quality-filtered MAGs and reference genomes were also clustered by average nucleotide identity (ANI) using a modified and scalable reimplementation of the dRep workflow88. Using pairwise distances computed with mash v.2.1 (ref. 89), sequences were first preclustered to 90% mash-ANI using the single-linkage algorithm, asserting that all genome pairs sharing ≥90% mash-ANI were grouped together. Each mash precluster was then resolved to 95 and 99% average linkage ANI clusters using fastANI v.1.1 (ref. 90). For each cluster, a representative genome was picked as either the corresponding reference specI cluster representative in the proGenomes database or the MAG with the highest dRep score (calculated based on estimated completeness and contamination). Genome partitions based on 95% average linkage ANI clustering and specI marker gene mappings matched almost perfectly, at an adjusted Rand index of >0.99. We therefore defined a total of 1,089 species-level clusters (‘species’) from our dataset (Supplementary Table 4), primarily based on marker gene mappings to precomputed ref-mOTUs (or specI clusters, n = 295) and meta-mOTUs (n = 528), and as 95% average linkage ANI clusters for genomes that did not map to either of these databases (n = 233).

Species pangenomes were generated by clustering all genes within each species-level cluster at 95% amino acid identity, using Roary 3.12.0 (ref. 91). Spurious and putatively contaminant gene clusters (as introduced by misbinned contigs in MAGs) were removed by asserting that the underlying gene sequences originated (1) from a reference genome in the proGenomes database or (2) from at least two independent MAGs, assembled from distinct samples or studies. To account for incomplete genomes, ‘extended core genes’ were defined as gene clusters present in >80% of genomes in a species-level cluster. If too few gene clusters satisfied this criterion, as was the case for some pangenomes containing many incomplete MAGs, the 50 most prevalent gene clusters were used instead. Representative sequences for each gene cluster were picked as ORFs originating from specI representative genomes (that is, high-quality reference genomes), or otherwise as the longest ORF in the cluster.

A phylogenetic tree of species-level cluster representatives was inferred based on the ‘mOTU’ set of ten near-universal marker genes63. Marker genes were aligned in amino acid sequence space across all species using Muscle v.3.8.31 (ref. 92), concatenated and then used to construct a species tree with FastTree2 (v.2.1.11)93 with default parameters.

### Inference of microbial strain populations

Metagenomic reads for each sample were mapped against gene cluster representative sequences for all species pangenomes using bwa-mem v.0.7.17 (ref. 69). Mapped reads were filtered for matches of ≥45 bp and ≥97% sequence identity, sorted and filtered against multiple mappings using samtools v.1.7 (ref. 94). Horizontal (‘breadth’) and vertical (‘depth’) coverage of each gene cluster in each sample were calculated using bedtools v.2.27.1 (ref. 95).

A species was considered present in a sample if at least three mOTU taxonomic marker genes were confidently detected either via the mOTU v.2 profiler (for specI clusters and meta-mOTUs) or based on pangenome-wide read mappings (for non-mOTU species-level clusters). Gene clusters within each pangenome were considered present in a sample if (1) the species was detectable (see above), (2) horizontal coverage exceeded 100 bp and 20% of the representative gene’s length and (3) average vertical coverage exceeded 0.5. Gene clusters were considered confidently absent if they did not attract any mappings in samples where the species’ set of extended core genes (see above) was covered at >1 median vertical coverage (that is, present with high confidence). Using these criteria, strain population-specific gene content profiles were computed for each species in each sample.

Raw microbial SNVs were called from uniquely mapping reads using metaSNV v.1.0.3 (ref. 96) with permissive parameters (-c 10 -t 2 -p 0.001 -d 1000000). Candidate SNVs were retained if they were supported by two or more reads each in two or more samples in which the focal gene cluster was confidently detected (see above), before differential downstream filtering. At multiallelic positions the frequency of each observed allele (A, C, G, T) was normalized by the total read depth for all alleles.

Based on these data, strain populations were represented based on both their specific gene content profile and SNV profile in each sample.

Each species’ local strain population diversity (SPD) and allele distances (AD) between strain populations across samples were estimated as follows. SPD was calculated based on the inverse Simpson index of allele frequencies p(ACGT) at each variant position i in the extended core genome (nvar), normalized by total horizontal coverage (number of covered positions) covhor:

$$\mathrmSPD = \frac{{\mathop \sum\nolimits_i = 1^n_\mathrmvar \left( p_\mathrmA^2 + p_\mathrmC^2 + p_\mathrmG^2 + p_\mathrmT^2 \right)^ – 1 – 1 }}{\mathrmcov_\mathrmhor}$$

Thus defined, SPD can be interpreted as the average effective number of nondominant alleles in a strain population. SPD ranges between 0 (only one dominant strain detected—that is, no multiallelic positions) and 3 (all four possible alleles present at equal proportions at each variant position). Normalization by total horizontal coverage, covhor of the extended core genome ensures that values are comparable between samples even if a species’ coverage in a sample is incomplete.

Intraspecific ADs between strain populations across samples were calculated as the average Euclidean distance between observed allele frequencies at variant positions in the species’ extended core genome, requiring at least 20 variant positions with shared coverage between samples. If a species was not observed in a sample, ADs to that sample were set to 1.

### Quantification of strain-level outcomes

Colonization by donor strains, persistence of recipient strains and influx of novel strains (environmental or previously below detection limit) in the recipient microbiome following FMT were quantified for every species based on determinant microbial SNVs and gene content profiles using an approach extending previous work25,97. In total, 261 FMT time series (228 allogenic and 33 autologous transfers) for which a donor baseline (in allogenic FMTs; ‘D’), a recipient pre-FMT baseline (‘R’) and at least one recipient post-FMT (‘P’) sample were available were taken into account, and each FMT was represented as a D-R-P sample triad. If available, multiple time points post FMT were scored independently. By definition, because no donor samples were available for autologous FMTs, recipient pre-FMT samples were used instead. An overview of potential strain-level FMT outcomes is provided in Fig. 1c,d.

For each D-R-P sample triad, conspecific strain dynamics were calculated if a species was observed in all three samples (see above) with at least 100 informative (determinant) variant positions either covered with two or more reads or confidently absent (see below). Donor determinant alleles were defined as variants unique to the donor (D) relative to the recipient pre-FMT (R) sample, and vice versa. Post-FMT determinant alleles were defined as variants unique in P relative to both D and R. Given that intraspecific fecal strain populations are often heterogeneous—that is, consist of more than one strain per species—multiple observed alleles at the same variant position were taken into account. In addition, if a gene containing a putative variant position was absent from a sample although the species’ extended core genome was detected, the variant was considered ‘confidently absent’ and treated as informative (and potentially determinant) as well, thereby taking into account differential gene content between strains.

The fractions of donor and recipient strains post FMT were quantified based on the detection of donor- and recipient-determinant variants across all informative positions in the P sample. The fraction of novel strains (environmental or previously below detection limit in donor and recipient) was quantified as the fraction of post-FMT determinant variants. Based on these three readouts (fraction of donor, recipient and novel strains) and cutoffs previously established by Li et al.25, FMT outcomes were scored categorically as ‘donor colonization’, ‘recipient persistence’, ‘donor–recipient coexistence’ or ‘influx of novel (previously undetected) strains’ for every species (Supplementary Table 5).

In addition to conspecific strain dynamics (that is, where a species was present in D, R and P), we also quantified FMT outcomes that involved the acquisition or loss of entire strain populations. For example, if a species was present in the recipient at baseline but not post FMT, this was considered a ‘species loss’ event. See Fig. 1c and Supplementary Table 5 for a full overview of how different FMT outcome scenarios were scored.

To assert the accuracy of our approach, we simulated FMT time series by shuffling (1) the donor sample, (2) the recipient pre-FMT sample or (3) both. Randomizations were stratified by subject (accounting for the fact that some donors were used in multiple FMTs and that some recipients received repeated treatments) and geography. For each observed D-R-P sample triad, we simulated ten triads per each of the above setups.

Outcomes were further summarized across species by calculating a series of strain population-level metrics for each FMT, defined as follows.

Persistence index: average fraction of persistent recipient strains among all species observed post FMT (that is, fraction of post-FMT strain populations attributable to recipient baseline strains).

Colonization index: average fraction of donor strains among all species post FMT.

### Modeling and prediction of FMT outcomes

We explored a large set of covariates as putative predictor variables for FMT outcomes, grouped into the following categories: (1) host clinical and procedural variables (for example, FMT indication, pre-FMT bowel preparation, FMT route and so on); (2) community-level taxonomic diversity (species richness, community composition and so on); (3) community-level metabolic profiles (abundance of specific pathways); (4) abundance profiles of individual species; (5) strain-level outcomes for other species in the system; and (6) focal species characteristics, including strain-level diversity; see Supplementary Table 6 for a full list of covariates and their definitions. We further classified covariates as either predictive ex ante variables (that is, knowable before the FMT is conducted) or post hoc variables (that is, pertaining to the post-FMT state, or the relation between pre- and post-FMT states).

We built two types of model to predict FMT strain-level outcomes based on these covariates: (1) FMT-wide models, using summary outcome metrics across all species in a time series (persistence index, colonization index; see above) as response variables; and (2) per-species models for 307 species observed in ≥50 FMTs, using each species’ strain-level outcome in every scored time series as response variable. Unless otherwise indicated, the last available time point for each FMT time series was used. Models were built for each covariate category separately, as well as for combinations of all ex ante and all post hoc variables, respectively.

Given that the number of covariates greatly exceeded the number of available FMT time series, and that several covariates were correlated with each other (Supplementary Fig. 3), FMT outcomes were modeled using ten times fivefold cross-validated LASSO-regularized regression, as implemented in the R package glmnet (v.4.1.3)98. Regression coefficients were chosen at one standard error from the cross-validated minimum lambda value and averaged across validation folds.

Linear LASSO regression was used to model outcomes with continuous response variables, both for FMT-wide outcomes (persistence index and soon) and for the fraction of colonizing, persisting and coexisting strains per species across FMTs. For linear models, R2 of predictions on test sets was averaged across validation folds. Moreover, logistic LASSO regression was used to additionally model binarized FMT outcomes per species, defined as recipient strain resilience, recipient strain turnover and donor strain takeover, based on further summarizing outcome categories in Supplementary Table 5. For logistic models, accuracy was assessed as area under the receiver operating characteristic curve (AUROC) averaged across validation folds.

### Statistical analyses

Association of clinical outcomes (excluding a subset of cohorts for which clinical success was not reported; Supplementary Table 3) with FMT strain-level outcomes was tested using Wilcoxon tests (responders versus nonresponders), and also by sequential ANOVA on linear regression models (accounting for additional variables), in each case followed by Benjamini–Hochberg correction for multiple hypothesis tests. Differences in strain-level outcomes between species across taxonomic clades and inferred species phenotypes were tested using ANOVA on linear regression models.

### Reporting summary

Further information on research design is available in the Nature Research Reporting Summary linked to this article.

RELATED ARTICLES