What is a transcriptome?


Non-steroidal anti-inflammatory drugs (NSAIDs) are among the most frequently used classes of medications in the world, yet they induce an enteropathy that is associated with high morbidity and mortality. A major limitation to better understanding the pathophysiology and diagnosis of this enteropathy is the difficulty of obtaining information about the primary site of injury, namely the distal small intestine. We investigated the utility of using mRNA from exfoliated cells in stool as a means to surveil the distal small intestine in a murine model of NSAID enteropathy. Specifically, we performed RNA-Seq on exfoliated cells found in feces and compared these data to RNA-Seq from both the small intestinal mucosa and colonic mucosa of healthy control mice or those exhibiting NSAID-induced enteropathy. Global gene expression analysis, data intersection, pathway analysis, and computational approaches including linear discriminant analysis (LDA) and sparse canonical correlation analysis (CCA) were used to assess the inter-relatedness of tissue (invasive) and stool (noninvasive) datasets. These analyses revealed that the exfoliated cell transcriptome closely mirrored the transcriptome of the small intestinal mucosa. Thus, the exfoliome may serve as a non-invasive means of detecting and monitoring NSAID enteropathy (and possibly other gastrointestinal mucosal inflammatory diseases).


Non-steroidal anti-inflammatory drugs (NSAIDs) are among the most frequently consumed pharmaceuticals worldwide because of their anti-inflammatory, anti-neoplastic, and analgesic effects. Their use can result in an enteropathy that has an alarmingly high rate of morbidity and mortality. In the United States alone, NSAID enteropathy results in approximately 100,000 hospitalizations and 16,500 deaths each year1. An additional 2/3 of both short- and long-term NSAID users develop subclinical or undiagnosed distal small intestinal lesions2. Although detection and management of NSAID-induced lesions of the proximal GI tract (i.e., gastropathy) are well documented, diagnosis and treatment of NSAID-induced damage to the GI tract distal to the duodenum (also known as NSAID enteropathy, affecting primarily the distal jejunum and ileum) remain elusive3,4. This is noteworthy because the incidence of NSAID enteropathy is expected to increase as a result of greater use of NSAIDs to treat rising numbers of inflammatory conditions, to meet the needs of aging populations in North America and Europe, and for their anti-neoplastic effects5. The lower GI tract of multiple mammalian species is affected by NSAIDs in a similar manner in terms of anatomic location, pathological findings, and severity of clinical signs6,7,8.

The pathophysiology of NSAID enteropathy is complex and poorly understood9. Deleterious effects of NSAIDs on the intestinal mucosa, including enterocyte cell death, increased mucosal permeability, and interaction of the damaged mucosa with luminal contents including bacteria (i.e., GI microbiota) and bacterial products or components such as lipopolysaccharide (LPS)4,10, have been proposed. The resulting inflammatory cascade is mediated by the innate immune response to LPS and several pro-inflammatory cytokines including tumor necrosis factor (TNF), interleukin (IL)-1, and IL-611,12,13. Although the GI microbiota has recently been implicated as an important contributor to NSAID enteropathy, the precise mechanisms of host-microbiota interactions remain to be elucidated14,15,16,17,18.

An important limitation to understanding the pathogenesis of NSAID enteropathy is the difficulty in obtaining longitudinal (sequential) data from individuals regarding intestinal function and health. A great clinical and investigative need exists to develop non-invasive methods to characterize the health and function of the GI tract distal to the stomach to more effectively identify, study, and manage NSAID enteropathy. A potential strategy to address this limitation is the use of exfoliated intestinal epithelial cells (IECs) and other cell types found in voided stool. Approximately 1/3 of human colonic epithelial cells (up to 10^10 cells in an adult) are exfoliated and shed in feces each day19. Isolation and sequencing of the mRNA (host transcriptome) from exfoliated cells has been validated in the context of colon carcinogenesis in rats and humans, and in characterizing human neonatal gastrointestinal development20,21,22,23,24. Exfoliated cells, however, have not been used to evaluate a disease affecting the small intestine. Thus, the objective of this study was to determine whether exfoliated cells could be used as a non-invasive method for detecting and studying NSAID enteropathy in a murine model. Specifically, we performed RNA-Seq on colonic and small intestinal mucosa and exfoliated host cells in feces. We then applied computational approaches, e.g., linear discriminant analysis (LDA) and sparse canonical correlation analysis (CCA), to analyze the inter-relatedness of these data. The goal of these studies was to provide proof-of-principle that the exfoliated cell transcriptome (i.e., the exfoliome) could be used to gain information about NSAID-induced small intestinal injury. Our specific aims were: 1) to determine whether the transcriptome of exfoliated cells reflected gene expression of the small intestinal mucosa; 2) to demonstrate that the transcriptome of exfoliated cells can be used to differentiate healthy and diseased phenotypes; and 3) to generate hypotheses regarding key biological pathways and processes involved in the pathogenesis of NSAID enteropathy.

The lumen of the GI tract is a formidable environment for RNA transcripts, particularly the small intestine (SI), because of the abundance of both host and microbial enzymes and the longer transit time in the SI relative to the colon. Thus, after extracting RNA from tissues and exfoliated cells, we examined RNA quality owing to the potential for degradation of mRNA from exfoliated SI cells in stool as it passes through the GI tract. As expected, Bioanalyzer results revealed lower-quality RNA in the exfoliated cells than in the tissue (Figure S1a–c). Bioanalyzer traces show that the majority of RNA in the exfoliated cell samples is of microbial origin (23S and 16S rRNA subunits) (Figure S1b). However, because an oligo dT probe was used in the first step of library construction, the mouse transcripts were selectively targeted for cDNA production and subsequent library content. FastQC results for the sequenced transcripts revealed high fidelity and quality in all fecal and tissue samples (Figure S2a,b).

Sequencing revealed that the RNA sequencing reads for SI mucosa, colonic mucosa, and exfoliated cells mapped to an average of 19,324 genes, 20,743 genes, and 13,944 genes per sample, respectively (Figure S3a–c). Genes present in low abundance (i.e., ≤4 animals or ≤50 times) across all samples were removed from all datasets and the remaining genes subjected to downstream analysis. This reduced the number of transcripts for downstream analysis to an average of 17,229 genes per sample in the SI data, 17,244 genes in the colon data, and 10,865 genes in the exfoliated cell data. Although filtering reduced the total number of genes less in the SI and colon (by 11% and 17%, respectively) than in the exfoliated cells (22%), the total number of reads was negligibly affected in all datasets (SI reduced from 273,065,200 reads across all samples to 273,026,424, a 0.001% reduction; colon data reduced from 389,222,037 to 389,122,199, a 0.003% reduction; exfoliated cell data reduced from 38,292,084 to 38,160,589, a 0.003% reduction) (Figure S3d–f).
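The filtering step described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' pipeline: it takes one possible reading of the criterion (a gene is dropped if it is detected in 4 or fewer animals, or if its summed count across all samples is 50 or less), and the function name `filter_low_abundance`, its parameters, and the toy counts are hypothetical.

```python
def filter_low_abundance(counts, min_animals=4, min_total=50):
    """Keep genes detected in more than `min_animals` samples AND with a
    summed count above `min_total` (assumed reading of the criterion)."""
    kept = {}
    for gene, row in counts.items():
        n_detected = sum(1 for c in row if c > 0)  # samples with at least one read
        total = sum(row)                           # reads summed over all samples
        if n_detected > min_animals and total > min_total:
            kept[gene] = row
    return kept


# Toy count table: one row per gene, one column per animal (6 animals).
counts = {
    "GeneA": [1200, 980, 1430, 1100, 870, 1310],  # abundant everywhere
    "GeneB": [0, 0, 3, 0, 1, 0],                  # rare and low
    "GeneC": [30, 0, 0, 45, 0, 0],                # enough reads, too few animals
    "GeneD": [5, 6, 4, 7, 5, 6],                  # in every animal, too few reads
}

filtered = filter_low_abundance(counts)
reads_before = sum(sum(row) for row in counts.values())
reads_after = sum(sum(row) for row in filtered.values())
print(f"genes kept: {len(filtered)} of {len(counts)}")
print(f"reads kept: {reads_after} of {reads_before}")
```

In the toy table most genes are removed while most reads survive, which is the same qualitative pattern the paragraph reports: low-abundance genes are numerous but contribute few reads.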

After filtering to remove genes present in low abundance, we examined scatter plots of log2 counts per million (CPM) for each gene in the 2 treatment groups (i.e., NSAID and control), comparing the exfoliome to the SI and colonic transcriptome in a pairwise manner (Figure S4a,b). There was strong and significant correlation of the CPM data for each of the pairwise comparisons (Spearman's correlation coefficient; R value > 0.8 and P < … Fig. 1A,B). There was obvious variation between sources (i.e., exfoliated cells, SI, or colonic RNA) in total mammalian reads and distribution of counts. This difference was thought to be due to microbial RNA contamination of the RNA extracted from exfoliated cells (Figure S1b), resulting in fewer reads mapping to the mouse genome. To confirm this, we extracted total counts of 532 genes that have been previously identified as housekeeping genes from each sample and plotted those relative to total gene counts25. Examination of the relative abundance of these genes in each sample revealed that between-sample total count differences were represented by similar magnitudes of differences in abundance of these 532 housekeeping genes, indicating these differences were due to smaller library size attributable to mammalian transcripts and not to sequencing artifact (Fig. 1C).
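As an illustration of the comparison described above, the following sketch converts raw counts to log2 CPM and computes a Spearman rank correlation between two sources. It is a simplified, assumption-laden example: the pseudocount `prior`, the helper names, and the toy counts are hypothetical, and the authors' software may compute CPM differently.

```python
import math


def log2_cpm(counts, prior=0.5):
    """log2 counts per million for one sample; `prior` is a small
    pseudocount (an assumption here) that avoids taking log of zero."""
    lib_size = sum(counts)
    return [math.log2((c + prior) / lib_size * 1e6) for c in counts]


def average_ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def spearman(x, y):
    """Spearman correlation = Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Toy counts for the same five genes in tissue and in stool (hypothetical).
tissue = [500, 120, 30, 900, 10]
stool = [40, 15, 2, 60, 3]

rho = spearman(log2_cpm(tissue), log2_cpm(stool))
print(f"Spearman rho = {rho:.2f}")
```

Because Spearman's coefficient depends only on ranks, it is unaffected by the large difference in library size between the two sources, which makes it a reasonable choice for this kind of cross-source comparison.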


The exfoliome contains fewer reads attributable to the mouse genome than the tissue transcriptomes due to bacterial RNA contamination: (A) Number of mammalian reads per sample for each animal and data source colored by treatment group. (B) Log2 counts per gene per sample across all treatment groups from the sequenced RNA colored by treatment group. (C) Log2 total gene reads of 532 murine housekeeping genes (black) and all other genes (grey) per sample across all animals and treatment groups from each data source.

Given the magnitude of differences in library size attributable to mammalian transcripts between the exfoliome and the tissue transcriptomes, it was necessary to perform the remainder of the analyses separately for each source of RNA (i.e., SI, colon, or exfoliated cells). To account for between-sample variation in read counts, RNA-Seq data for each dataset were normalized with edgeR, accounting for group effects, using the function calcNormFactors and the upper-quartile method. Total gene counts and boxplots of the number of reads per gene for each sample of the normalized data for both tissue and exfoliated cell datasets are shown in Fig. 2A–F. Variability in abundance of post-normalization total housekeeping genes was also improved (Figure S5). Prior to identifying differentially expressed (DE) genes, we assessed biological variability in the exfoliome relative to the tissue transcriptomes. Plots of the biological coefficient of variation (BCV) versus the mean log counts per million (CPM) and multi-dimensional scaling (MDS) plots were used to visually assess the similarity of the samples within each treatment group (Fig. 3A–F). These results demonstrate a relatively high degree of variation in the exfoliome (common dispersion = 0.592 and BCV = 0.769) (Fig. 3B) as compared to both tissue transcriptomes (Fig. 3A and C), with a common dispersion of 0.143 and BCV of 0.379 for the SI and a common dispersion of 0.126 and BCV of 0.329 for the colon. Notably, MDS based on BCV revealed clear separation of the treatment groups in both the SI transcriptome and exfoliome but not the colonic transcriptome (Fig. 3D–F).
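The idea behind upper-quartile normalization can be sketched in plain Python. This is an illustrative approximation written for this text, not edgeR code: each sample's factor is the 75th percentile of its counts divided by library size (after dropping genes that are zero in every sample), and the factors are then rescaled so that their product is 1. The function name and toy counts are hypothetical, and edgeR's own implementation may differ in detail. The last lines use the relationship that the BCV is the square root of the dispersion, applied to the exfoliome value quoted above.

```python
import math
from statistics import quantiles


def upper_quartile_factors(samples):
    """samples: list of per-sample count lists sharing one gene order.
    Returns one scaling factor per sample (product of factors is 1)."""
    n_genes = len(samples[0])
    # Ignore genes with zero counts in every sample.
    keep = [g for g in range(n_genes) if any(s[g] > 0 for s in samples)]
    raw = []
    for s in samples:
        lib_size = sum(s)
        scaled = [s[g] / lib_size for g in keep]
        # 75th percentile with linear interpolation between order statistics.
        raw.append(quantiles(scaled, n=4, method="inclusive")[2])
    # Rescale by the geometric mean; assumes every raw factor is positive.
    geo_mean = math.exp(sum(math.log(f) for f in raw) / len(raw))
    return [f / geo_mean for f in raw]


# Toy data: three samples, five genes (the last gene is never detected).
samples = [
    [10, 20, 30, 40, 0],
    [20, 40, 60, 80, 0],   # same composition as sample 1, twice the depth
    [5, 10, 15, 70, 0],    # one dominant gene shifts the upper quartile
]
factors = upper_quartile_factors(samples)
print([round(f, 3) for f in factors])

# BCV is the square root of the dispersion.
exfoliome_dispersion = 0.592
exfoliome_bcv = math.sqrt(exfoliome_dispersion)
print(round(exfoliome_bcv, 3))
```

Samples 1 and 2 receive the same factor because they differ only in sequencing depth, which library size already captures; sample 3 receives a different factor because its composition differs.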


Raw data after filtering and normalization show that the between-sample variation in exfoliated cell reads is improved and similar to tissue reads. Total gene counts after normalization for each sample across all treatment groups from the sequenced RNA extracted from (A) colonic mucosa, (B) exfoliated cells and (C) SI mucosa. Normalized log2 counts per gene per sample across both treatment groups from sequenced RNA extracted from (D) colonic mucosa, (E) exfoliated cells and (F) SI mucosa.



SI transcriptome and exfoliome cluster by treatment group, in contrast to the colonic transcriptome, and biological variability is higher in the exfoliome than in the tissue transcriptomes. Biological coefficient of variation (BCV) versus the mean log counts per million (CPM) of the SI transcriptome (A), exfoliome (B) and colonic transcriptome (C). (D) Treatment-based multi-dimensional scaling (MDS) plots of the SI transcriptome and that of the exfoliome (E) and colon (F).

Prior to analyzing these data to derive biological meaning, we first wished to determine the anatomic origin of the cellular signals derived from the exfoliome. To determine the source of these signals, we extracted the counts of genes previously identified as expressed predominantly in specific anatomic locations (i.e., stomach, pancreas, small intestine, and colon). Interestingly, we found that the exfoliome contained virtually no reads from genes representing the stomach or pancreas. In contrast, there was a clear signal arising from both the colon and small intestine (Fig. 4A). As expected, genes representing the colon and small intestine were heavily represented in the transcriptomes arising from those locations, with some overlap. In addition to anatomic origin, we also wished to determine the cell types represented in the exfoliome. The intestinal mucosa comprises not only IECs but also stem cells, crypt cells, goblet cells, Paneth cells (SI), as well as a host of infiltrating immune cells depending on the depth of the sample (i.e., lamina propria) and disease state of the GI tract (e.g., inflammation vs. homeostasis). To determine the cell types present in these data, we reviewed the literature for marker genes expressed either solely by a specific cell type or at least highly enriched in a specific cell type26,27,28,29,30,31,32,33,34,35,36,37,38. In particular, we extracted the numbers of reads in each sample across all 3 datasets for the following cell types: intestinal stem cells, IECs, crypt base columnar cells, Paneth cells, tuft cells, goblet cells, macrophages, lymphocytes, neutrophils, and smooth muscle. A list of the genes used as biomarkers for each of these cell types is shown in Table S1. Interestingly, we found that all cell types were present in all datasets as identified by the presence of at least 2 marker genes per cell type (Fig. 4B). Visually, there were minimal differences among the 3 datasets, with the expected exceptions of fewer reads in the exfoliated cell data and the absence of a few marker genes of intestinal stem cells in the exfoliome with concurrent low expression in the tissue transcriptomes. These data suggest that the mucosal transcriptome and exfoliome represent signals from similar cell types that comprise not only IECs but also the diverse array of cell types expected to be found in the intestinal mucosa.


The exfoliome arises from cells sloughed from both the small intestine and colon and comprises reads from the diverse array of cell types expected to be found in the intestinal mucosa. (A) Heatmap showing counts of genes that are reported to be primarily expressed at specific anatomic locations (stomach, pancreas, small intestine, colon). All genes with counts greater than 400 are colored dark blue. (B) Heatmaps showing counts of biomarker genes from each sample and each data source (orange = gene not expressed).

After characterizing the cellular source of the signals derived from these datasets, we examined each dataset for alterations induced by NSAIDs by comparing the transcriptome (or exfoliome) between the control group and NSAID group. In human subjects and preclinical models, NSAID-induced lower GI damage primarily occurs in the distal jejunum and ileum3,4. As expected, we observed marked pathological findings in the distal SI but no notable microscopic abnormalities in the colon in NSAID-treated mice (Fig. 5A). Therefore, RNA extracted from SI mucosal scrapings of NSAID-treated subjects should demonstrate marked mucosal pathology whereas the RNA from colonic mucosal scrapings should reflect minimal pathology.


Microscopic pathology reveals NSAID injury is confined to the SI. Despite great overlap between the three RNA-Seq datasets, the exfoliome distinguishes between NSAID and control animals similar to the SI transcriptome, whereas the colonic transcriptome does not. (A) Microscopic pathology scores from colon and small intestinal mucosa in control mice and NSAID-treated mice. (B) Venn diagram showing intersection of gene lists among the datasets. (C) Multi-dimensional scaling plot of each sample color-coded by source and treatment group. Inset of panel C enlarged to show degree of separation of groups in the SI transcriptome and lack of clear separation in colonic transcriptome.

To determine whether the exfoliome more closely resembled the colonic transcriptome or the SI transcriptome, we first crudely examined the intersection of gene lists from each source. This analysis revealed >90% overlap among the 3 datasets in terms of presence and absence of genes (Fig. 5B). Despite this overlap of genes, non-metric multidimensional scaling (NMDS) plots demonstrated differences among these 3 datasets (Fig. 5C and inset). Although the exfoliated cell data differed from the SI and colon data because of smaller library size and fewer mammalian reads, potentially resulting from degradation of RNA in the GI tract, evidence of clustering of the treatment groups was observed in the SI and exfoliated cell data but was absent in the colonic data (Fig. 5C and inset). Analysis of similarity (ANOSIM) based on the Bray-Curtis dissimilarity metric quantitatively demonstrated differences between NSAID and control groups for the SI (R value = 0.760; P = 0.008) and exfoliated cells (R value = 0.280; P = 0.024) but not for the colonic data (R value = 0.194; P = 0.090).
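For readers unfamiliar with the ANOSIM statistic quoted above, the following sketch shows how an R value is obtained from Bray-Curtis dissimilarities. It is a minimal illustration with hypothetical names and toy data; it omits the permutation step that yields the P value and is not the software the authors used.

```python
from itertools import combinations


def bray_curtis(x, y):
    """Bray-Curtis dissimilarity between two count vectors (0 = identical)."""
    return sum(abs(a - b) for a, b in zip(x, y)) / sum(a + b for a, b in zip(x, y))


def average_ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def anosim_r(samples, groups):
    """ANOSIM R: (mean between-group rank - mean within-group rank) / (n(n-1)/4),
    where ranks are taken over all pairwise dissimilarities."""
    pairs = list(combinations(range(len(samples)), 2))
    dissim = [bray_curtis(samples[i], samples[j]) for i, j in pairs]
    ranks = average_ranks(dissim)
    within = [r for r, (i, j) in zip(ranks, pairs) if groups[i] == groups[j]]
    between = [r for r, (i, j) in zip(ranks, pairs) if groups[i] != groups[j]]
    n = len(samples)
    mean_within = sum(within) / len(within)
    mean_between = sum(between) / len(between)
    return (mean_between - mean_within) / (n * (n - 1) / 4)


# Toy profiles: three genes, two control and two NSAID-treated animals.
samples = [[100, 50, 10], [90, 55, 12], [20, 80, 60], [25, 70, 65]]
groups = ["control", "control", "NSAID", "NSAID"]

r_value = anosim_r(samples, groups)
print(f"ANOSIM R = {r_value:.2f}")
```

R approaches 1 when every between-group dissimilarity exceeds every within-group dissimilarity, as in this toy data, and sits near 0 when group labels carry no information. That gives a scale for reading the reported values of 0.760 (SI), 0.280 (exfoliated cells), and 0.194 (colon).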

To further examine the interdependent relationship between these 3 transcriptomic profiles, we utilized sparse CCA, a novel multivariate statistical analysis approach. Sparse CCA is a dimensionality reduction technique that identifies the smallest number of genes that show the greatest amount of correlation between datasets according to specific optimality criteria. Although the sparse CCA plots should not be assigned any particular biological interpretation, they can be considered a stringent method for determining correlation of large datasets39. As with the NMDS plots, sparse CCA plots demonstrated that the transcriptome from exfoliated cells correlated well with the SI transcriptome in these mice, and that the SI and exfoliated cell datasets discriminated NSAID-treated from control groups, whereas the colonic mucosal transcriptome did not (Fig. 6A–D).

Sparse canonical correlation analysis (CCA) reveals that the global transcriptome profiles from exfoliated cells correlate well with the transcriptome profile of the SI. In contrast, the colonic transcriptome data do not discriminate well between treatment groups. Sparse CCA plots positioned by 1st and 2nd component scores from (A) exfoliated cells and colored by the 1st component SI scores, (B) SI and colored by 1st component scores from exfoliated cells, (C) exfoliated cells and colored by the 1st component colon scores and (D) colonic mucosa and colored by the 1st component exfoliated scores.

We next examined the similarities and differences in the gene expression profiles from exfoliated cells compared with those from the scraped intestinal mucosa. We identified DE genes between control and NSAID-treated mice for each dataset. Interestingly, both the exfoliome and SI transcriptome had >1000 DE genes (FDR P < … Fig. 7A). Venn diagrams revealed sparse overlap (12%) of the DE genes between the SI transcriptome and the exfoliome and even less overlap between the colonic transcriptome and the exfoliome (6%; Fig. 7A). Despite this sparse overlap, the pathways enriched in the exfoliome and SI transcriptome were similar, whereas there was much less similarity between the exfoliome and colonic transcriptome. Specifically, IPA Ingenuity Knowledgebase pathway analysis revealed that both SI and exfoliated cell datasets exhibited similar occupancy and predicted directionality (Z-score) in the canonical pathways represented (Fig. 7B). In contrast, few pathways were represented in the colonic data (Fig. 7B). For example, Toll-like receptor signaling (which is known to play a crucial role in the pathogenesis of NSAID enteropathy)9,10,13,18,40,41 was upregulated by NSAID administration in both the exfoliome and small intestinal transcriptome, but genes related to this pathway were not altered in the colonic transcriptome, resulting in no occupancy of this canonical pathway. Indeed, the proportion of pathways represented in the colon (31%; 21/67) was significantly less than that of the exfoliome (84%; 56/67; P < … Figure S6). Specifically, there were significantly fewer upstream regulators expressed in the colon (41%; 780/1888) than in either the SI (73%; 1380/1888; P < … Fig. 8A–E)42. These plots confirmed lack of overlap between DE genes within each dataset and show that the DE genes within the exfoliome had a greater effect size than those within the tissue.
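The pathway proportions quoted above follow directly from the counts, and one way to compare two such proportions is Fisher's exact test. The sketch below is an assumed stand-in: the text here does not state which test produced the reported P values, the function name is hypothetical, and treating the two datasets as independent samples of 67 pathways is a simplification.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].
    Sums the probabilities of all tables (with the same margins) that are
    no more likely than the observed one."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(k):
        # Hypergeometric probability of k "hits" in the first row.
        return comb(row1, k) * comb(row2, col1 - k) / denom

    p_obs = prob(a)
    k_min, k_max = max(0, col1 - row2), min(row1, col1)
    # Small relative tolerance guards against floating-point ties.
    return sum(prob(k) for k in range(k_min, k_max + 1)
               if prob(k) <= p_obs * (1 + 1e-9))


# Counts quoted in the text: pathways represented out of 67.
n_pathways = 67
colon_hits, exfoliome_hits = 21, 56

colon_pct = round(100 * colon_hits / n_pathways)
exfoliome_pct = round(100 * exfoliome_hits / n_pathways)

p_value = fisher_exact_two_sided(
    colon_hits, n_pathways - colon_hits,
    exfoliome_hits, n_pathways - exfoliome_hits,
)
print(f"colon {colon_pct}%, exfoliome {exfoliome_pct}%, p = {p_value:.2e}")
```

Because the same 67 pathways are assessed in each dataset, a paired test such as McNemar's could be argued to fit the design better; the exact test is used here only to show the mechanics of comparing two proportions.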

The exfoliated cell transcriptome is similar to the tissue transcriptome as shown by overlapping gene lists and pathways. (A) Venn diagram of the intersection of differentially expressed genes found in the exfoliome and tissue transcriptomes. (B) Heat maps of the Z-scores of the canonical pathways to which the differentially expressed genes between control and NSAID-treated animals were mapped from the SI transcriptome (left column), exfoliome (middle column) and the colonic transcriptome (right column).

MA plots demonstrate the expression of genes identified as differentially expressed (DE) in the exfoliome and in the tissue transcriptomes. MA plot of genes from the (A) SI transcriptome, (B) exfoliome and (C) colon transcriptome with DE genes (FDR …


$$(u_1,v_1)=\mathop{\mathrm{argmax}}\limits_{u,v}\,\mathrm{Corr}(u^{T}x,v^{T}y)=\mathop{\mathrm{argmax}}\limits_{u,v}\frac{u^{T}\Sigma_{xy}v}{\sqrt{(u^{T}\Sigma_{xx}u)(v^{T}\Sigma_{yy}v)}},$$