Interpretable scRNA-seq Trajectory DE with `scLANE`

Author

Affiliation

Jack R. Leary

University of Florida

Published

April 16, 2024

1 Introduction

In this tutorial we’ll walk through a basic trajectory differential expression analysis. We’ll use the scLANE R package, which we developed with the goal of providing accurate and biologically interpretable models of expression over pseudotime. At the end are some best-practices recommendations, along with a short list of references we used in developing the method & writing the accompanying manuscript. If you want to skip the preprocessing steps and get right into the analysis, head to Section 5.

2 Libraries

If you haven’t already, install the development version (currently v0.7.9) of scLANE from the GitHub repository.

Code

remotes::install_github("jr-leary7/scLANE")

Next, we’ll load the packages we need to process, analyze, & visualize our data.

Code

library(dplyr)           # data manipulation
library(scLANE)          # trajectory DE 
library(Seurat)          # scRNA-seq tools
library(ggplot2)         # plot creation 
library(patchwork)       # plot combination
library(slingshot)       # pseudotime estimation
library(reticulate)      # Python interface
library(ComplexHeatmap)  # heatmaps
rename <- dplyr::rename

3 Helper functions

We’ll also define a utility function to make our plots cleaner to read & easier to make.

Code

guide_umap <- function(key.size = 4) {
  ggplot2::guides(color = ggplot2::guide_legend(override.aes = list(size = key.size,
                                                                    alpha = 1, 
                                                                    stroke = 0.25)))
}

And consistent color palettes will make our plots easier to understand.

Code

palette_cluster <- paletteer::paletteer_d("ggsci::default_jama")
palette_celltype <- paletteer::paletteer_d("ggsci::category20_d3")
palette_heatmap <- paletteer::paletteer_d("wesanderson::Zissou1")

4 Data

We’ll load the well-known pancreatic endocrinogenesis data from Bastidas-Ponce et al (2019), which comes with the scVelo Python library & has been used in several pseudotime inference / RNA velocity method papers as a good benchmark dataset due to the simplicity of the underlying trajectory manifold.

Code

import scvelo as scv
adata = scv.datasets.pancreas()

The AnnData object contains data that we’ll need to extract, specifically the spliced & unspliced mRNA counts matrices (stored in AnnData.layers) and the cell-level metadata (which is in AnnData.obs).

Code

adata

AnnData object with n_obs × n_vars = 3696 × 27998
    obs: 'clusters_coarse', 'clusters', 'S_score', 'G2M_score'
    var: 'highly_variable_genes'
    uns: 'clusters_coarse_colors', 'clusters_colors', 'day_colors', 'neighbors', 'pca'
    obsm: 'X_pca', 'X_umap'
    layers: 'spliced', 'unspliced'
    obsp: 'distances', 'connectivities'

4.1 Conversion from Python

The reticulate package allows us to pass the counts matrices & metadata from Python back to R.

Code

spliced_counts = adata.layers['spliced'].toarray()
unspliced_counts = adata.layers['unspliced'].toarray()

Note

While downloading this dataset requires a Python installation as well as the installation of the scVelo Python library (and its dependencies), running scLANE is done purely in R & requires no Python whatsoever.

We’ll use the spliced mRNA counts as our default assay, and also define a new assay containing the total (spliced + unspliced) mRNA in each cell. Lastly, we remove genes with non-zero spliced mRNA in 3 or fewer cells.

Code

spliced_counts <- Matrix::Matrix(t(py$spliced_counts), sparse = TRUE)
unspliced_counts <- Matrix::Matrix(t(py$unspliced_counts), sparse = TRUE)
rna_counts <- spliced_counts + unspliced_counts
colnames(rna_counts) <- colnames(spliced_counts) <- colnames(unspliced_counts) <- py$adata$obs_names$to_list()
rownames(rna_counts) <- rownames(spliced_counts) <- rownames(unspliced_counts) <- py$adata$var_names$to_list()
spliced_assay <- CreateAssayObject(counts = spliced_counts)
spliced_assay@key <- "spliced_"
unspliced_assay <- CreateAssayObject(counts = unspliced_counts)
unspliced_assay@key <- "unspliced_"
rna_assay <- CreateAssayObject(counts = rna_counts)
rna_assay@key <- "rna_"
meta_data <- py$adata$obs %>% 
             mutate(cell_name = rownames(.), 
                    .before = 1) %>% 
             rename(celltype = clusters, 
                    celltype_coarse = clusters_coarse) %>% 
             mutate(nCount_spliced = colSums(spliced_counts), 
                    nFeature_spliced = colSums(spliced_counts > 0), 
                    nCount_unspliced = colSums(unspliced_counts), 
                    nFeature_unspliced = colSums(unspliced_counts > 0), 
                    nCount_RNA = colSums(rna_counts), 
                    nFeature_RNA = colSums(rna_counts > 0))
seu <- CreateSeuratObject(counts = spliced_assay, 
                          assay = "spliced", 
                          project = "Mm_Panc_Endo", 
                          meta.data = meta_data)
seu@assays$unspliced <- unspliced_assay
seu@assays$RNA <- rna_assay
seu <- seu[rowSums(seu@assays$spliced) > 3, ]

4.2 Preprocessing

We preprocess the counts using a typical pipeline with QC, normalization & scaling, dimension reduction, and graph-based clustering via the Leiden algorithm.

Code

seu <- PercentageFeatureSet(seu, 
                            pattern = "^mt-", 
                            col.name = "percent_mito", 
                            assay = "spliced") %>% 
       PercentageFeatureSet(pattern = "^Rp[sl]", 
                            col.name = "percent_ribo", 
                            assay = "spliced") %>% 
       NormalizeData(assay = "spliced", verbose = FALSE) %>% 
       NormalizeData(assay = "unspliced", verbose = FALSE) %>% 
       NormalizeData(assay = "RNA", verbose = FALSE) %>% 
       FindVariableFeatures(assay = "spliced", 
                            nfeatures = 3000, 
                            verbose = FALSE) %>% 
       ScaleData(assay = "spliced", 
                 vars.to.regress = c("percent_mito", "percent_ribo"), 
                 model.use = "poisson", 
                 verbose = FALSE) %>% 
       RunPCA(assay = "spliced", 
              npcs = 30, 
              approx = TRUE, 
              seed.use = 312, 
              verbose = FALSE) %>% 
       RunUMAP(reduction = "pca", 
               dims = 1:30, 
               n.components = 2, 
               metric = "cosine", 
               seed.use = 312, 
               verbose = FALSE) %>% 
       FindNeighbors(reduction = "pca", 
                     k.param = 30,
                     nn.method = "annoy", 
                     annoy.metric = "cosine", 
                     verbose = FALSE) %>% 
       FindClusters(algorithm = 4, 
                    method = "igraph", 
                    resolution = 0.5, 
                    random.seed = 312, 
                    verbose = FALSE)

Let’s visualize the results on our UMAP embedding. The clustering generally agrees with the celltype labels, though there is some overclustering in the ductal cells & underclustering in the mature endocrine celltypes.

Code

p0 <- Embeddings(seu, "umap") %>% 
      as.data.frame() %>% 
      magrittr::set_colnames(c("UMAP_1", "UMAP_2")) %>% 
      mutate(leiden = seu$seurat_clusters) %>% 
      ggplot(aes(x = UMAP_1, y = UMAP_2, color = leiden)) + 
      geom_point(size = 1.5, 
                 alpha = 0.75, 
                 stroke = 0) + 
      scale_color_manual(values = palette_cluster) + 
      labs(x = "UMAP 1", 
           y = "UMAP 2", 
           color = "Leiden") + 
      theme_scLANE(umap = TRUE) + 
      guide_umap()
p1 <- Embeddings(seu, "umap") %>% 
      as.data.frame() %>% 
      magrittr::set_colnames(c("UMAP_1", "UMAP_2")) %>% 
      mutate(celltype = seu$celltype) %>% 
      ggplot(aes(x = UMAP_1, y = UMAP_2, color = celltype)) + 
      geom_point(size = 1.5, 
                 alpha = 0.75, 
                 stroke = 0) + 
      scale_color_manual(values = palette_celltype) + 
      labs(x = "UMAP 1", 
           y = "UMAP 2", 
           color = "Celltype") + 
      theme_scLANE(umap = TRUE) + 
      guide_umap()
p2 <- (p0 / p1) +
      plot_layout(guides = "collect", axes = "collect")
p2

Figure 1: Cluster & celltype labels in UMAP space.

5 Trajectory inference

5.1 Pseudotime estimation

We’ll start by fitting a trajectory using the slingshot R package. We define cluster 5 as the starting cluster, since in this case we’re already aware of the dataset’s underlying biology. After generating the estimates for each cell, we rescale the ordering to be defined on \([0, 1]\). This has no effect on the trajectory DE results however, and is mostly an aesthetic choice.

Code

sling_res <- slingshot(Embeddings(seu, "umap"),
                       start.clus = "5",
                       clusterLabels = seu$seurat_clusters, 
                       approx_points = 500)
sling_pt <- slingPseudotime(sling_res) %>% 
            as.data.frame() %>% 
            magrittr::set_colnames(c("PT")) %>% 
            mutate(PT = (PT - min(PT)) / (max(PT) - min(PT)))
seu <- AddMetaData(seu, 
                   metadata = sling_pt, 
                   col.name = "sling_pt")

Let’s visualize the results on our UMAP embedding. They match what we would expect (knowing the biological background of the data), with ductal cells at the start of the process and endocrine celltypes such as alpha, beta, & delta cells at the end of it.

Code

p3 <- Embeddings(seu, "umap") %>% 
      as.data.frame() %>% 
      magrittr::set_colnames(c("UMAP_1", "UMAP_2")) %>% 
      mutate(PT = sling_pt$PT) %>% 
      ggplot(aes(x = UMAP_1, y = UMAP_2, color = PT)) + 
      geom_point(size = 1.5, 
                 alpha = 0.75, 
                 stroke = 0) + 
      labs(x = "UMAP 1", 
           y = "UMAP 2", 
           color = "Pseudotime") + 
      scale_color_gradientn(colors = palette_heatmap, 
                            labels = scales::label_number(accuracy = 0.01)) + 
      theme_scLANE(umap = TRUE)
p4 <- (p3 / p1) + 
      plot_layout(guides = "collect", axes = "collect")
p4

Figure 2: Estimated pseudotime in UMAP space.

5.2 Trajectory DE testing

Next, we prepare the primary inputs to scLANE: our Seurat object with the spliced counts set as the default assay, a dataframe containing our estimated pseudotime ordering, a vector of size factors to use as an offset for sequencing depth in each model, and a set of genes whose dynamics we want to model. scLANE parallelizes over genes in order to speed up the computation at the expense of using a little more memory. The models are fit using NB GLMs with optimal spline knots identified empirically, and differential expression is quantified using a likelihood ratio test of the fitted model vs. a constant (intercept-only) model. In practice, genes designated as HVGs are usually the best candidates for modeling, so we choose the top 3,000 HVGs as our input.

Note

The testing of the HVG set on its own is also justified by the reality that almost all trajectories are inferred using some sort of dimension-reduced space, and those embeddings are nearly universally generated using a set of HVGs. As such, genes not included in the HVG set actually have no direct relationship with the estimated trajectory, & it’s generally safe to exclude them from trajectory analyses.

Code

top3k_hvg <- HVFInfo(seu) %>% 
             arrange(desc(variance.standardized)) %>% 
             slice_head(n = 3000) %>% 
             rownames(.)
cell_offset <- createCellOffset(seu)
scLANE_models <- testDynamic(seu, 
                             pt = sling_pt, 
                             genes = top3k_hvg, 
                             size.factor.offset = cell_offset, 
                             n.cores = 6L, 
                             verbose = FALSE)

Registered S3 method overwritten by 'bit':
  method   from  
  print.ri gamlss

scLANE testing completed for 3000 genes across 1 lineage in 22.681 mins

Code

scLANE_res_tidy <- getResultsDE(scLANE_models)

After tidying the TDE results with getResultsDE(), we pull a sample of 6 genes from the results & display their test statistics. By default, any gene with an adjusted p-value less than 0.01 is predicted to be dynamic, though this threshold can be easily adjusted.

Code

select(scLANE_res_tidy, 
       Gene, 
       Test_Stat, 
       P_Val, 
       P_Val_Adj,
       Gene_Dynamic_Overall) %>% 
  mutate(Gene_Dynamic_Overall = if_else(Gene_Dynamic_Overall == 1, "Dynamic", "Static")) %>% 
  with_groups(Gene_Dynamic_Overall, 
              slice_sample, 
              n = 3) %>% 
  kableExtra::kbl(digits = 4, 
                  booktabs = TRUE, 
                  col.names = c("Gene", "LRT stat.", "P-value", "Adj. p-value", "Predicted gene status")) %>% 
  kableExtra::kable_classic(full_width = FALSE, "hover")

Gene	LRT stat.	P-value	Adj. p-value	Predicted gene status
Prc1	1002.5987	0.0000	0.0000	Dynamic
Ddx3y	34.2841	0.0000	0.0000	Dynamic
Ceacam10	31.5036	0.0000	0.0001	Dynamic
Lims2	2.4696	0.1161	1.0000	Static
Gm9801	NA	NA	NA	Static
Jakmip3	15.8731	0.0004	0.1152	Static

Table 1: TDE test results from scLANE.

Next, we can use the plotModels() function to visualize the fitted models from scLANE and compare them to other modeling methods. The gene Neurog3 is strongly associated with epithelial cell differentiation, and indeed we see a clear, nonlinear transcriptional dynamic across pseudotime for that gene. A traditional GLM fails to capture that nonlinearity, and while a GAM fits the trend smoothly, it seems to overfit the dynamics near the boundaries of pseudotime - a known issue with additive models. Only the scLANE model accurately models the rapid upregulation and equally swift downregulation of the transcription factor neurogenin-3 (Neurog3) over pseudotime, in addition to identifying the cutpoint in pseudotime at which downregulation begins.

Code

p5 <- plotModels(scLANE_models, 
                 gene = "Neurog3", 
                 pt = sling_pt, 
                 expr.mat = seu, 
                 size.factor.offset = cell_offset, 
                 plot.glm = TRUE, 
                 plot.gam = TRUE) + 
      scale_color_manual(values = c("forestgreen"))
p5

Figure 3: Modeling framework comparison.

6 Downstream analysis

6.1 Gene dynamics plots

Using the getFittedValues() function allows us to generate predictions from the models we fit, which we then use to visualize the dynamics of a few genes that are known to be strongly associated with the differentiation of immature progenitors into mature endocrine phenotypes (source 1, source 2). For all four genes, the fitted models show knots chosen in the area of pseudotime around the pre-endocrine cells. This tells us that these driver genes are being upregulated in precursor celltypes & are driving differentiation into the mature celltypes such as alpha & beta cells, after which the genes are downregulated.

Code

p6 <- getFittedValues(scLANE_models, 
                      genes = c("Chga", "Chgb", "Fev", "Cck"), 
                      pt = sling_pt, 
                      expr.mat = seu, 
                      size.factor.offset = cell_offset, 
                      cell.meta.data = select(seu@meta.data, celltype, celltype_coarse)) %>% 
      ggplot(aes(x = pt, y = rna_log1p)) + 
      facet_wrap(~gene, 
                 ncol = 2, 
                 scales = "free_y") + 
      geom_point(aes(color = celltype), 
                 size = 2, 
                 alpha = 0.75, 
                 stroke = 0) + 
      geom_vline(data = data.frame(gene = "Chga", knot = unique(scLANE_models$Chga$Lineage_A$MARGE_Slope_Data$Breakpoint)), 
                 mapping = aes(xintercept = knot), 
                 linetype = "dashed", 
                 color = "grey20") + 
      geom_vline(data = data.frame(gene = "Chgb", knot = unique(scLANE_models$Chgb$Lineage_A$MARGE_Slope_Data$Breakpoint)), 
                 mapping = aes(xintercept = knot), 
                 linetype = "dashed", 
                 color = "grey20") + 
      geom_vline(data = data.frame(gene = "Cck", knot = unique(scLANE_models$Cck$Lineage_A$MARGE_Slope_Data$Breakpoint)), 
                 mapping = aes(xintercept = knot), 
                 linetype = "dashed", 
                 color = "grey20") + 
      geom_vline(data = data.frame(gene = "Fev", knot = unique(scLANE_models$Fev$Lineage_A$MARGE_Slope_Data$Breakpoint)), 
                 mapping = aes(xintercept = knot), 
                 linetype = "dashed", 
                 color = "grey20") + 
      geom_ribbon(aes(ymin = scLANE_ci_ll_log1p, ymax = scLANE_ci_ul_log1p), 
                  linewidth = 0, 
                  fill = "grey70", 
                  alpha = 0.9) + 
      geom_line(aes(y = scLANE_pred_log1p), 
                color = "black", 
                linewidth = 0.75) + 
      scale_x_continuous(labels = scales::label_number(accuracy = 0.01)) + 
      scale_color_manual(values = palette_celltype) + 
      labs(x = "Pseudotime", 
           y = "Normalized Expression") + 
      theme_scLANE() + 
      theme(legend.title = element_blank(), 
            strip.text.x = element_text(face = "italic")) + 
      guide_umap()
p6

Figure 4: scLANE models of endocrinogenesis drivers.

On the other hand, if we use additive models the “peak” of expression is placed among the mature endocrine celltypes - which doesn’t make biological sense if we know that these genes are driving that process of differentiation. This can of course be tweaked by changing the degree or degrees of freedom of the underlying basis spline, but choosing a “best” value for those hyperparameters can be difficult, whereas scLANE identifies optimal parameters internally by default. In addition, the knots chosen by scLANE for each gene can be informative with respect to the underlying biology, whereas the knots from GAMs are evenly spaced at quantiles & carry no biological interpretation.

Code

p7 <- getFittedValues(scLANE_models, 
                      genes = c("Chga", "Chgb", "Fev", "Cck"), 
                      pt = sling_pt, 
                      expr.mat = seu, 
                      size.factor.offset = cell_offset, 
                      cell.meta.data = select(seu@meta.data, celltype, celltype_coarse)) %>% 
      mutate(rna_raw = rna / size_factor, .before = 7) %>% 
      with_groups(gene, 
                  mutate, 
                  GAM_fitted_link = predict(nbGAM(expr = rna_raw, 
                                                  pt = sling_pt, 
                                                  Y.offset = cell_offset, 
                                                  spline.df = 3)), 
                  GAM_se_link = predict(nbGAM(expr = rna_raw, 
                                              pt = sling_pt, 
                                              Y.offset = cell_offset, 
                                              spline.df = 3), se.fit = TRUE)[[2]]) %>% 
      mutate(GAM_pred = exp(GAM_fitted_link) * cell_offset, 
             GAM_ci_ll = exp(GAM_fitted_link - qnorm(0.975) * GAM_se_link) * cell_offset, 
             GAM_ci_ul = exp(GAM_fitted_link + qnorm(0.975) * GAM_se_link) * cell_offset, 
             GAM_pred_log1p = log1p(GAM_pred), 
             GAM_ci_ll_log1p = log1p(GAM_ci_ll), 
             GAM_ci_ul_log1p = log1p(GAM_ci_ul)) %>% 
      ggplot(aes(x = pt, y = rna_log1p)) + 
      facet_wrap(~gene, 
                 ncol = 2, 
                 scales = "free_y") + 
      geom_point(aes(color = celltype), 
                 size = 2, 
                 alpha = 0.75, 
                 stroke = 0) + 
      geom_ribbon(aes(ymin = GAM_ci_ll_log1p, ymax = GAM_ci_ul_log1p), 
                  linewidth = 0, 
                  fill = "grey70", 
                  alpha = 0.9) + 
      geom_line(aes(y = GAM_pred_log1p), 
                color = "black", 
                linewidth = 0.75) + 
      scale_x_continuous(labels = scales::label_number(accuracy = 0.01)) + 
      scale_color_manual(values = palette_celltype) + 
      labs(x = "Pseudotime", 
           y = "Normalized Expression") + 
      theme_scLANE() + 
      theme(legend.title = element_blank(), 
            strip.text.x = element_text(face = "italic")) + 
      guide_umap()
p7

Figure 5: Additive models of endocrinogenesis drivers.

6.2 Distribution of knot locations

Let’s take a broader view of the dataset by examining the distribution of adaptively chosen knots from our models. We limit the analysis to the set of 2452 genes determined to be dynamic.

Code

dyn_genes <- filter(scLANE_res_tidy, Gene_Dynamic_Overall == 1) %>% 
             pull(Gene)
knot_df <- getKnotDist(scLANE_models, dyn_genes)

We’ll plot a histogram of the knot values along with a ridgeplot of the pseudotime distribution for each celltype. We see that a large number of the selected knots are placed at the beginning of the trajectory, around where the ductal cells transition into endocrine progenitors. A smaller set of knots is placed about halfway through the trajectory, which we’ve annotated as the point at which pre-endocrine cells begin differentiating into mature endocrine phenotypes.

Code

p8 <- ggplot(knot_df, aes(x = knot)) + 
      geom_histogram(aes(y = after_stat(density)), 
                     color = "black", 
                     fill = "white", 
                     linewidth = 0.5) + 
      geom_density(fill = "deepskyblue3", 
                   alpha = 0.5, 
                   color = "deepskyblue4", 
                   linewidth = 1) + 
      scale_x_continuous(limits = c(0, 1), labels = scales::label_number(accuracy = 0.01)) + 
      labs(x = "Pseudotime") + 
      theme_scLANE() + 
      theme(axis.ticks.y = element_blank(), 
            axis.title.y = element_blank(), 
            axis.text.y = element_blank())
p9 <- data.frame(celltype = seu$celltype, 
                 pt = seu$sling_pt) %>% 
      ggplot(aes(x = pt, y = celltype, fill = celltype, color = celltype)) + 
      ggridges::geom_density_ridges(alpha = 0.75, size = 1, scale = 0.95) + 
      scale_x_continuous(labels = scales::label_number(accuracy = 0.01), limits = c(0, 1)) + 
      scale_fill_manual(values = palette_celltype) + 
      scale_color_manual(values = palette_celltype) + 
      labs(x = "Pseudotime") + 
      theme_scLANE() + 
      theme(axis.title.y = element_blank(), 
            legend.title = element_blank()) + 
      guide_umap()
p10 <- (p8 / p9) + 
       plot_layout(heights = c(1/4, 3/4), axes = "collect")
p10

Figure 6: Distribution of adaptively-chosen knots from scLANE.

6.3 Dynamic gene clustering

We can extract a matrix of fitted dynamics using smoothedCountsMatrix(). Next, the embedGenes() function reduces dimensionality with PCA, clusters the genes using the Leiden algorithm, & embeds the genes in two dimensions with UMAP.

Code

smoothed_counts <- smoothedCountsMatrix(scLANE_models, 
                                        pt = sling_pt, 
                                        genes = dyn_genes, 
                                        size.factor.offset = cell_offset)
gene_embed <- embedGenes(log1p(smoothed_counts$Lineage_A), resolution.param = 0.2)

First we’ll visualize the gene clusters on the first two PCs.

Code

p11 <- ggplot(gene_embed, aes(x = pc1, y = pc2, color = leiden)) + 
       geom_point(size = 2, 
                  alpha = 0.75, 
                  stroke = 0) + 
       labs(x = "PC 1", 
            y = "PC 2", 
            color = "Leiden Cluster") +
       paletteer::scale_color_paletteer_d("ggsci::default_igv") + 
       theme_scLANE(umap = TRUE) + 
       guide_umap()
p11

Figure 7: Unsupervised clustering of genes in PCA space.

The UMAP embedding shows that even with the relatively small number of genes, clear patterns are visible.

Code

p12 <- ggplot(gene_embed, aes(x = umap1, y = umap2, color = leiden)) + 
       geom_point(size = 2, 
                  alpha = 0.75, 
                  stroke = 0) + 
       labs(x = "UMAP 1", 
            y = "UMAP 2", 
            color = "Gene Cluster") +
       paletteer::scale_color_paletteer_d("ggsci::default_igv") + 
       theme_scLANE(umap = TRUE) + 
       guide_umap()
p12

Figure 8: Unsupervised clustering of genes in UMAP space.

6.4 Expression cascades

We can also plot a heatmap of the dynamic genes; this requires a bit of setup, for which we’ll use the ComplexHeatmap package. We scale each gene, and clip values to be on \([-6, 6]\). The columns (cells) of the heatmap are ordered by estimated pseudotime, and the rows (genes) are ordered by expression peak.

Code

col_anno_df <- select(seu@meta.data, 
                      cell_name, 
                      celltype, 
                      sling_pt) %>% 
               mutate(celltype = as.factor(celltype)) %>% 
               arrange(sling_pt)
gene_order <- sortGenesHeatmap(smoothed_counts$Lineage_A, pt.vec = sling_pt$PT)
heatmap_mat <- t(scale(smoothed_counts$Lineage_A))
heatmap_mat[heatmap_mat > 6] <- 6
heatmap_mat[heatmap_mat < -6] <- -6
colnames(heatmap_mat) <- seu$cell_name
heatmap_mat <- heatmap_mat[, col_anno_df$cell_name]
heatmap_mat <- heatmap_mat[gene_order, ]
palette_celltype_hm <- as.character(palette_celltype[1:length(unique(seu$celltype))])
names(palette_celltype_hm) <- levels(col_anno_df$celltype)
col_anno <- HeatmapAnnotation(Celltype = col_anno_df$celltype, 
                              Pseudotime = col_anno_df$sling_pt, 
                              col = list(Celltype = palette_celltype_hm, 
                                         Pseudotime = circlize::colorRamp2(seq(0, 1, by = 0.25), palette_heatmap)),
                              show_legend = TRUE, 
                              show_annotation_name = FALSE, 
                              gap = unit(1, "mm"), 
                              border = TRUE)
palette_cluster_hm <- as.character(paletteer::paletteer_d("ggsci::default_igv")[1:length(unique(gene_embed$leiden))])
names(palette_cluster_hm) <- as.character(unique(gene_embed$leiden))
row_anno <- HeatmapAnnotation(Cluster = as.factor(gene_embed$leiden), 
                              col = list(Cluster = palette_cluster_hm), 
                              show_legend = TRUE, 
                              show_annotation_name = FALSE, 
                              annotation_legend_param = list(title = "Gene\nCluster"), 
                              gap = unit(1, "mm"), 
                              border = TRUE, 
                              which = "row")

The heatmap shows clear dynamic patterns across pseudotime; these patterns are often referred to as expression cascades, and represent periodic up- and down-regulation of different gene programs during the course of the underlying biological process.

Code

Heatmap(matrix = heatmap_mat, 
        name = "Spliced\nmRNA", 
        col = circlize::colorRamp2(colors = viridis::inferno(50), 
                                   breaks = seq(min(heatmap_mat), max(heatmap_mat), length.out = 50)), 
        cluster_columns = FALSE,
        width = 12, 
        height = 6, 
        column_title = "",
        cluster_rows = FALSE,
        top_annotation = col_anno, 
        left_annotation = row_anno, 
        show_column_names = FALSE, 
        show_row_names = FALSE, 
        use_raster = TRUE,
        raster_by_magick = TRUE, 
        raster_quality = 5)

Figure 9: Expression cascades of dynamic genes.

6.5 Gene programs

Using our gene clusters & the gprofiler2 package, we run an enrichment analysis against the biological process (BP) set of gene ontologies. We make sure to order the genes in each cluster by their test statistics by joining to the results table from scLANE.

Code

gene_clust_list <- purrr::map(unique(gene_embed$leiden), \(x) { 
  filter(gene_embed, leiden == x) %>% 
  inner_join(scLANE_res_tidy, by = c("gene" = "Gene")) %>% 
  arrange(desc(Test_Stat)) %>% 
  pull(gene)
}) 
names(gene_clust_list) <- paste0("Leiden_", unique(gene_embed$leiden))
enrich_res <- gprofiler2::gost(gene_clust_list, 
                               organism = "mmusculus", 
                               ordered_query = TRUE, 
                               multi_query = FALSE, 
                               sources = "GO:BP", 
                               significant = TRUE)

A look at the top 5 most-significant GO terms for each gene cluster reveals heterogeneous functionalities across groups of genes.

Code

mutate(enrich_res$result, 
       query = gsub("Leiden_", "", query)) %>% 
  rename(cluster = query) %>% 
  with_groups(cluster, 
              slice_head,
              n = 5) %>% 
  select(cluster, 
         term_name, 
         p_value, 
         term_size, 
         query_size, 
         intersection_size, 
         term_id) %>% 
  kableExtra::kbl(digits = 3, 
                  booktabs = TRUE, 
                  col.names = c("Gene Cluster", "Term Name", "Adj. P-value", "Term Size", 
                                "Query Size", "Intersection Size", "Term ID")) %>% 
  kableExtra::kable_classic(c("hover"), full_width = FALSE)

Gene Cluster	Term Name	Term Size	Query Size	Intersection Size	Term ID
0	tube development	1181	335	66	GO:0035295
0	anatomical structure development	6269	326	157	GO:0048856
0	multicellular organism development	4879	329	136	GO:0007275
0	developmental process	6854	326	164	GO:0032502
0	tube morphogenesis	927	335	55	GO:0035239
1	signal release	587	178	34	GO:0023061
1	export from cell	982	316	55	GO:0140352
1	secretion by cell	913	289	50	GO:0032940
1	peptide hormone secretion	316	214	28	GO:0030072
1	hormone transport	405	316	36	GO:0009914
2	nitrogen compound transport	1959	160	46	GO:0071705
2	regulation of secretion by cell	667	156	27	GO:1903530
2	regulation of hormone levels	639	156	26	GO:0010817
2	secretion by cell	913	156	30	GO:0032940
2	regulation of transport	1749	274	56	GO:0051049
3	organonitrogen compound metabolic process	5744	523	220	GO:1901564
3	system development	4117	522	173	GO:0048731
3	positive regulation of biological process	6589	528	234	GO:0048518
3	positive regulation of cellular process	6078	528	218	GO:0048522
3	multicellular organism development	4879	523	187	GO:0007275
4	cell cycle	1808	303	173	GO:0007049
4	cell cycle process	1256	285	149	GO:0022402
4	mitotic cell cycle	852	323	125	GO:0000278
4	mitotic cell cycle process	718	285	112	GO:1903047
4	chromosome segregation	408	263	86	GO:0007059
5	multicellular organism development	4879	314	126	GO:0007275
5	system development	4117	314	112	GO:0048731
5	anatomical structure development	6269	307	141	GO:0048856
5	developmental process	6854	341	160	GO:0032502
5	animal organ development	3260	341	99	GO:0048513

Table 2: The top-5 biological process GO terms per cluster.

We can use the geneProgramScoring() function to add module scores for each gene cluster to our Seurat object.

Code

peptide_program <- filter(enrich_res$result, term_name == "peptide hormone secretion") %>% 
                   arrange(p_value) %>% 
                   slice_head(n = 1) %>% 
                   pull(query)
peptide_program_name <- gsub("Leiden_", "cluster_", peptide_program)
seu <- geneProgramScoring(seu, 
                          genes = gene_embed$gene, 
                          gene.clusters = gene_embed$leiden)

Visualizing the scores on our UMAP embedding shows us that the peptide program is highly-enriched only in mature endocrine cells. This makes sense biologically as mature endocrine celltypes’ primary roles are to produce peptides such as glucagon (alpha cells), insulin (beta cells), somatostatin (ductal cells), and pancreatic polypeptide (gamma cells).

Code

p13 <- Embeddings(seu, "umap") %>% 
       as.data.frame() %>% 
       magrittr::set_colnames(c("UMAP_1", "UMAP_2")) %>% 
       mutate(peptide_program_score = seu@meta.data[, peptide_program_name]) %>% 
       ggplot(aes(x = UMAP_1, y = UMAP_2, color = peptide_program_score)) + 
       geom_point(size = 1.5, alpha = 0.75, stroke = 0) + 
       labs(x = "UMAP 1", 
            y = "UMAP 2", 
            color = "Program Score") + 
       scale_color_gradientn(colors = palette_heatmap, 
                             labels = scales::label_number(accuracy = 0.1)) + 
       theme_scLANE(umap = TRUE) 
p14 <- (p13 / p1) + 
       plot_layout(guides = "collect", axes = "collect")
p14

Figure 10: Enrichment of peptide production gene program.

We can also visualize the trend in the peptide program scores over time, which confirms the biological conclusions we came to by inspecting the UMAP.

Code

p15 <- data.frame(PT = sling_pt$PT, 
                  peptide_program_score = seu@meta.data[, peptide_program_name], 
                  celltype = seu$celltype) %>% 
       ggplot(aes(x = PT, y = peptide_program_score, color = celltype)) + 
       geom_point(alpha = 0.75, 
                  stroke = 0, 
                  size = 2) + 
       geom_smooth(color = "black", 
                   method = "loess", 
                   linewidth = 0.75) + 
       scale_color_manual(values = palette_celltype) + 
       labs(x = "Pseudotime", y = "Peptide Program Score") + 
       theme_scLANE() + 
       theme(legend.title = element_blank()) + 
       guide_umap()
p15

Figure 11: Peptide production gene program scores over pseudotime.

Next, in order to identify which genes are “drivers” of a certain gene program, we can use the geneProgramDrivers() function to correlate normalized expression with program scores.

Code

program_drivers <- geneProgramDrivers(seu, 
                                      genes = gene_embed$gene, 
                                      gene.program = seu@meta.data[, peptide_program_name], 
                                      verbose = FALSE)

We display the top 10 most-correlated genes here. For example, carboxypeptidase E (Cpe) is involved in the production of insulin (source).

Code

slice_head(program_drivers, n = 10) %>% 
  kableExtra::kbl(digits = 3, 
                  booktabs = TRUE, 
                  row.names = FALSE,
                  col.names = c("Gene", "Correlation", "P-value", "Adj. P-value")) %>% 
  kableExtra::kable_classic(c("hover"), full_width = FALSE)

Gene	Correlation	P-value	Adj. P-value
Cpe	0.849	0	0
Gnas	0.824	0	0
Fam183b	0.824	0	0
Pcsk1n	0.810	0	0
Sparc	-0.774	0	0
Aplp1	0.767	0	0
Hmgn3	0.757	0	0
Isl1	0.757	0	0
1700086L19Rik	0.755	0	0
Spp1	-0.752	0	0

Table 3: The top-10 driver genes for the peptide production gene program.

Indeed, normalized expression of Cpe is high across all mature endocrine celltypes, with alpha and beta cells showing the highest overall mean expression.

Code

p16 <- data.frame(celltype = seu$celltype, 
                  rna = seu@assays$spliced@data["Cpe", ]) %>% 
       ggplot(aes(x = celltype, y = rna, color = celltype)) + 
       ggbeeswarm::geom_quasirandom(alpha = 0.75, 
                                    size = 2, 
                                    stroke = 0, 
                                    show.legend = FALSE) +
       scale_color_manual(values = palette_celltype) + 
       stat_summary(fun = "mean", 
                    geom = "point", 
                    color = "black",
                    size = 3) + 
       labs(y = "Normalized Expression") + 
       theme_scLANE() + 
       theme(axis.title.x = element_blank())
p16

Figure 12: Beeswarm plots of driver gene expression by celltype.

6.6 Dynamic gene enrichment

Lastly, we perform an enrichment analysis for our set of dynamic genes. We’ll focus on terms from the GO biological process (BP) set, as those are generally the easiest to interpret.

Code

dyn_gene_enrich <- enrichDynamicGenes(scLANE_res_tidy, species = "Mmusculus")
dyn_go_bp_terms <- filter(dyn_gene_enrich$result, 
                          source == "GO:BP", 
                          p_value < 0.05)

Overall, there are 1341 unique significantly-enriched GO:BP terms for our trajectory.

Code

select(dyn_go_bp_terms, term_name, p_value, source) %>% 
  slice_sample(n = 10) %>% 
  kableExtra::kbl(digits = 3, 
                  booktabs = TRUE, 
                  row.names = FALSE,
                  col.names = c("Term", "P-value", "Source")) %>% 
  kableExtra::kable_classic(c("hover"), full_width = FALSE)

Term	P-value	Source
multi-organism reproductive process	0.000	GO:BP
response to electrical stimulus	0.006	GO:BP
regulation of nuclear division	0.000	GO:BP
meiotic nuclear division	0.000	GO:BP
response to transforming growth factor beta	0.000	GO:BP
regulation of fibroblast proliferation	0.000	GO:BP
cellular response to carbohydrate stimulus	0.000	GO:BP
double-strand break repair via homologous recombination	0.018	GO:BP
carbohydrate homeostasis	0.000	GO:BP
positive regulation of calcium ion transport	0.046	GO:BP

Table 4: Random sample of the biological processes enriched for the dynamic gene set.

7 Conclusions

Hopefully this vignette has been a useful introduction to running the scLANE software and using its outputs to help better understand biology at single-cell resolution. If you have questions about how the models work or are interpreted, software issues, or simply want to compare results feel free to open an issue on the GitHub repository or reach out via email to j.leary@ufl.edu.

8 References

Bastidas-Ponce, Aimée et al. Comprehensive single cell mRNA profiling reveals a detailed roadmap for pancreatic endocrinogenesis. Development (2019).
Street, Kelly et al. Slingshot: cell lineage and pseudotime inference for single-cell transcriptomics. BMC Genomics (2018).
Stoklosa, Jakub & David Warton. A generalized estimating equation approach to multivariate adaptive regression splines. Journal of Computational and Graphical Statistics (2018).

9 Session Info

Code

sessioninfo::session_info()

─ Session info ───────────────────────────────────────────────────────────────
 setting  value
 version  R version 4.3.2 (2023-10-31)
 os       macOS Sonoma 14.3.1
 system   x86_64, darwin20
 ui       X11
 language (EN)
 collate  en_US.UTF-8
 ctype    en_US.UTF-8
 tz       America/New_York
 date     2024-04-16
 pandoc   3.1.12.3 @ /usr/local/bin/ (via rmarkdown)

─ Packages ───────────────────────────────────────────────────────────────────
 package              * version    date (UTC) lib source
 abind                  1.4-5      2016-07-21 [1] CRAN (R 4.3.0)
 backports              1.4.1      2021-12-13 [1] CRAN (R 4.3.0)
 beeswarm               0.4.0      2021-06-01 [1] CRAN (R 4.3.0)
 bigassertr             0.1.6      2023-01-10 [1] CRAN (R 4.3.0)
 bigparallelr           0.3.2      2021-10-02 [1] CRAN (R 4.3.0)
 bigstatsr              1.5.12     2022-10-14 [1] CRAN (R 4.3.0)
 Biobase              * 2.62.0     2023-10-24 [1] Bioconductor
 BiocGenerics         * 0.48.1     2023-11-01 [1] Bioconductor
 BiocNeighbors          1.20.1     2023-12-18 [1] Bioconductor 3.18 (R 4.3.2)
 BiocParallel           1.36.0     2023-10-24 [1] Bioconductor
 bit                    4.0.5      2022-11-15 [1] CRAN (R 4.3.0)
 bitops                 1.0-7      2021-04-24 [1] CRAN (R 4.3.0)
 bluster                1.12.0     2023-10-24 [1] Bioconductor
 boot                   1.3-28.1   2022-11-22 [1] CRAN (R 4.3.2)
 broom                  1.0.5      2023-06-09 [1] CRAN (R 4.3.0)
 broom.mixed            0.2.9.4    2022-04-17 [1] CRAN (R 4.3.0)
 circlize               0.4.15     2022-05-10 [1] CRAN (R 4.3.0)
 cli                    3.6.2      2023-12-11 [1] CRAN (R 4.3.0)
 clue                   0.3-65     2023-09-23 [1] CRAN (R 4.3.0)
 cluster                2.1.6      2023-12-01 [1] CRAN (R 4.3.0)
 coda                   0.19-4     2020-09-30 [1] CRAN (R 4.3.0)
 codetools              0.2-19     2023-02-01 [1] CRAN (R 4.3.2)
 colorspace             2.1-0      2023-01-23 [1] CRAN (R 4.3.0)
 ComplexHeatmap       * 2.18.0     2023-10-24 [1] Bioconductor
 cowplot                1.1.2      2023-12-15 [1] CRAN (R 4.3.0)
 crayon                 1.5.2      2022-09-29 [1] CRAN (R 4.3.0)
 data.table             1.14.10    2023-12-08 [1] CRAN (R 4.3.0)
 DelayedArray           0.28.0     2023-10-24 [1] Bioconductor
 DelayedMatrixStats     1.24.0     2023-10-24 [1] Bioconductor
 deldir                 2.0-2      2023-11-23 [1] CRAN (R 4.3.0)
 digest                 0.6.33     2023-07-07 [1] CRAN (R 4.3.0)
 doParallel             1.0.17     2022-02-07 [1] CRAN (R 4.3.0)
 doSNOW                 1.0.20     2022-02-04 [1] CRAN (R 4.3.0)
 dotCall64              1.1-1      2023-11-28 [1] CRAN (R 4.3.0)
 dplyr                * 1.1.4      2023-11-17 [1] CRAN (R 4.3.0)
 ellipsis               0.3.2      2021-04-29 [1] CRAN (R 4.3.0)
 emmeans                1.10.0     2024-01-23 [1] CRAN (R 4.3.2)
 estimability           1.4.1      2022-08-05 [1] CRAN (R 4.3.0)
 evaluate               0.23       2023-11-01 [1] CRAN (R 4.3.0)
 fansi                  1.0.6      2023-12-08 [1] CRAN (R 4.3.0)
 farver                 2.1.1      2022-07-06 [1] CRAN (R 4.3.0)
 fastDummies            1.7.3      2023-07-06 [1] CRAN (R 4.3.0)
 fastmap                1.1.1      2023-02-24 [1] CRAN (R 4.3.0)
 ff                     4.0.9      2023-01-25 [1] CRAN (R 4.3.0)
 fitdistrplus           1.1-11     2023-04-25 [1] CRAN (R 4.3.0)
 flock                  0.7        2016-11-12 [1] CRAN (R 4.3.0)
 forcats                1.0.0      2023-01-29 [1] CRAN (R 4.3.0)
 foreach                1.5.2      2022-02-02 [1] CRAN (R 4.3.0)
 furrr                  0.3.1      2022-08-15 [1] CRAN (R 4.3.0)
 future                 1.33.1     2023-12-22 [1] CRAN (R 4.3.0)
 future.apply           1.11.1     2023-12-21 [1] CRAN (R 4.3.0)
 gamlss                 5.4-20     2023-10-04 [1] CRAN (R 4.3.0)
 gamlss.data            6.0-2      2021-11-07 [1] CRAN (R 4.3.0)
 gamlss.dist            6.1-1      2023-08-23 [1] CRAN (R 4.3.0)
 geeM                   0.10.1     2018-06-18 [1] CRAN (R 4.3.0)
 generics               0.1.3      2022-07-05 [1] CRAN (R 4.3.0)
 GenomeInfoDb         * 1.38.5     2023-12-28 [1] Bioconductor 3.18 (R 4.3.2)
 GenomeInfoDbData       1.2.11     2023-12-22 [1] Bioconductor
 GenomicRanges        * 1.54.1     2023-10-29 [1] Bioconductor
 GetoptLong             1.0.5      2020-12-15 [1] CRAN (R 4.3.0)
 ggbeeswarm             0.7.2      2023-04-29 [1] CRAN (R 4.3.0)
 ggh4x                  0.2.8      2024-01-23 [1] CRAN (R 4.3.2)
 ggplot2              * 3.4.4      2023-10-12 [1] CRAN (R 4.3.0)
 ggrepel                0.9.5      2024-01-10 [1] CRAN (R 4.3.0)
 ggridges               0.5.5      2023-12-15 [1] CRAN (R 4.3.0)
 glm2                 * 1.2.1      2018-08-11 [1] CRAN (R 4.3.0)
 glmmTMB                1.1.8      2023-10-07 [1] CRAN (R 4.3.0)
 GlobalOptions          0.1.2      2020-06-10 [1] CRAN (R 4.3.0)
 globals                0.16.2     2022-11-21 [1] CRAN (R 4.3.0)
 glue                   1.6.2      2022-02-24 [1] CRAN (R 4.3.0)
 goftest                1.2-3      2021-10-07 [1] CRAN (R 4.3.0)
 gprofiler2             0.2.2      2023-06-14 [1] CRAN (R 4.3.0)
 gridExtra              2.3        2017-09-09 [1] CRAN (R 4.3.0)
 gtable                 0.3.4      2023-08-21 [1] CRAN (R 4.3.0)
 highr                  0.10       2022-12-22 [1] CRAN (R 4.3.0)
 htmltools              0.5.7      2023-11-03 [1] CRAN (R 4.3.0)
 htmlwidgets            1.6.4      2023-12-06 [1] CRAN (R 4.3.0)
 httpuv                 1.6.13     2023-12-06 [1] CRAN (R 4.3.0)
 httr                   1.4.7      2023-08-15 [1] CRAN (R 4.3.0)
 ica                    1.0-3      2022-07-08 [1] CRAN (R 4.3.0)
 igraph                 2.0.1.1    2024-01-30 [1] CRAN (R 4.3.2)
 IRanges              * 2.36.0     2023-10-24 [1] Bioconductor
 irlba                  2.3.5.1    2022-10-03 [1] CRAN (R 4.3.0)
 iterators              1.0.14     2022-02-05 [1] CRAN (R 4.3.0)
 jsonlite               1.8.8      2023-12-04 [1] CRAN (R 4.3.0)
 kableExtra             1.3.4      2021-02-20 [1] CRAN (R 4.3.0)
 KernSmooth             2.23-22    2023-07-10 [1] CRAN (R 4.3.2)
 knitr                  1.45       2023-10-30 [1] CRAN (R 4.3.0)
 labeling               0.4.3      2023-08-29 [1] CRAN (R 4.3.0)
 later                  1.3.2      2023-12-06 [1] CRAN (R 4.3.0)
 lattice                0.22-5     2023-10-24 [1] CRAN (R 4.3.0)
 lazyeval               0.2.2      2019-03-15 [1] CRAN (R 4.3.0)
 leiden                 0.4.3.1    2023-11-17 [1] CRAN (R 4.3.0)
 lifecycle              1.0.4      2023-11-07 [1] CRAN (R 4.3.0)
 listenv                0.9.0      2022-12-16 [1] CRAN (R 4.3.0)
 lme4                   1.1-35.1   2023-11-05 [1] CRAN (R 4.3.0)
 lmtest                 0.9-40     2022-03-21 [1] CRAN (R 4.3.0)
 magick                 2.8.2      2023-12-20 [1] CRAN (R 4.3.0)
 magrittr             * 2.0.3      2022-03-30 [1] CRAN (R 4.3.0)
 MASS                   7.3-60     2023-05-04 [1] CRAN (R 4.3.0)
 Matrix                 1.6-5      2024-01-11 [1] CRAN (R 4.3.0)
 MatrixGenerics       * 1.14.0     2023-10-24 [1] Bioconductor
 matrixStats          * 1.2.0      2023-12-11 [1] CRAN (R 4.3.0)
 mgcv                   1.9-1      2023-12-21 [1] CRAN (R 4.3.0)
 mime                   0.12       2021-09-28 [1] CRAN (R 4.3.0)
 miniUI                 0.1.1.1    2018-05-18 [1] CRAN (R 4.3.0)
 minqa                  1.2.6      2023-09-11 [1] CRAN (R 4.3.0)
 munsell                0.5.0      2018-06-12 [1] CRAN (R 4.3.0)
 mvtnorm                1.2-4      2023-11-27 [1] CRAN (R 4.3.0)
 nlme                   3.1-164    2023-11-27 [1] CRAN (R 4.3.0)
 nloptr                 2.0.3      2022-05-26 [1] CRAN (R 4.3.0)
 numDeriv               2016.8-1.1 2019-06-06 [1] CRAN (R 4.3.0)
 paletteer              1.6.0      2024-01-21 [1] CRAN (R 4.3.0)
 parallelly             1.36.0     2023-05-26 [1] CRAN (R 4.3.0)
 patchwork            * 1.2.0      2024-01-08 [1] CRAN (R 4.3.0)
 pbapply                1.7-2      2023-06-27 [1] CRAN (R 4.3.0)
 pillar                 1.9.0      2023-03-22 [1] CRAN (R 4.3.0)
 pkgconfig              2.0.3      2019-09-22 [1] CRAN (R 4.3.0)
 plotly                 4.10.4     2024-01-13 [1] CRAN (R 4.3.0)
 plyr                   1.8.9      2023-10-02 [1] CRAN (R 4.3.0)
 png                    0.1-8      2022-11-29 [1] CRAN (R 4.3.0)
 polyclip               1.10-6     2023-09-27 [1] CRAN (R 4.3.0)
 princurve            * 2.1.6      2021-01-18 [1] CRAN (R 4.3.0)
 prismatic              1.1.1      2022-08-15 [1] CRAN (R 4.3.0)
 progressr              0.14.0     2023-08-10 [1] CRAN (R 4.3.0)
 promises               1.2.1      2023-08-10 [1] CRAN (R 4.3.0)
 ps                     1.7.5      2023-04-18 [1] CRAN (R 4.3.0)
 purrr                  1.0.2      2023-08-10 [1] CRAN (R 4.3.0)
 R6                     2.5.1      2021-08-19 [1] CRAN (R 4.3.0)
 RANN                   2.6.1      2019-01-08 [1] CRAN (R 4.3.0)
 RColorBrewer           1.1-3      2022-04-03 [1] CRAN (R 4.3.0)
 Rcpp                   1.0.11     2023-07-06 [1] CRAN (R 4.3.0)
 RcppAnnoy              0.0.21     2023-07-02 [1] CRAN (R 4.3.0)
 RcppEigen              0.3.3.9.4  2023-11-02 [1] CRAN (R 4.3.0)
 RcppHNSW               0.5.0      2023-09-19 [1] CRAN (R 4.3.0)
 RCurl                  1.98-1.13  2023-11-02 [1] CRAN (R 4.3.0)
 rematch2               2.1.2      2020-05-01 [1] CRAN (R 4.3.0)
 reshape2               1.4.4      2020-04-09 [1] CRAN (R 4.3.0)
 reticulate           * 1.34.0     2023-10-12 [1] CRAN (R 4.3.0)
 rjson                  0.2.21     2022-01-09 [1] CRAN (R 4.3.0)
 rlang                  1.1.2      2023-11-04 [1] CRAN (R 4.3.0)
 rmarkdown              2.25       2023-09-18 [1] CRAN (R 4.3.0)
 rmio                   0.4.0      2022-02-17 [1] CRAN (R 4.3.0)
 ROCR                   1.0-11     2020-05-02 [1] CRAN (R 4.3.0)
 RSpectra               0.16-1     2022-04-24 [1] CRAN (R 4.3.0)
 rstudioapi             0.15.0     2023-07-07 [1] CRAN (R 4.3.0)
 Rtsne                  0.17       2023-12-07 [1] CRAN (R 4.3.0)
 rvest                  1.0.3      2022-08-19 [1] CRAN (R 4.3.0)
 S4Arrays               1.2.0      2023-10-24 [1] Bioconductor
 S4Vectors            * 0.40.2     2023-11-23 [1] Bioconductor
 scales                 1.3.0      2023-11-28 [1] CRAN (R 4.3.0)
 scattermore            1.2        2023-06-12 [1] CRAN (R 4.3.0)
 scLANE               * 0.7.9      2024-04-13 [1] Bioconductor
 sctransform            0.4.1      2023-10-19 [1] CRAN (R 4.3.0)
 sessioninfo            1.2.2      2021-12-06 [1] CRAN (R 4.3.0)
 Seurat               * 5.0.1      2023-11-17 [1] CRAN (R 4.3.0)
 SeuratObject         * 5.0.1      2023-11-17 [1] CRAN (R 4.3.0)
 shape                  1.4.6      2021-05-19 [1] CRAN (R 4.3.0)
 shiny                  1.8.0      2023-11-17 [1] CRAN (R 4.3.0)
 SingleCellExperiment * 1.24.0     2023-10-24 [1] Bioconductor
 slingshot            * 2.10.0     2023-10-24 [1] Bioconductor
 snow                   0.4-4      2021-10-27 [1] CRAN (R 4.3.0)
 sp                   * 2.1-2      2023-11-26 [1] CRAN (R 4.3.0)
 spam                   2.10-0     2023-10-23 [1] CRAN (R 4.3.0)
 SparseArray            1.2.3      2023-12-25 [1] Bioconductor 3.18 (R 4.3.2)
 sparseMatrixStats      1.14.0     2023-10-24 [1] Bioconductor
 spatstat.data          3.0-3      2023-10-24 [1] CRAN (R 4.3.0)
 spatstat.explore       3.2-5      2023-10-22 [1] CRAN (R 4.3.0)
 spatstat.geom          3.2-7      2023-10-20 [1] CRAN (R 4.3.0)
 spatstat.random        3.2-2      2023-11-29 [1] CRAN (R 4.3.0)
 spatstat.sparse        3.0-3      2023-10-24 [1] CRAN (R 4.3.0)
 spatstat.utils         3.0-4      2023-10-24 [1] CRAN (R 4.3.0)
 stringi                1.8.3      2023-12-11 [1] CRAN (R 4.3.0)
 stringr                1.5.1      2023-11-14 [1] CRAN (R 4.3.0)
 SummarizedExperiment * 1.32.0     2023-10-24 [1] Bioconductor
 survival               3.5-7      2023-08-14 [1] CRAN (R 4.3.2)
 svglite                2.1.3      2023-12-08 [1] CRAN (R 4.3.0)
 systemfonts            1.0.5      2023-10-09 [1] CRAN (R 4.3.0)
 tensor                 1.5        2012-05-05 [1] CRAN (R 4.3.0)
 tibble                 3.2.1      2023-03-20 [1] CRAN (R 4.3.0)
 tidyr                  1.3.1      2024-01-24 [1] CRAN (R 4.3.2)
 tidyselect             1.2.0      2022-10-10 [1] CRAN (R 4.3.0)
 TMB                    1.9.11     2024-04-03 [1] CRAN (R 4.3.2)
 TrajectoryUtils      * 1.10.0     2023-10-24 [1] Bioconductor
 UCell                  2.6.2      2023-11-06 [1] Bioconductor
 utf8                   1.2.4      2023-10-22 [1] CRAN (R 4.3.0)
 uwot                   0.1.16     2023-06-29 [1] CRAN (R 4.3.0)
 vctrs                  0.6.5      2023-12-01 [1] CRAN (R 4.3.0)
 vipor                  0.4.7      2023-12-18 [1] CRAN (R 4.3.0)
 viridis                0.6.4      2023-07-22 [1] CRAN (R 4.3.0)
 viridisLite            0.4.2      2023-05-02 [1] CRAN (R 4.3.0)
 webshot                0.5.5      2023-06-26 [1] CRAN (R 4.3.0)
 withr                  2.5.2      2023-10-30 [1] CRAN (R 4.3.0)
 xfun                   0.41       2023-11-01 [1] CRAN (R 4.3.0)
 xml2                   1.3.6      2023-12-04 [1] CRAN (R 4.3.0)
 xtable                 1.8-4      2019-04-21 [1] CRAN (R 4.3.0)
 XVector                0.42.0     2023-10-24 [1] Bioconductor
 yaml                   2.3.8      2023-12-11 [1] CRAN (R 4.3.0)
 zlibbioc               1.48.0     2023-10-24 [1] Bioconductor
 zoo                    1.8-12     2023-04-13 [1] CRAN (R 4.3.0)

 [1] /Library/Frameworks/R.framework/Versions/4.3-x86_64/Resources/library

─ Python configuration ───────────────────────────────────────────────────────
 python:         /Users/jack/Desktop/PhD/Research/Python_Envs/personal_site/bin/python
 libpython:      /usr/local/opt/python@3.11/Frameworks/Python.framework/Versions/3.11/lib/python3.11/config-3.11-darwin/libpython3.11.dylib
 pythonhome:     /Users/jack/Desktop/PhD/Research/Python_Envs/personal_site:/Users/jack/Desktop/PhD/Research/Python_Envs/personal_site
 version:        3.11.6 (main, Nov  2 2023, 04:52:24) [Clang 14.0.3 (clang-1403.0.22.14.1)]
 numpy:          /Users/jack/Desktop/PhD/Research/Python_Envs/personal_site/lib/python3.11/site-packages/numpy
 numpy_version:  1.26.3
 
 NOTE: Python version was forced by use_python() function

──────────────────────────────────────────────────────────────────────────────