Investigating Transcription Factor Activity During Neurogenesis

Author

Affiliation

Jack R. Leary

University of Florida

Published

April 16, 2024

1 Introduction

2 Libraries

Code

library(dplyr)                 # data manipulation
library(Seurat)                # scRNA-seq tools
library(scLANE)                # trajectory DE
library(ggplot2)               # pretty plots
library(slingshot)             # pseudotime estimation
library(patchwork)             # plot alignment
library(SingleCellExperiment)  # scRNA-seq data structures

3 Color palettes

Code

palette_celltype <- paletteer::paletteer_d("ggsci::category10_d3")
palette_cluster <- paletteer::paletteer_d("ggsci::nrc_npg")
palette_week <- paletteer::paletteer_d("ggsci::default_igv")
palette_heatmap <- paletteer::paletteer_d("MetBrewer::Hiroshige", direction = -1)

4 Data

We begin by reading in the data from Zhong et al (2018), which contains 2,394 cells from developing human prefrontal cortex (PFC) samples. The counts and metadata are handily available in the scRNAseq package.

Code

sce_brain <- scRNAseq::ZhongPrefrontalData()
seu_brain <- CreateSeuratObject(counts(sce_brain), 
                                assay = "RNA", 
                                meta.data = as.data.frame(colData(sce_brain)), 
                                project = "neurogenesis", 
                                min.cells = 0,
                                min.features = 0)

5 Preprocessing

First, we’ll need to integrate expression across the 39 samples. We’ll do this using the new Layers functionality implemented in Seurat v5 (vignette).

Code

seu_brain[["RNA"]] <- split(seu_brain[["RNA"]], f = seu_brain$sample)

Next, we normalize the counts, identify highly variable genes (HVGs), and run PCA - all of these are required prior to integration.

Code

seu_brain <- NormalizeData(seu_brain, verbose = FALSE) %>% 
             FindVariableFeatures(nfeatures = 3000, verbose = FALSE) %>% 
             ScaleData(verbose = FALSE) %>% 
             RunPCA(npcs = 50, 
                    verbose = FALSE, 
                    seed.use = 312, 
                    approx = TRUE)

We use canonical correlation analysis (CCA) integration to harmonize our samples. It’s necessary to tweak a few of the nearest-neighbor-related parameters since some of our samples have few cells.

Code

seu_brain <- IntegrateLayers(seu_brain, 
                             method = CCAIntegration, 
                             orig.reduction = "pca", 
                             new.reduction = "integrated.cca", 
                             k.weight = 10, 
                             k.anchor = 10, 
                             k.score = 10, 
                             k.filter = 10, 
                             dims = 1:10, 
                             nn.method = "annoy", 
                             normalization.method = "LogNormalize", 
                             verbose = FALSE)

We perform dimension reduction with UMAP, embed the cells in a shared nearest-neighbor (SNN) graph, and partition the graph into clusters via the Leiden algorithm. We also identify subclusters in two of the larger main clusters.

Code

seu_brain <- RunUMAP(seu_brain, 
                     dims = 1:30, 
                     reduction = "integrated.cca", 
                     return.model = TRUE, 
                     n.neighbors = 30, 
                     n.components = 2, 
                     metric = "cosine", 
                     seed.use = 312, 
                     verbose = FALSE) %>% 
             FindNeighbors(reduction = "integrated.cca", 
                           dims = 1:30, 
                           k.param = 30, 
                           nn.method = "annoy", 
                           annoy.metric = "cosine",
                           verbose = FALSE) %>% 
             FindClusters(resolution = 0.6, 
                          random.seed = 312, 
                          algorithm = 4, 
                          method = "igraph",
                          verbose = FALSE) %>% 
             FindSubCluster(cluster = "5", 
                            graph.name = "RNA_snn", 
                            subcluster.name = "seurat_subclusters", 
                            resolution = 0.25, 
                            algorithm = 4)
seu_brain@meta.data <- mutate(seu_brain@meta.data, 
                              seurat_clusters = as.factor(as.integer(as.factor(seurat_subclusters))))
Idents(seu_brain) <- "seurat_clusters"
seu_brain <- FindSubCluster(seu_brain, 
                            cluster = "4", 
                            graph.name = "RNA_snn", 
                            subcluster.name = "seurat_subclusters", 
                            resolution = 0.25, 
                            algorithm = 4)
seu_brain@meta.data <- mutate(seu_brain@meta.data, 
                              seurat_clusters = as.factor(as.integer(as.factor(seurat_subclusters))))
Idents(seu_brain) <- "seurat_clusters"

Our 9 clusters appear to be relatively cleanly-separated on our UMAP embedding.

Code

p1 <- data.frame(Embeddings(seu_brain, "umap")) %>% 
      mutate(seurat_clusters = seu_brain$seurat_clusters) %>% 
      ggplot(aes(x = umap_1, y = umap_2, color = seurat_clusters)) + 
      geom_point(size = 1.5, 
                 alpha = 0.75, 
                 stroke = 0) + 
      scale_color_manual(values = palette_cluster) + 
      labs(x = "UMAP 1", 
           y = "UMAP 2", 
           color = "Leiden Cluster") + 
      theme_scLANE(umap = TRUE) + 
      guides(color = guide_legend(override.aes = list(size = 4, alpha = 1, stroke = 0.25)))
p1

Figure 1: UMAP embedding colored by cluster ID.

6 Analysis

6.1 Celltype annotation

Using a Wilcox test, we perform a basic DE analysis in order to identify putative marker genes for each cluster.

Code

seu_brain <- JoinLayers(seu_brain)
possible_markers <- FindAllMarkers(seu_brain, 
                                   assay = "RNA", 
                                   logfc.threshold = 0.3, 
                                   test.use = "wilcox", 
                                   slot = "data", 
                                   min.pct = 0.1, 
                                   only.pos = TRUE, 
                                   verbose = FALSE, 
                                   random.seed = 312)
top5_possible_markers <- arrange(possible_markers, p_val_adj) %>% 
                         with_groups(cluster, 
                                     slice_head, 
                                     n = 5)

We visualize the top 5 markers per cluster using a dotplot. Clear expression patterns emerge - for example, expression of NK2 homeobox 2 (NKX2-2) is clearly specific to cluster 9.

Code

p2 <- DotPlot(seu_brain, 
              features = unique(top5_possible_markers$gene),
              assay = "RNA", 
              group.by = "seurat_clusters", 
              dot.scale = 5,
              cols = paletteer::paletteer_d("wesanderson::Zissou1")[c(1, 5)], 
              scale.by = "radius") + 
      coord_flip() +
      labs(y = "Leiden Cluster") +
      theme_scLANE() + 
      theme(axis.title.y = element_blank(), 
            axis.text.y = element_text(face = "italic"))
p2

Figure 2: Putative marker genes for each cluster.

Based on the clustering and DE analysis, we assign celltype labels to each cluster.

Note

The celltype annotations we’re assigning are mostly based on Figs. 1-3 and Extended Data Figs. 1-2 & 4 from the original publication.

Code

seu_brain@meta.data <- mutate(seu_brain@meta.data, 
                              celltype = case_when(seurat_clusters == "1" ~ "Excitatory Neuron", 
                                                   seurat_clusters == "2" ~ "Interneuron", 
                                                   seurat_clusters == "3" ~ "Excitatory Neuron", 
                                                   seurat_clusters == "4" ~ "Neuronal Progenitor", 
                                                   seurat_clusters == "5" ~ "Astrocyte", 
                                                   seurat_clusters == "6" ~ "Neuronal Progenitor", 
                                                   seurat_clusters == "7" ~ "Microglia", 
                                                   seurat_clusters == "8" ~ "Excitatory Neuron", 
                                                   seurat_clusters == "9" ~ "Oligodendrocyte Progenitor", 
                                                   TRUE ~ NA_character_), 
                              celltype_short = case_when(seurat_clusters == "1" ~ "Exc. Neuron", 
                                                         seurat_clusters == "2" ~ "Interneuron", 
                                                         seurat_clusters == "3" ~ "Exc. Neuron", 
                                                         seurat_clusters == "4" ~ "NPC", 
                                                         seurat_clusters == "5" ~ "Astrocyte", 
                                                         seurat_clusters == "6" ~ "NPC", 
                                                         seurat_clusters == "7" ~ "Microglia", 
                                                         seurat_clusters == "8" ~ "Exc. Neuron", 
                                                         seurat_clusters == "9" ~ "OPC", 
                                                         TRUE ~ NA_character_)) %>% 
                              mutate(celltype_short = as.factor(celltype_short))

We plot our new celltype labels on our UMAP embedding.

Code

p3 <- data.frame(Embeddings(seu_brain, "umap")) %>% 
      mutate(celltype = seu_brain$celltype) %>% 
      ggplot(aes(x = umap_1, y = umap_2, color = celltype)) + 
      geom_point(size = 1.5, 
                 alpha = 0.75, 
                 stroke = 0) + 
      scale_color_manual(values = palette_celltype) + 
      labs(x = "UMAP 1", y = "UMAP 2") + 
      theme_scLANE(umap = TRUE) + 
      theme(legend.title = element_blank()) + 
      guides(color = guide_legend(override.aes = list(size = 4, alpha = 1, stroke = 0.25)))
p3

Figure 3: UMAP embedding colored by celltype.

We re-run our DE testing between our annotated celltypes.

Code

Idents(seu_brain) <- "celltype_short"
celltype_markers <- FindAllMarkers(seu_brain, 
                                   assay = "RNA", 
                                   logfc.threshold = 0.3, 
                                   test.use = "wilcox", 
                                   slot = "data", 
                                   min.pct = 0.1, 
                                   only.pos = TRUE, 
                                   verbose = FALSE, 
                                   random.seed = 312) %>% 
                    mutate(cluster = factor(cluster, levels = levels(seu_brain$celltype_short)))
top5_celltype_markers <- arrange(celltype_markers, p_val_adj) %>% 
                         with_groups(cluster, 
                                     slice_head, 
                                     n = 5)

The expression trends are clear when visualized.

Code

p4 <- DotPlot(seu_brain, 
              features = unique(top5_celltype_markers$gene),
              assay = "RNA", 
              group.by = "celltype_short", 
              dot.scale = 5,
              cols = paletteer::paletteer_d("wesanderson::Zissou1")[c(1, 5)], 
              scale.by = "radius") + 
      coord_flip() +
      theme_scLANE() + 
      theme(axis.title = element_blank(), 
            axis.text.y = element_text(face = "italic"), 
            axis.text.x = element_text(angle = 45, hjust = 1, vjust = 1))
p4

Figure 4: Putative marker genes for each celltype.

6.2 Trajectory inference

As performed by the original authors, we filter out microglial and interneuron cells. This is done because “microglia are mesoderm-derived cells and interneurons are generated in the ganglionic eminence and migrate tangentially to the PFC” i.e., they do not belong to neuronal or glial lineages we’re interested in.

Code

seu_brain <- seu_brain[, !seu_brain$celltype %in% c("Interneuron", "Microglia")]

Since we’ve subset the data, we now need to re-embed the cells using UMAP.

Code

seu_brain <- RunUMAP(seu_brain, 
                     dims = 1:30, 
                     reduction = "integrated.cca", 
                     return.model = TRUE, 
                     n.neighbors = 30, 
                     n.components = 2, 
                     metric = "cosine", 
                     seed.use = 312, 
                     verbose = FALSE)

We can already roughly see how the trajectory will be fit - the root population is the neuronal progenitors, with the other three celltypes being terminal cell fates. As such, we should aim to fit three lineages during trajectory estimation.

Code

p5 <- data.frame(Embeddings(seu_brain, "umap")) %>% 
      mutate(celltype = seu_brain$celltype) %>% 
      ggplot(aes(x = umap_1, y = umap_2, color = celltype)) + 
      geom_point(size = 1.5, 
                 alpha = 0.75, 
                 stroke = 0) + 
      scale_color_manual(values = palette_celltype) + 
      labs(x = "UMAP 1", y = "UMAP 2") + 
      theme_scLANE(umap = TRUE) + 
      theme(legend.title = element_blank()) + 
      guides(color = guide_legend(override.aes = list(size = 4, alpha = 1, stroke = 0.25)))
p5

Figure 5: UMAP embedding colored by filtered celltypes.

6.2.1 Pseudotime estimation

We’ll use the Slingshot package to fit principal curves through our UMAP embedding & estimate lineage-specific pseudotime.

Code

sling_res <- slingshot(Embeddings(seu_brain, "umap"), 
                       clusterLabels = seu_brain$celltype, 
                       start.clus = "Neuronal Progenitor", 
                       end.clus = c("Astrocyte", "Oligodendrocyte Progenitor", "Excitatory Neuron"), 
                       approx_points = 1000)
sling_curves <- slingCurves(sling_res, as.df = TRUE)
sling_mst <- slingMST(sling_res, as.df = TRUE)
sling_pt <- slingPseudotime(sling_res) %>% 
            as.data.frame() %>%
            rowwise() %>% 
            mutate(PT_Overall = mean(c_across(starts_with("Lineage")), na.rm = TRUE)) %>% 
            ungroup() %>% 
            mutate(across(c(starts_with("Lineage"), PT_Overall), 
                          \(x) (x - min(x, na.rm = TRUE)) / (max(x, na.rm = TRUE) - min(x, na.rm = TRUE))))
seu_brain <- AddMetaData(seu_brain, 
                         metadata = sling_pt$PT_Overall, 
                         col.name = "PT_Overall")

Above, we created a mean-aggregated pseudotime across all three lineages. This serves as a proxy measurement for overall developmental progression.

Code

p6 <- data.frame(Embeddings(seu_brain, "umap")) %>% 
      mutate(PT_Overall = seu_brain$PT_Overall) %>% 
      ggplot(aes(x = umap_1, y = umap_2, color = PT_Overall)) + 
      geom_point(size = 1.5, 
                 alpha = 0.75, 
                 stroke = 0) + 
      scale_color_gradientn(colors = palette_heatmap, labels = scales::label_number(accuracy = .01)) + 
      labs(x = "UMAP 1", 
           y = "UMAP 2", 
           color = "Mean Pseudotime") + 
      theme_scLANE(umap = TRUE)
p6

Figure 6: UMAP embedding colored by mean-aggregated pseudotime.

In general, larger pseudotime values are generated for later developmental stages.

Code

p7 <- data.frame(PT_Overall = sling_pt$PT_Overall, 
                 week = seu_brain$week) %>% 
      ggplot(aes(x = week, y = PT_Overall, color = week)) + 
      ggbeeswarm::geom_quasirandom(size = 1.5,
                                   alpha = 0.75, 
                                   stroke = 0, 
                                   show.legend = FALSE) + 
      stat_summary(geom = "point", 
                   fun = "mean",
                   color =  "black", 
                   size = 3.5) + 
      scale_color_manual(values = palette_week) + 
      labs(x = "Gestational Week", y = "Mean Pseudotime") +
      theme_scLANE()
p7

Figure 7: Beeswarm plots of the distribution of mean-aggregated pseudotime for each gestational week.

During pseudotime estimation Slingshot estimates a minimum spanning tree (MST), which serves as a rough graph abstraction of the relationships between celltypes. Our MST looks to correspond well with our exiting biological knowledge.

Code

p8 <- data.frame(Embeddings(seu_brain, "umap")) %>% 
      mutate(celltype = seu_brain$celltype) %>% 
      ggplot(aes(x = umap_1, y = umap_2, color = celltype)) + 
      geom_point(size = 1.5, 
                 alpha = 0.75, 
                 stroke = 0) + 
      geom_path(data = sling_mst, mapping = aes(x = umap_1, y = umap_2, group = Lineage), 
                linewidth = 1.25, 
                color = "black") + 
      geom_point(data = sling_mst, mapping = aes(x = umap_1, y = umap_2, fill = Cluster), 
                color = "black", 
                shape = 21, 
                size = 4.5, 
                stroke = 1.25, 
                show.legend = FALSE) + 
      scale_color_manual(values = palette_celltype) + 
      scale_fill_manual(values = palette_celltype) + 
      labs(x = "UMAP 1", y = "UMAP 2") + 
      theme_scLANE(umap = TRUE) +  
      theme(legend.title = element_blank()) + 
      guides(color = guide_legend(override.aes = list(size = 4, alpha = 1, stroke = 0.25)))
p8

Figure 8: UMAP embedding with MST from Slinghot overlaid.

The three lineage seem to correctly identify the developmental relationships we’re looking for.

Code

p9 <- data.frame(Embeddings(seu_brain, "umap")) %>% 
      bind_cols(sling_pt) %>% 
      tidyr::pivot_longer(starts_with("Lineage"), 
                          names_to = "lineage", 
                          values_to = "pseudotime") %>% 
      ggplot(aes(x = umap_1, y = umap_2, color = pseudotime)) + 
      facet_wrap(~lineage, nrow = 3) + 
      geom_point(size = 1.5, 
                 alpha = 0.75, 
                 stroke = 0) + 
      labs(x = "UMAP 1", 
           y = "UMAP 2", 
           color = "Pseudotime") + 
      scale_color_gradientn(colors = palette_heatmap, labels = scales::label_number(accuracy = .01)) + 
      theme_scLANE(umap = TRUE)
p9

Figure 9: UMAP embedding colored by lineage-specific pseudotime.

As such, we add a celltype label to each lineage corresponding to each lineage’s terminal cell fate.

Code

sling_pt_long <- data.frame(Embeddings(seu_brain, "umap")) %>% 
                 bind_cols(sling_pt) %>% 
                 tidyr::pivot_longer(starts_with("Lineage"), 
                                     names_to = "lineage", 
                                     values_to = "pseudotime") %>% 
                 mutate(lineage_label = case_when(lineage == "Lineage1" ~ "Excitatory Neuron", 
                                                  lineage == "Lineage2" ~ "Astrocyte", 
                                                  lineage == "Lineage3" ~ "Oligodendrocyte Progenitor", 
                                                  TRUE ~ NA_character_))

We can also plot the principal curves from Slingshot on our UMAP embedding. While the curves are a little noisy, they overall seem to recapitulate the underlying biology well.

Code

p10 <- data.frame(Embeddings(seu_brain, "umap")) %>% 
       mutate(celltype = seu_brain$celltype) %>% 
       ggplot(aes(x = umap_1, y = umap_2, color = celltype)) + 
       geom_point(size = 1.5, 
                  alpha = 0.75, 
                  stroke = 0) + 
       geom_path(data = sling_curves,
                 mapping = aes(x = umap_1, y = umap_2, group = Lineage), 
                 color = "black", 
                 linewidth = 1.5, 
                 alpha = 0.75, 
                 lineend = "round") + 
       scale_color_manual(values = palette_celltype) + 
       labs(x = "UMAP 1", y = "UMAP 2") + 
       theme_scLANE(umap = TRUE) + 
       theme(legend.title = element_blank()) + 
       guides(color = guide_legend(override.aes = list(size = 4, alpha = 1, stroke = 0.25)))
p10

Figure 10: UMAP embedding with principal curves from Slinghot overlaid.

6.2.2 Trajectory DE testing

Now that we have a pseudotime estimate for each cell across each lineage, we can perform trajectory differential expression (DE) testing using our package scLANE. We test only the top 3,000 HVGs, a heuristic which is motivated by the fact that the UMAP embedding was generated using just expression values from the HVG set. As such, non-HVGs don’t have a direct association with pseudotime, and for simple analyses it’s usually OK to refrain from testing them.

Code

cell_offset <- createCellOffset(seu_brain)
pt_df <- data.frame(PT1 = sling_pt$Lineage1, 
                    PT2 = sling_pt$Lineage2, 
                    PT3 = sling_pt$Lineage3)
scLANE_models <- testDynamic(seu_brain, 
                             pt = pt_df, 
                             genes = VariableFeatures(seu_brain), 
                             size.factor.offset = cell_offset, 
                             n.cores = 6L, 
                             verbose = FALSE)

scLANE testing completed for 3000 genes across 3 lineages in 13.112 mins

Code

scLANE_de_res <- getResultsDE(scLANE_models)

After running the getResultsDE() function we now have a tidy table of differential expression statistics.

Code

select(scLANE_de_res, 
       Gene, 
       Lineage, 
       Test_Stat, 
       P_Val, 
       P_Val_Adj,
       Gene_Dynamic_Overall) %>% 
  mutate(Lineage = case_when(Lineage == "A" ~ "Excitatory Neuron", 
                             Lineage == "B" ~ "Astrocyte", 
                             Lineage == "C" ~ "Oligodendrocyte Progenitor", 
                             TRUE ~ NA_character_), 
         Gene_Dynamic_Overall = if_else(Gene_Dynamic_Overall == 1, "Dynamic", "Static")) %>% 
  slice_head(n = 10) %>% 
  kableExtra::kbl(digits = 4, 
                  booktabs = TRUE, 
                  col.names = c("Gene", "Lineage", "LRT stat.", "P-value", "Adj. p-value", "Predicted gene status")) %>% 
  kableExtra::kable_classic(full_width = FALSE, "hover")

Gene	Lineage	LRT stat.	Predicted gene status
GTSE1	Astrocyte	67619.12	Dynamic
GTSE1	Oligodendrocyte Progenitor	49494.67	Dynamic
C21orf58	Oligodendrocyte Progenitor	43008.00	Dynamic
ZWINT	Astrocyte	31439.11	Dynamic
CKAP2L	Oligodendrocyte Progenitor	29424.94	Dynamic
PLK1	Oligodendrocyte Progenitor	27513.28	Dynamic
LOC100505715	Astrocyte	26932.25	Dynamic
ZWINT	Oligodendrocyte Progenitor	23814.28	Dynamic
KIF20A	Astrocyte	23453.18	Dynamic
HIST1H1D	Oligodendrocyte Progenitor	21592.17	Dynamic

Table 1: Sample DE test output from scLANE.

6.2.3 Gene dynamics

Here we visualize the fitted dynamics and knots for three transcription factors (TFs) investigated in the original paper: SRY-box transcription factor 2 (SOX2), paired box 6 (PAX6), and oligodendrocyte transcription factor 1 (OLIG1). All three TFs have well-annotated roles in regulating neurogenesis in humans and other species e.g., Mus musculus. Interestingly, while SOX2 and OLIG1 have markedly different dynamics and knots (transcriptional switches) across lineages, the knots chosen for PAX6 are pretty similar for all three lineages - even though the dynamics differ.

Code

p11 <- getFittedValues(scLANE_models, 
                       genes = c("SOX2", "PAX6", "OLIG1"), 
                       pt = pt_df, 
                       expr.mat = seu_brain, 
                       size.factor.offset = cell_offset, 
                       cell.meta.data = select(seu_brain@meta.data, celltype)) %>% 
       mutate(lineage_label = case_when(lineage == "A" ~ "Excitatory Neuron", 
                                        lineage == "B" ~ "Astrocyte", 
                                        lineage == "C" ~ "Oligodendrocyte Progenitor", 
                                        TRUE ~ NA_character_)) %>% 
       ggplot(aes(x = pt, y = rna_log1p)) + 
       facet_grid(lineage_label~gene) + 
       geom_point(aes(color = celltype), 
                  size = 2, 
                  alpha = 0.75, 
                  stroke = 0) + 
       geom_vline(data = data.frame(gene = scLANE_models$SOX2$Lineage_A$MARGE_Slope_Data$Gene, 
                                    lineage_label = "Excitatory Neuron", 
                                    knot = scLANE_models$SOX2$Lineage_A$MARGE_Slope_Data$Breakpoint), 
                  mapping = aes(xintercept = knot), 
                  linetype = "dashed", 
                  color = "grey20") + 
       geom_vline(data = data.frame(gene = scLANE_models$SOX2$Lineage_B$MARGE_Slope_Data$Gene, 
                                    lineage_label = "Astrocyte", 
                                    knot = scLANE_models$SOX2$Lineage_B$MARGE_Slope_Data$Breakpoint), 
                  mapping = aes(xintercept = knot), 
                  linetype = "dashed", 
                  color = "grey20") + 
       geom_vline(data = data.frame(gene = scLANE_models$SOX2$Lineage_C$MARGE_Slope_Data$Gene, 
                                    lineage_label = "Oligodendrocyte Progenitor", 
                                    knot = scLANE_models$SOX2$Lineage_C$MARGE_Slope_Data$Breakpoint), 
                  mapping = aes(xintercept = knot), 
                  linetype = "dashed", 
                  color = "grey20") + 
       geom_vline(data = data.frame(gene = scLANE_models$PAX6$Lineage_A$MARGE_Slope_Data$Gene, 
                                    lineage_label = "Excitatory Neuron", 
                                    knot = scLANE_models$PAX6$Lineage_A$MARGE_Slope_Data$Breakpoint), 
                  mapping = aes(xintercept = knot), 
                  linetype = "dashed", 
                  color = "grey20") + 
       geom_vline(data = data.frame(gene = scLANE_models$PAX6$Lineage_B$MARGE_Slope_Data$Gene, 
                                    lineage_label = "Astrocyte", 
                                    knot = scLANE_models$PAX6$Lineage_B$MARGE_Slope_Data$Breakpoint), 
                  mapping = aes(xintercept = knot), 
                  linetype = "dashed", 
                  color = "grey20") + 
       geom_vline(data = data.frame(gene = scLANE_models$PAX6$Lineage_C$MARGE_Slope_Data$Gene, 
                                    lineage_label = "Oligodendrocyte Progenitor", 
                                    knot = scLANE_models$PAX6$Lineage_C$MARGE_Slope_Data$Breakpoint), 
                  mapping = aes(xintercept = knot), 
                  linetype = "dashed", 
                  color = "grey20") + 
       geom_vline(data = data.frame(gene = scLANE_models$OLIG1$Lineage_A$MARGE_Slope_Data$Gene, 
                                    lineage_label = "Excitatory Neuron", 
                                    knot = scLANE_models$OLIG1$Lineage_A$MARGE_Slope_Data$Breakpoint), 
                  mapping = aes(xintercept = knot), 
                  linetype = "dashed", 
                  color = "grey20") + 
       geom_vline(data = data.frame(gene = scLANE_models$OLIG1$Lineage_B$MARGE_Slope_Data$Gene, 
                                    lineage_label = "Astrocyte", 
                                    knot = scLANE_models$OLIG1$Lineage_B$MARGE_Slope_Data$Breakpoint), 
                  mapping = aes(xintercept = knot), 
                  linetype = "dashed", 
                  color = "grey20") + 
       geom_vline(data = data.frame(gene = scLANE_models$OLIG1$Lineage_C$MARGE_Slope_Data$Gene, 
                                    lineage_label = "Oligodendrocyte Progenitor", 
                                    knot = scLANE_models$OLIG1$Lineage_C$MARGE_Slope_Data$Breakpoint), 
                  mapping = aes(xintercept = knot), 
                  linetype = "dashed", 
                  color = "grey20") + 
       geom_ribbon(aes(ymin = scLANE_ci_ll_log1p, ymax = scLANE_ci_ul_log1p), 
                   linewidth = 0, 
                   fill = "grey70", 
                   alpha = 0.75) + 
       geom_line(aes(y = scLANE_pred_log1p), 
                 color = "black", 
                 linewidth = 0.75) + 
       scale_x_continuous(labels = scales::label_number(accuracy = 0.01)) + 
       scale_color_manual(values = palette_celltype) + 
       labs(x = "Pseudotime", 
            y = "Normalized Expression") + 
       theme_scLANE() + 
       theme(legend.title = element_blank(), 
             strip.text.x = element_text(face = "italic"), 
             legend.position = "bottom") + 
       guides(color = guide_legend(override.aes = list(size = 4, alpha = 1, stroke = 0.25)))
p11

Figure 11: Gene dynamics for three transcription factors of interest.

6.2.4 Distribution of knots from `scLANE`

Next, we use the getKnostDist() function to pull the location of every identified knot for genes dynamic across any of the three lineages.

Code

dyn_genes <- filter(scLANE_de_res, Gene_Dynamic_Overall == 1) %>% 
             distinct(Gene) %>% 
             pull(Gene)
knot_dist <- getKnotDist(scLANE_models, dyn.genes = dyn_genes)

The output of the function is rather simple:

Code

mutate(knot_dist, 
       lineage = case_when(lineage == "Lineage_A" ~ "Excitatory Neuron", 
                           lineage == "Lineage_B" ~ "Astrocyte", 
                           lineage == "Lineage_C" ~ "Oligodendrocyte Progenitor", 
                           TRUE ~ NA_character_)) %>% 
  slice_sample(n = 5) %>% 
  kableExtra::kable(digits = 4, row.names = FALSE, col.names = c("Gene", "Lineage", "Knot")) %>% 
  kableExtra::kable_classic(full_width = FALSE, "hover")

Gene	Lineage	Knot
ITGA2	Astrocyte	0.7268
BUB3	Oligodendrocyte Progenitor	0.2086
MASP1	Astrocyte	0.8137
MCM7	Excitatory Neuron	0.6963
RHOC	Oligodendrocyte Progenitor	0.9549

Table 2: Sample output displaying per-gene, per-lineage knot locations in pseudotime.

We can visualize the differences between lineages using a combination histogram and density plot. Interestingly, the oligodendrocyte progenitor lineage has a large number of knots just prior to ~0.75 pseudotime. This is about where the “break” in the UMAP embedding occurs (see e.g., Figure 10), hence the placement of knots there by scLANE makes sense.

Code

p12 <- mutate(knot_dist, 
              lineage_label = case_when(lineage == "Lineage_A" ~ "Excitatory Neuron", 
                                        lineage == "Lineage_B" ~ "Astrocyte", 
                                        lineage == "Lineage_C" ~ "Oligodendrocyte Progenitor", 
                                        TRUE ~ NA_character_)) %>% 
       ggplot(aes(x = knot)) + 
       facet_wrap(~lineage_label, nrow = 3) + 
       geom_histogram(aes(y = after_stat(density)), 
                      fill = "white", color = "navy", linewidth = 0.5) + 
       geom_density(fill = "steelblue3", color = "steelblue4", alpha = 0.5, linewidth = 0.75) + 
       labs(x = "Knot Location", y = "Density") + 
       theme_scLANE() + 
       theme(axis.ticks.y = element_blank(), 
             axis.text.y = element_blank())
p12

Figure 12: Empirical distribution of knots across pseudotime for each lineage.

6.2.5 Gene-level embeddings

After generating three lineage-specific matrices of gene dynamics using the smoothedCountsMatrix() function, we use embedGenes() to cluster the gene dynamics within each lineage and embed them in dimension-reduced space with PCA & UMAP.

Code

smoothed_counts <- smoothedCountsMatrix(scLANE_models, 
                                        size.factor.offset = cell_offset, 
                                        pt = pt_df, 
                                        genes = dyn_genes, 
                                        n.cores = 4L)
gene_embed_neuron <- embedGenes(log1p(smoothed_counts$Lineage_A))
gene_embed_astro <- embedGenes(log1p(smoothed_counts$Lineage_B))
gene_embed_opc <- embedGenes(log1p(smoothed_counts$Lineage_C))

We can visualize the differing gene-level UMAP embeddings, showing heterogeneity across lineages:

Code

p13 <- ggplot(gene_embed_neuron, aes(x = umap1, y = umap2, color = leiden)) + 
       geom_point(size = 2, 
                  alpha = 0.75, 
                  stroke = 0) + 
       scale_color_manual(values = palette_week[1:length(unique(gene_embed_neuron$leiden))]) + 
       labs(x = "UMAP 1",
            y = "UMAP 2", 
            color = "Leiden", 
            subtitle = "Interneuron Lineage") + 
       theme_scLANE(umap = TRUE) + 
       guides(color = guide_legend(override.aes = list(size = 4, alpha = 1, stroke = 0.25)))
p14 <- ggplot(gene_embed_astro, aes(x = umap1, y = umap2, color = leiden)) + 
       geom_point(size = 2, 
                  alpha = 0.75, 
                  stroke = 0) + 
       scale_color_manual(values = palette_week[-c(1:length(unique(gene_embed_neuron$leiden)))]) + 
       labs(x = "UMAP 1", 
            y = "UMAP 2", 
            color = "Leiden", 
            subtitle = "Astrocyte Lineage") + 
       theme_scLANE(umap = TRUE) + 
       guides(color = guide_legend(override.aes = list(size = 4, alpha = 1, stroke = 0.25)))
p15 <- ggplot(gene_embed_opc, aes(x = umap1, y = umap2, color = leiden)) + 
       geom_point(size = 2, 
                  alpha = 0.75, 
                  stroke = 0) + 
       scale_color_manual(values = palette_week[-c(1:(length(unique(gene_embed_neuron$leiden)) + length(unique(gene_embed_astro$leiden))))]) + 
       labs(x = "UMAP 1", 
            y = "UMAP 2", 
            color = "Leiden", 
            subtitle = "Oligodendrocyte Progenitor Lineage") + 
       theme_scLANE(umap = TRUE) + 
       guides(color = guide_legend(override.aes = list(size = 4, alpha = 1, stroke = 0.25)))
p16 <- p13 + p14 + p15 + plot_layout(axes = "collect")
p16

Figure 13: Lineage-specific embeddings of gene dynamics.

6.2.6 Gene set enrichment analysis

Lastly, we can compare the dynamic gene sets for each lineage by performing enrichment analysis on each. The enrichDynamicGenes() function does this using the gprofiler2 package internally.

Code

enrich_neuron <- enrichDynamicGenes(scLANE_de_res, lineage = "A")
enrich_astro <- enrichDynamicGenes(scLANE_de_res, lineage = "B")
enrich_opc <- enrichDynamicGenes(scLANE_de_res, lineage = "C")

Examining the top biological processes enriched in each lineage gives us some idea as to what each dynamic gene set is doing. For example, the interneuron lineage is correctly characterized by terms related to neurogenesis & neuronal differentiation.

Code

top5_terms <- filter(enrich_neuron$result, source == "GO:BP") %>% 
              mutate(lineage = "Interneuron", .before = 1) %>% 
              bind_rows((filter(enrich_astro$result, source == "GO:BP") %>% 
                         mutate(lineage = "Astrocyte", .before = 1))) %>% 
              bind_rows((filter(enrich_opc$result, source == "GO:BP") %>% 
                         mutate(lineage = "Oligodendrocyte Progenitor", .before = 1))) %>% 
              select(lineage, term_name, p_value) %>% 
              arrange(p_value) %>% 
              with_groups(lineage, 
                          slice_head, 
                          n = 5)
kableExtra::kable(top5_terms, digits = 3, row.names = FALSE, col.names = c("Lineage", "Term", "P-value")) %>% 
  kableExtra::kable_classic(full_width = FALSE, "hover")

Lineage	Term	P-value
Astrocyte	mitotic cell cycle	0
Astrocyte	cell cycle	0
Astrocyte	mitotic cell cycle process	0
Astrocyte	cell cycle process	0
Astrocyte	chromosome segregation	0
Interneuron	nervous system development	0
Interneuron	system development	0
Interneuron	neurogenesis	0
Interneuron	multicellular organism development	0
Interneuron	generation of neurons	0
Oligodendrocyte Progenitor	mitotic cell cycle	0
Oligodendrocyte Progenitor	cell cycle	0
Oligodendrocyte Progenitor	chromosome segregation	0
Oligodendrocyte Progenitor	cell division	0
Oligodendrocyte Progenitor	chromosome organization	0

Table 3: The top-5 GO:BP terms for the dynamic genes from each lineage.

We can also identify sets of terms specific to each lineage; here we do so for the oligodendrocyte lineage.

Code

spec_terms_opc <- filter(enrich_opc$result, source == "GO:BP", 
                         !term_id %in% (filter(enrich_neuron$result, p_value < 0.05) %>% pull(term_id)), 
                         !term_id %in% (filter(enrich_astro$result, p_value < 0.05) %>% pull(term_id)), 
                         p_value < 0.05) %>% 
                  mutate(lineage = "Oligodendrocyte Progenitor", .before = 1)

We see several terms significantly enriched in the lineage that are specific to OPC development. This serves as confirmatory evidence that our trajectory was fit correctly and that our trajectory DE results are biologically meaningful.

Code

filter(spec_terms_opc, grepl("oligodendrocyte", term_name)) %>% 
  select(lineage, term_name, p_value) %>% 
  kableExtra::kable(digits = 3, row.names = FALSE, col.names = c("Lineage", "Term", "P-value")) %>% 
  kableExtra::kable_classic(full_width = FALSE, "hover")

Lineage	Term	P-value
Oligodendrocyte Progenitor	oligodendrocyte cell fate specification	0.028
Oligodendrocyte Progenitor	oligodendrocyte cell fate commitment	0.028

Table 4: Significantly enriched OPC-related biological process.

7 Conclusions

Here we showcased an end-to-end trajectory analysis, from initial celltype annotation to trajectory inference. After fitting a trajectory with Slingshot in a supervised manner using existing biological knowledge, we performed trajectory differential expression testing with scLANE. This opened up several downstream analysis options, including the comparison of gene dynamics across lineages, the inspection of the locations of transcriptional switches, and lineage-specific trajectory enrichment analysis. In doing so, we both recapitulated points from the original authors’ analysis (e.g., visualizing dynamics of relevant transcription factors) and added to them (e.g., performing trajectory enrichment analysis). For more information on scLANE, see the corresponding preprint or the GitHub repository.

8 Session info

Code

sessioninfo::session_info()

─ Session info ───────────────────────────────────────────────────────────────
 setting  value
 version  R version 4.3.2 (2023-10-31)
 os       macOS Sonoma 14.3
 system   x86_64, darwin20
 ui       X11
 language (EN)
 collate  en_US.UTF-8
 ctype    en_US.UTF-8
 tz       America/New_York
 date     2024-02-27
 pandoc   3.1.9 @ /usr/local/bin/ (via rmarkdown)

─ Packages ───────────────────────────────────────────────────────────────────
 package                * version    date (UTC) lib source
 abind                    1.4-5      2016-07-21 [1] CRAN (R 4.3.0)
 AnnotationDbi            1.64.1     2023-11-03 [1] Bioconductor
 AnnotationFilter         1.26.0     2023-10-24 [1] Bioconductor
 AnnotationHub            3.10.0     2023-10-24 [1] Bioconductor
 backports                1.4.1      2021-12-13 [1] CRAN (R 4.3.0)
 beeswarm                 0.4.0      2021-06-01 [1] CRAN (R 4.3.0)
 bigassertr               0.1.6      2023-01-10 [1] CRAN (R 4.3.0)
 bigparallelr             0.3.2      2021-10-02 [1] CRAN (R 4.3.0)
 bigstatsr                1.5.12     2022-10-14 [1] CRAN (R 4.3.0)
 Biobase                * 2.62.0     2023-10-24 [1] Bioconductor
 BiocFileCache            2.10.1     2023-10-26 [1] Bioconductor
 BiocGenerics           * 0.48.1     2023-11-01 [1] Bioconductor
 BiocIO                   1.12.0     2023-10-24 [1] Bioconductor
 BiocManager              1.30.22    2023-08-08 [1] CRAN (R 4.3.0)
 BiocNeighbors            1.20.1     2023-12-18 [1] Bioconductor 3.18 (R 4.3.2)
 BiocParallel             1.36.0     2023-10-24 [1] Bioconductor
 BiocVersion              3.18.1     2023-11-15 [1] Bioconductor
 biomaRt                  2.58.0     2023-10-24 [1] Bioconductor
 Biostrings               2.70.1     2023-10-25 [1] Bioconductor
 bit                      4.0.5      2022-11-15 [1] CRAN (R 4.3.0)
 bit64                    4.0.5      2020-08-30 [1] CRAN (R 4.3.0)
 bitops                   1.0-7      2021-04-24 [1] CRAN (R 4.3.0)
 blob                     1.2.4      2023-03-17 [1] CRAN (R 4.3.0)
 bluster                  1.12.0     2023-10-24 [1] Bioconductor
 boot                     1.3-28.1   2022-11-22 [1] CRAN (R 4.3.2)
 broom                    1.0.5      2023-06-09 [1] CRAN (R 4.3.0)
 broom.mixed              0.2.9.4    2022-04-17 [1] CRAN (R 4.3.0)
 cachem                   1.0.8      2023-05-01 [1] CRAN (R 4.3.0)
 cli                      3.6.2      2023-12-11 [1] CRAN (R 4.3.0)
 cluster                  2.1.6      2023-12-01 [1] CRAN (R 4.3.0)
 coda                     0.19-4     2020-09-30 [1] CRAN (R 4.3.0)
 codetools                0.2-19     2023-02-01 [1] CRAN (R 4.3.2)
 colorspace               2.1-0      2023-01-23 [1] CRAN (R 4.3.0)
 coop                     0.6-3      2021-09-19 [1] CRAN (R 4.3.0)
 cowplot                  1.1.2      2023-12-15 [1] CRAN (R 4.3.0)
 crayon                   1.5.2      2022-09-29 [1] CRAN (R 4.3.0)
 curl                     5.2.0      2023-12-08 [1] CRAN (R 4.3.0)
 data.table               1.14.10    2023-12-08 [1] CRAN (R 4.3.0)
 DBI                      1.2.0      2023-12-21 [1] CRAN (R 4.3.0)
 dbplyr                   2.4.0      2023-10-26 [1] CRAN (R 4.3.0)
 DelayedArray             0.28.0     2023-10-24 [1] Bioconductor
 DelayedMatrixStats       1.24.0     2023-10-24 [1] Bioconductor
 deldir                   2.0-2      2023-11-23 [1] CRAN (R 4.3.0)
 digest                   0.6.33     2023-07-07 [1] CRAN (R 4.3.0)
 doParallel               1.0.17     2022-02-07 [1] CRAN (R 4.3.0)
 doSNOW                   1.0.20     2022-02-04 [1] CRAN (R 4.3.0)
 dotCall64                1.1-1      2023-11-28 [1] CRAN (R 4.3.0)
 dplyr                  * 1.1.4      2023-11-17 [1] CRAN (R 4.3.0)
 ellipsis                 0.3.2      2021-04-29 [1] CRAN (R 4.3.0)
 emmeans                  1.10.0     2024-01-23 [1] CRAN (R 4.3.2)
 ensembldb                2.26.0     2023-10-24 [1] Bioconductor
 estimability             1.4.1      2022-08-05 [1] CRAN (R 4.3.0)
 evaluate                 0.23       2023-11-01 [1] CRAN (R 4.3.0)
 ExperimentHub            2.10.0     2023-10-24 [1] Bioconductor
 fansi                    1.0.6      2023-12-08 [1] CRAN (R 4.3.0)
 farver                   2.1.1      2022-07-06 [1] CRAN (R 4.3.0)
 fastDummies              1.7.3      2023-07-06 [1] CRAN (R 4.3.0)
 fastmap                  1.1.1      2023-02-24 [1] CRAN (R 4.3.0)
 ff                       4.0.9      2023-01-25 [1] CRAN (R 4.3.0)
 filelock                 1.0.3      2023-12-11 [1] CRAN (R 4.3.0)
 fitdistrplus             1.1-11     2023-04-25 [1] CRAN (R 4.3.0)
 flock                    0.7        2016-11-12 [1] CRAN (R 4.3.0)
 forcats                  1.0.0      2023-01-29 [1] CRAN (R 4.3.0)
 foreach                  1.5.2      2022-02-02 [1] CRAN (R 4.3.0)
 furrr                    0.3.1      2022-08-15 [1] CRAN (R 4.3.0)
 future                   1.33.1     2023-12-22 [1] CRAN (R 4.3.0)
 future.apply             1.11.1     2023-12-21 [1] CRAN (R 4.3.0)
 gamlss                   5.4-20     2023-10-04 [1] CRAN (R 4.3.0)
 gamlss.data              6.0-2      2021-11-07 [1] CRAN (R 4.3.0)
 gamlss.dist              6.1-1      2023-08-23 [1] CRAN (R 4.3.0)
 geeM                     0.10.1     2018-06-18 [1] CRAN (R 4.3.0)
 generics                 0.1.3      2022-07-05 [1] CRAN (R 4.3.0)
 GenomeInfoDb           * 1.38.5     2023-12-28 [1] Bioconductor 3.18 (R 4.3.2)
 GenomeInfoDbData         1.2.11     2023-12-22 [1] Bioconductor
 GenomicAlignments        1.38.0     2023-10-24 [1] Bioconductor
 GenomicFeatures          1.54.1     2023-10-29 [1] Bioconductor
 GenomicRanges          * 1.54.1     2023-10-29 [1] Bioconductor
 ggbeeswarm               0.7.2      2023-04-29 [1] CRAN (R 4.3.0)
 ggplot2                * 3.4.4      2023-10-12 [1] CRAN (R 4.3.0)
 ggrepel                  0.9.5      2024-01-10 [1] CRAN (R 4.3.0)
 ggridges                 0.5.5      2023-12-15 [1] CRAN (R 4.3.0)
 glm2                   * 1.2.1      2018-08-11 [1] CRAN (R 4.3.0)
 glmmTMB                  1.1.8      2023-10-07 [1] CRAN (R 4.3.0)
 globals                  0.16.2     2022-11-21 [1] CRAN (R 4.3.0)
 glue                     1.6.2      2022-02-24 [1] CRAN (R 4.3.0)
 goftest                  1.2-3      2021-10-07 [1] CRAN (R 4.3.0)
 gprofiler2               0.2.2      2023-06-14 [1] CRAN (R 4.3.0)
 gridExtra                2.3        2017-09-09 [1] CRAN (R 4.3.0)
 gtable                   0.3.4      2023-08-21 [1] CRAN (R 4.3.0)
 highr                    0.10       2022-12-22 [1] CRAN (R 4.3.0)
 hms                      1.1.3      2023-03-21 [1] CRAN (R 4.3.0)
 htmltools                0.5.7      2023-11-03 [1] CRAN (R 4.3.0)
 htmlwidgets              1.6.4      2023-12-06 [1] CRAN (R 4.3.0)
 httpuv                   1.6.13     2023-12-06 [1] CRAN (R 4.3.0)
 httr                     1.4.7      2023-08-15 [1] CRAN (R 4.3.0)
 ica                      1.0-3      2022-07-08 [1] CRAN (R 4.3.0)
 igraph                   2.0.1.1    2024-01-30 [1] CRAN (R 4.3.2)
 interactiveDisplayBase   1.40.0     2023-10-24 [1] Bioconductor
 IRanges                * 2.36.0     2023-10-24 [1] Bioconductor
 irlba                    2.3.5.1    2022-10-03 [1] CRAN (R 4.3.0)
 iterators                1.0.14     2022-02-05 [1] CRAN (R 4.3.0)
 jsonlite                 1.8.8      2023-12-04 [1] CRAN (R 4.3.0)
 kableExtra               1.3.4      2021-02-20 [1] CRAN (R 4.3.0)
 KEGGREST                 1.42.0     2023-10-24 [1] Bioconductor
 KernSmooth               2.23-22    2023-07-10 [1] CRAN (R 4.3.2)
 knitr                    1.45       2023-10-30 [1] CRAN (R 4.3.0)
 labeling                 0.4.3      2023-08-29 [1] CRAN (R 4.3.0)
 later                    1.3.2      2023-12-06 [1] CRAN (R 4.3.0)
 lattice                  0.22-5     2023-10-24 [1] CRAN (R 4.3.0)
 lazyeval                 0.2.2      2019-03-15 [1] CRAN (R 4.3.0)
 leiden                   0.4.3.1    2023-11-17 [1] CRAN (R 4.3.0)
 lifecycle                1.0.4      2023-11-07 [1] CRAN (R 4.3.0)
 limma                    3.58.1     2023-10-31 [1] Bioconductor
 listenv                  0.9.0      2022-12-16 [1] CRAN (R 4.3.0)
 lme4                     1.1-35.1   2023-11-05 [1] CRAN (R 4.3.0)
 lmtest                   0.9-40     2022-03-21 [1] CRAN (R 4.3.0)
 magrittr               * 2.0.3      2022-03-30 [1] CRAN (R 4.3.0)
 MASS                     7.3-60     2023-05-04 [1] CRAN (R 4.3.0)
 Matrix                   1.6-5      2024-01-11 [1] CRAN (R 4.3.0)
 MatrixGenerics         * 1.14.0     2023-10-24 [1] Bioconductor
 matrixStats            * 1.2.0      2023-12-11 [1] CRAN (R 4.3.0)
 memoise                  2.0.1      2021-11-26 [1] CRAN (R 4.3.0)
 mgcv                     1.9-1      2023-12-21 [1] CRAN (R 4.3.0)
 mime                     0.12       2021-09-28 [1] CRAN (R 4.3.0)
 miniUI                   0.1.1.1    2018-05-18 [1] CRAN (R 4.3.0)
 minqa                    1.2.6      2023-09-11 [1] CRAN (R 4.3.0)
 munsell                  0.5.0      2018-06-12 [1] CRAN (R 4.3.0)
 mvtnorm                  1.2-4      2023-11-27 [1] CRAN (R 4.3.0)
 nlme                     3.1-164    2023-11-27 [1] CRAN (R 4.3.0)
 nloptr                   2.0.3      2022-05-26 [1] CRAN (R 4.3.0)
 numDeriv                 2016.8-1.1 2019-06-06 [1] CRAN (R 4.3.0)
 paletteer                1.6.0      2024-01-21 [1] CRAN (R 4.3.0)
 parallelly               1.36.0     2023-05-26 [1] CRAN (R 4.3.0)
 patchwork              * 1.2.0      2024-01-08 [1] CRAN (R 4.3.0)
 pbapply                  1.7-2      2023-06-27 [1] CRAN (R 4.3.0)
 pillar                   1.9.0      2023-03-22 [1] CRAN (R 4.3.0)
 pkgconfig                2.0.3      2019-09-22 [1] CRAN (R 4.3.0)
 plotly                   4.10.4     2024-01-13 [1] CRAN (R 4.3.0)
 plyr                     1.8.9      2023-10-02 [1] CRAN (R 4.3.0)
 png                      0.1-8      2022-11-29 [1] CRAN (R 4.3.0)
 polyclip                 1.10-6     2023-09-27 [1] CRAN (R 4.3.0)
 presto                   1.0.0      2023-12-27 [1] Github (immunogenomics/presto@31dc97f)
 prettyunits              1.2.0      2023-09-24 [1] CRAN (R 4.3.0)
 princurve              * 2.1.6      2021-01-18 [1] CRAN (R 4.3.0)
 prismatic                1.1.1      2022-08-15 [1] CRAN (R 4.3.0)
 progress                 1.2.3      2023-12-06 [1] CRAN (R 4.3.0)
 progressr                0.14.0     2023-08-10 [1] CRAN (R 4.3.0)
 promises                 1.2.1      2023-08-10 [1] CRAN (R 4.3.0)
 ProtGenerics             1.34.0     2023-10-24 [1] Bioconductor
 ps                       1.7.5      2023-04-18 [1] CRAN (R 4.3.0)
 purrr                    1.0.2      2023-08-10 [1] CRAN (R 4.3.0)
 R6                       2.5.1      2021-08-19 [1] CRAN (R 4.3.0)
 RANN                     2.6.1      2019-01-08 [1] CRAN (R 4.3.0)
 rappdirs                 0.3.3      2021-01-31 [1] CRAN (R 4.3.0)
 RColorBrewer             1.1-3      2022-04-03 [1] CRAN (R 4.3.0)
 Rcpp                     1.0.11     2023-07-06 [1] CRAN (R 4.3.0)
 RcppAnnoy                0.0.21     2023-07-02 [1] CRAN (R 4.3.0)
 RcppEigen                0.3.3.9.4  2023-11-02 [1] CRAN (R 4.3.0)
 RcppHNSW                 0.5.0      2023-09-19 [1] CRAN (R 4.3.0)
 RCurl                    1.98-1.13  2023-11-02 [1] CRAN (R 4.3.0)
 rematch2                 2.1.2      2020-05-01 [1] CRAN (R 4.3.0)
 reshape2                 1.4.4      2020-04-09 [1] CRAN (R 4.3.0)
 restfulr                 0.0.15     2022-06-16 [1] CRAN (R 4.3.0)
 reticulate               1.34.0     2023-10-12 [1] CRAN (R 4.3.0)
 rjson                    0.2.21     2022-01-09 [1] CRAN (R 4.3.0)
 rlang                    1.1.2      2023-11-04 [1] CRAN (R 4.3.0)
 rmarkdown                2.25       2023-09-18 [1] CRAN (R 4.3.0)
 rmio                     0.4.0      2022-02-17 [1] CRAN (R 4.3.0)
 ROCR                     1.0-11     2020-05-02 [1] CRAN (R 4.3.0)
 Rsamtools                2.18.0     2023-10-24 [1] Bioconductor
 RSpectra                 0.16-1     2022-04-24 [1] CRAN (R 4.3.0)
 RSQLite                  2.3.4      2023-12-08 [1] CRAN (R 4.3.0)
 rstudioapi               0.15.0     2023-07-07 [1] CRAN (R 4.3.0)
 rtracklayer              1.62.0     2023-10-24 [1] Bioconductor
 Rtsne                    0.17       2023-12-07 [1] CRAN (R 4.3.0)
 rvest                    1.0.3      2022-08-19 [1] CRAN (R 4.3.0)
 S4Arrays                 1.2.0      2023-10-24 [1] Bioconductor
 S4Vectors              * 0.40.2     2023-11-23 [1] Bioconductor
 scales                   1.3.0      2023-11-28 [1] CRAN (R 4.3.0)
 scattermore              1.2        2023-06-12 [1] CRAN (R 4.3.0)
 scLANE                 * 0.7.9      2024-02-18 [1] Bioconductor
 scRNAseq               * 2.16.0     2023-10-26 [1] Bioconductor
 sctransform              0.4.1      2023-10-19 [1] CRAN (R 4.3.0)
 sessioninfo              1.2.2      2021-12-06 [1] CRAN (R 4.3.0)
 Seurat                 * 5.0.1      2023-11-17 [1] CRAN (R 4.3.0)
 SeuratObject           * 5.0.1      2023-11-17 [1] CRAN (R 4.3.0)
 shiny                    1.8.0      2023-11-17 [1] CRAN (R 4.3.0)
 SingleCellExperiment   * 1.24.0     2023-10-24 [1] Bioconductor
 slingshot              * 2.10.0     2023-10-24 [1] Bioconductor
 snow                     0.4-4      2021-10-27 [1] CRAN (R 4.3.0)
 sp                     * 2.1-2      2023-11-26 [1] CRAN (R 4.3.0)
 spam                     2.10-0     2023-10-23 [1] CRAN (R 4.3.0)
 SparseArray              1.2.3      2023-12-25 [1] Bioconductor 3.18 (R 4.3.2)
 sparseMatrixStats        1.14.0     2023-10-24 [1] Bioconductor
 spatstat.data            3.0-3      2023-10-24 [1] CRAN (R 4.3.0)
 spatstat.explore         3.2-5      2023-10-22 [1] CRAN (R 4.3.0)
 spatstat.geom            3.2-7      2023-10-20 [1] CRAN (R 4.3.0)
 spatstat.random          3.2-2      2023-11-29 [1] CRAN (R 4.3.0)
 spatstat.sparse          3.0-3      2023-10-24 [1] CRAN (R 4.3.0)
 spatstat.utils           3.0-4      2023-10-24 [1] CRAN (R 4.3.0)
 statmod                  1.5.0      2023-01-06 [1] CRAN (R 4.3.0)
 stringi                  1.8.3      2023-12-11 [1] CRAN (R 4.3.0)
 stringr                  1.5.1      2023-11-14 [1] CRAN (R 4.3.0)
 SummarizedExperiment   * 1.32.0     2023-10-24 [1] Bioconductor
 survival                 3.5-7      2023-08-14 [1] CRAN (R 4.3.2)
 svglite                  2.1.3      2023-12-08 [1] CRAN (R 4.3.0)
 systemfonts              1.0.5      2023-10-09 [1] CRAN (R 4.3.0)
 tensor                   1.5        2012-05-05 [1] CRAN (R 4.3.0)
 tibble                   3.2.1      2023-03-20 [1] CRAN (R 4.3.0)
 tidyr                    1.3.1      2024-01-24 [1] CRAN (R 4.3.2)
 tidyselect               1.2.0      2022-10-10 [1] CRAN (R 4.3.0)
 TMB                      1.9.10     2023-12-12 [1] CRAN (R 4.3.0)
 TrajectoryUtils        * 1.10.0     2023-10-24 [1] Bioconductor
 utf8                     1.2.4      2023-10-22 [1] CRAN (R 4.3.0)
 uwot                     0.1.16     2023-06-29 [1] CRAN (R 4.3.0)
 vctrs                    0.6.5      2023-12-01 [1] CRAN (R 4.3.0)
 vipor                    0.4.7      2023-12-18 [1] CRAN (R 4.3.0)
 viridisLite              0.4.2      2023-05-02 [1] CRAN (R 4.3.0)
 webshot                  0.5.5      2023-06-26 [1] CRAN (R 4.3.0)
 withr                    2.5.2      2023-10-30 [1] CRAN (R 4.3.0)
 xfun                     0.41       2023-11-01 [1] CRAN (R 4.3.0)
 XML                      3.99-0.16  2023-11-29 [1] CRAN (R 4.3.0)
 xml2                     1.3.6      2023-12-04 [1] CRAN (R 4.3.0)
 xtable                   1.8-4      2019-04-21 [1] CRAN (R 4.3.0)
 XVector                  0.42.0     2023-10-24 [1] Bioconductor
 yaml                     2.3.8      2023-12-11 [1] CRAN (R 4.3.0)
 zlibbioc                 1.48.0     2023-10-24 [1] Bioconductor
 zoo                      1.8-12     2023-04-13 [1] CRAN (R 4.3.0)

 [1] /Library/Frameworks/R.framework/Versions/4.3-x86_64/Resources/library

─ Python configuration ───────────────────────────────────────────────────────
 python:         /Users/jack/Desktop/PhD/Research/Python_Envs/personal_site/bin/python
 libpython:      /usr/local/opt/python@3.11/Frameworks/Python.framework/Versions/3.11/lib/python3.11/config-3.11-darwin/libpython3.11.dylib
 pythonhome:     /Users/jack/Desktop/PhD/Research/Python_Envs/personal_site:/Users/jack/Desktop/PhD/Research/Python_Envs/personal_site
 version:        3.11.6 (main, Nov  2 2023, 04:52:24) [Clang 14.0.3 (clang-1403.0.22.14.1)]
 numpy:          /Users/jack/Desktop/PhD/Research/Python_Envs/personal_site/lib/python3.11/site-packages/numpy
 numpy_version:  1.26.3
 leidenalg:      /Users/jack/Desktop/PhD/Research/Python_Envs/personal_site/lib/python3.11/site-packages/leidenalg
 
 NOTE: Python version was forced by use_python() function

──────────────────────────────────────────────────────────────────────────────

1 Introduction

2 Libraries

3 Color palettes

4 Data

5 Preprocessing

6 Analysis

6.1 Celltype annotation

6.2 Trajectory inference

6.2.1 Pseudotime estimation

6.2.2 Trajectory DE testing

6.2.3 Gene dynamics

6.2.4 Distribution of knots from scLANE

6.2.5 Gene-level embeddings

6.2.6 Gene set enrichment analysis

7 Conclusions

8 Session info

6.2.4 Distribution of knots from `scLANE`