#!/usr/bin/env python
# coding: utf-8
# # Trajectory Inference with PAGA or Palantir
#
# Diffusion maps were introduced by Ronald Coifman and Stephane Lafon. The underlying idea is to treat the data as samples from a diffusion process.
#
# Palantir is an algorithm to align cells along differentiation trajectories. Palantir models differentiation as a stochastic process where stem cells differentiate to terminally differentiated cells by a series of steps through a low dimensional phenotypic manifold. Palantir effectively captures the continuity in cell states and the stochasticity in cell fate determination.
#
# Note that both methods require the user to specify the cells in the initial state. Methods that do not require such manual input, such as pyVIA, are introduced in subsequent analyses.
#
# ## Preprocess data
#
# As an example, we apply trajectory inference to dentate gyrus neurogenesis, which comprises multiple heterogeneous subpopulations.
# In[1]:
import scanpy as sc
import scvelo as scv
import matplotlib.pyplot as plt
import omicverse as ov
ov.plot_set()
# In[2]:
# Load the dentate gyrus dataset shipped with scVelo
adata=scv.datasets.dentategyrus()
adata
# In[3]:
# Normalize, keep 3000 highly variable genes, then scale and run PCA
adata=ov.pp.preprocess(adata,mode='shiftlog|pearson',n_HVGs=3000,)
adata.raw = adata
adata = adata[:, adata.var.highly_variable_features]
ov.pp.scale(adata)
ov.pp.pca(adata,layer='scaled',n_pcs=50)
# Let us inspect the contribution of single PCs to the total variance in the data. This gives us information about how many PCs we should consider in order to compute the neighborhood relations of cells. In our experience, often a rough estimate of the number of PCs does fine.
# In[4]:
ov.utils.plot_pca_variance_ratio(adata)
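# The variance-ratio plot is usually read by eye. As a rough alternative, the sketch below picks the smallest number of PCs whose cumulative explained-variance ratio reaches a chosen threshold. It is an illustration only and not part of omicverse: `n_pcs_for_threshold` is a hypothetical helper and the `variance_ratio` values are made-up toy numbers.

```python
from itertools import accumulate


def n_pcs_for_threshold(variance_ratio, threshold=0.9):
    """Smallest number of leading PCs whose cumulative explained-variance
    ratio reaches `threshold` (hypothetical helper, for illustration)."""
    for n_pcs, cumulative in enumerate(accumulate(variance_ratio), start=1):
        # Small tolerance so that floating-point round-off does not add a PC.
        if cumulative >= threshold - 1e-12:
            return n_pcs
    # Threshold never reached: fall back to all available PCs.
    return len(variance_ratio)


# Toy per-PC variance ratios, sorted from largest to smallest.
variance_ratio = [0.40, 0.25, 0.15, 0.10, 0.05, 0.03, 0.02]
chosen = n_pcs_for_threshold(variance_ratio, threshold=0.9)
```

# In practice a rough choice is usually enough, as noted above, so a rule like this is best treated as a starting point rather than a strict criterion.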
# ## Trajectory inference with diffusion map
#
# Here, we use `ov.single.TrajInfer` to construct a trajectory inference object.
# In[5]:
Traj=ov.single.TrajInfer(adata,basis='X_umap',groupby='clusters',
                         use_rep='scaled|original|X_pca',n_comps=50,)
# Use the nIPC cluster as the origin of the trajectory
Traj.set_origin_cells('nIPC')
# In[6]:
Traj.inference(method='diffusion_map')
# In[7]:
ov.utils.embedding(adata,basis='X_umap',
                   color=['clusters','dpt_pseudotime'],
                   frameon='small',cmap='Reds')
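# To make the diffusion-process idea concrete, the following toy sketch builds a random walk over a handful of points and orders them by the first non-trivial diffusion component. It is a simplified illustration in pure Python and not what `Traj.inference` runs internally: the function names are hypothetical, the eight "cells" are evenly spaced toy points, and pseudotime is taken as the distance from the root along a single component, whereas diffusion pseudotime as implemented in scanpy combines several components.

```python
import math


def gaussian_kernel(points, sigma=1.0):
    """Pairwise Gaussian affinities between 1-D points."""
    return [[math.exp(-((a - b) ** 2) / (2 * sigma ** 2)) for b in points]
            for a in points]


def first_diffusion_component(kernel, n_iter=500):
    """First non-trivial right eigenvector of the random-walk matrix T.

    T is the row-normalised kernel. Its trivial eigenvector is constant, so
    that part is removed at every power-iteration step, using the stationary
    distribution (proportional to the row sums) as weights.
    """
    n = len(kernel)
    degree = [sum(row) for row in kernel]
    T = [[k / d for k in row] for row, d in zip(kernel, degree)]
    stationary = [d / sum(degree) for d in degree]
    v = [float(i) for i in range(n)]  # a non-constant start vector
    for _ in range(n_iter):
        v = [sum(t * x for t, x in zip(row, v)) for row in T]
        mean = sum(p * x for p, x in zip(stationary, v))
        v = [x - mean for x in v]
        norm = math.sqrt(sum(x * x for x in v))
        v = [x / norm for x in v]
    return T, v


points = [float(i) for i in range(8)]  # toy cells ordered along one axis
T, component = first_diffusion_component(gaussian_kernel(points))
root = 0                               # index of the chosen origin cell
pseudotime = [abs(x - component[root]) for x in component]
```

# The root cell gets pseudotime zero by construction, which mirrors why an origin such as `nIPC` has to be supplied before inference.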
# PAGA (partition-based graph abstraction) has been benchmarked as a top-performing method for trajectory inference. It provides a graph-like map of the data topology with weighted edges corresponding to the connectivity between two clusters.
#
# Here, PAGA is extended by neighbor directionality: the pseudotime passed as `use_time_prior` serves as a time prior when orienting the edges.
# In[8]:
ov.utils.cal_paga(adata,use_time_prior='dpt_pseudotime',vkey='paga',
                  groups='clusters')
# In[9]:
ov.utils.plot_paga(adata,basis='umap', size=50, alpha=.1,title='PAGA diffusion-map graph',
                   min_edge_width=2, node_size_scale=1.5,show=False,legend_loc=False)
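# The idea of a cluster-level graph whose edges are oriented by a time prior can be sketched as follows. This is a deliberately simplified stand-in, not PAGA's actual connectivity statistic: it only counts cell-cell edges between clusters and points each edge from the cluster with the lower mean pseudotime to the one with the higher. `directed_cluster_graph`, the cluster labels, and all numbers are hypothetical toy values.

```python
from collections import Counter, defaultdict


def directed_cluster_graph(edges, labels, pseudotime):
    """Count cell-cell edges between clusters and orient every cluster pair
    from lower to higher mean pseudotime (simplified, not PAGA's statistic)."""
    sizes = Counter(labels)
    sums = defaultdict(float)
    for label, t in zip(labels, pseudotime):
        sums[label] += t
    mean_time = {c: sums[c] / sizes[c] for c in sizes}

    counts = Counter()
    for i, j in edges:
        a, b = labels[i], labels[j]
        if a == b:
            continue  # edges inside a cluster do not connect clusters
        source, target = (a, b) if mean_time[a] <= mean_time[b] else (b, a)
        counts[(source, target)] += 1
    return dict(counts), mean_time


# Six toy cells in three clusters, with a neighbor graph given as index pairs.
labels = ["nIPC", "nIPC", "Neuroblast", "Neuroblast", "Granule", "Granule"]
pseudotime = [0.0, 0.1, 0.4, 0.5, 0.9, 1.0]
edges = [(0, 1), (1, 2), (1, 3), (2, 3), (3, 4), (4, 5)]
graph, mean_time = directed_cluster_graph(edges, labels, pseudotime)
```

# The resulting dictionary maps `(source, target)` cluster pairs to edge counts, which is the same kind of information the PAGA plot encodes as arrow direction and edge width.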
# ## Trajectory inference with Slingshot
#
# Slingshot provides functions for inferring continuous, branching lineage structures in low-dimensional data. It was designed to model developmental trajectories in single-cell RNA sequencing data and to serve as a component in an analysis pipeline after dimensionality reduction and clustering. It is flexible enough to handle arbitrarily many branching events and allows for the incorporation of prior knowledge through supervised graph construction.
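# Slingshot's first stage connects cluster centers with a minimum spanning tree and reads lineages off as paths from the origin cluster; smooth curves are fitted along those paths afterwards. The sketch below covers only that first stage, on toy 2-D centroids. The function names and cluster names are hypothetical, and the curve-fitting stage is omitted.

```python
import math


def minimum_spanning_tree(centroids):
    """Prim's algorithm on Euclidean distances between cluster centroids."""
    names = list(centroids)
    in_tree = {names[0]}
    adjacency = {name: [] for name in names}
    while len(in_tree) < len(names):
        # Cheapest edge leaving the part of the tree built so far.
        _, a, b = min(
            (math.dist(centroids[a], centroids[b]), a, b)
            for a in in_tree for b in names if b not in in_tree
        )
        adjacency[a].append(b)
        adjacency[b].append(a)
        in_tree.add(b)
    return adjacency


def lineages(adjacency, root):
    """Every path from the root cluster to a leaf of the tree."""
    paths, stack = [], [[root]]
    while stack:
        path = stack.pop()
        children = [n for n in adjacency[path[-1]] if n not in path]
        if not children:
            paths.append(path)
        for child in children:
            stack.append(path + [child])
    return sorted(paths)


# Toy cluster centroids in a 2-D embedding.
centroids = {
    "root": (0.0, 0.0),
    "mid": (1.0, 0.0),
    "branch_a": (2.0, 1.0),
    "branch_b": (2.0, -1.5),
}
tree = minimum_spanning_tree(centroids)
paths = lineages(tree, root="root")
```

# Each branching event in the tree yields one additional lineage, which is how an arbitrary number of branches can be represented.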
# In[10]:
Traj=ov.single.TrajInfer(adata,basis='X_umap',groupby='clusters',
                         use_rep='scaled|original|X_pca',n_comps=50)
Traj.set_origin_cells('nIPC')
#Traj.set_terminal_cells(["Granule mature","OL","Astrocytes"])
# If you only need the inferred pseudotime and not a plot of the lineages, you can leave the `debug_axes` parameter unset.
# In[ ]:
Traj.inference(method='slingshot',num_epochs=1)
# Otherwise, you can set `debug_axes` to visualize the lineages.
# In[13]:
fig, axes = plt.subplots(nrows=2, ncols=2, figsize=(8, 8))
Traj.inference(method='slingshot',num_epochs=1,debug_axes=axes)
# In[14]:
ov.utils.embedding(adata,basis='X_umap',
                   color=['clusters','slingshot_pseudotime'],
                   frameon='small',cmap='Reds')
# In[15]:
# Recompute the neighbor graph on the PCA representation before running PAGA
sc.pp.neighbors(adata,use_rep='scaled|original|X_pca')
ov.utils.cal_paga(adata,use_time_prior='slingshot_pseudotime',vkey='paga',
                  groups='clusters')
# In[16]:
ov.utils.plot_paga(adata,basis='umap', size=50, alpha=.1,title='PAGA Slingshot-graph',
                   min_edge_width=2, node_size_scale=1.5,show=False,legend_loc=False)
# ## Trajectory inference with Palantir
#
# Palantir can be run by specifying an approximate early cell.
#
# Palantir can automatically determine the terminal states as well. In this dataset, we know the terminal states, so we set them with `Traj.set_terminal_cells`.
#
# Here, we use `ov.single.TrajInfer` to construct a trajectory inference object.
# In[17]:
Traj=ov.single.TrajInfer(adata,basis='X_umap',groupby='clusters',
                         use_rep='scaled|original|X_pca',n_comps=50)
Traj.set_origin_cells('nIPC')
Traj.set_terminal_cells(["Granule mature","OL","Astrocytes"])
# In[18]:
Traj.inference(method='palantir',num_waypoints=500)
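# Because Palantir treats differentiation as a stochastic process with terminal states, the fate of a cell can be summarised as the probability of reaching each terminal state, and its uncertainty as the entropy of those probabilities. The toy sketch below computes both for a four-state Markov chain. It is an assumption-laden illustration rather than Palantir's implementation: the state names, transition probabilities, and both helper functions are hypothetical.

```python
import math


def fate_probabilities(transitions, terminal_states, n_iter=1000):
    """Probability of ending in each terminal state from every state of a
    Markov chain, found by propagating probabilities one step at a time."""
    states = list(transitions)
    probs = {s: {t: float(s == t) for t in terminal_states} for s in states}
    for _ in range(n_iter):
        updated = {}
        for s in states:
            if s in terminal_states:
                updated[s] = probs[s]  # terminal states are absorbing
            else:
                updated[s] = {
                    t: sum(p * probs[nxt][t]
                           for nxt, p in transitions[s].items())
                    for t in terminal_states
                }
        probs = updated
    return probs


def entropy(distribution):
    """Shannon entropy (in nats) of a fate-probability distribution."""
    return -sum(p * math.log(p) for p in distribution.values() if p > 0)


# Toy chain: an early state, an intermediate state, and two terminal fates.
transitions = {
    "early": {"early": 0.5, "mid": 0.5},
    "mid": {"mid": 0.4, "fate_a": 0.45, "fate_b": 0.15},
    "fate_a": {"fate_a": 1.0},
    "fate_b": {"fate_b": 1.0},
}
terminal = ["fate_a", "fate_b"]
probs = fate_probabilities(transitions, terminal)
early_entropy = entropy(probs["early"])
```

# Uncommitted states have high entropy and terminal states have zero entropy, which is the pattern to look for when coloring cells by `palantir_entropy`.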
# Palantir results can be visualized on the tSNE or UMAP embedding. In the Palantir package this is done with the `plot_palantir_results` function; here we use `Traj.palantir_plot_pseudotime`.
# In[19]:
Traj.palantir_plot_pseudotime(embedding_basis='X_umap',cmap='RdBu_r',s=3)
# Next, the cells belonging to each branch are selected. It is often helpful to visualize the selection on the pseudotime trajectory to ensure we have isolated the correct cells for each trend. In the Palantir package this is done with the `plot_branch_selection` function; here we call `Traj.palantir_cal_branch`:
# In[20]:
Traj.palantir_cal_branch(eps=0)
# In[22]:
# Draw the trajectory towards the "Granule mature" terminal state, colored by entropy
ov.externel.palantir.plot.plot_trajectory(adata, "Granule mature",
                                          cell_color="palantir_entropy",
                                          n_arrows=10,
                                          color="red",
                                          scanpy_kwargs=dict(cmap="RdBu_r"),
                                          )
# Palantir uses the Mellon Function Estimator to determine gene expression trends along different lineages. The marker trends can be determined using the following snippet, which computes the trends for all lineages. A subset of lineages can be selected with the `lineages` parameter.
# In[23]:
gene_trends = Traj.palantir_cal_gene_trends(
    layers="MAGIC_imputed_data",
)
# In[24]:
genes = ['Cdca3','Rasl10a','Mog','Aqp4']
Traj.palantir_plot_gene_trends(genes)
plt.show()
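# A gene trend is, in essence, a smooth estimate of expression as a function of pseudotime within one lineage. The sketch below shows that idea with a plain kernel-weighted average (a Nadaraya-Watson estimator), which is deliberately swapped in for the Mellon estimator and is far simpler. `smoothed_trend`, the `weights` argument standing in for per-cell branch probabilities, and all data values are hypothetical.

```python
import math


def smoothed_trend(pseudotime, expression, weights, grid, bandwidth=0.1):
    """Nadaraya-Watson estimate of expression along pseudotime.

    `weights` plays the role of per-cell branch probabilities, so cells that
    are unlikely to belong to the lineage contribute little to its trend.
    """
    trend = []
    for t in grid:
        kernel = [
            w * math.exp(-((t - pt) ** 2) / (2 * bandwidth ** 2))
            for pt, w in zip(pseudotime, weights)
        ]
        total = sum(kernel)
        trend.append(sum(k * x for k, x in zip(kernel, expression)) / total)
    return trend


# Toy lineage: 21 cells whose expression rises linearly with pseudotime.
pseudotime = [i / 20 for i in range(21)]
expression = [2.0 * t for t in pseudotime]
weights = [1.0] * len(pseudotime)
grid = [0.25, 0.5, 0.75]
trend = smoothed_trend(pseudotime, expression, weights, grid)
```

# Setting the weight of a cell to zero removes it from the lineage entirely, which is the effect branch selection has on the trends.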
# We can also use PAGA to visualize the relationships between cell stages.
# In[25]:
ov.utils.cal_paga(adata,use_time_prior='palantir_pseudotime',vkey='paga',
                  groups='clusters')
# In[26]:
ov.utils.plot_paga(adata,basis='umap', size=50, alpha=.1,title='PAGA Palantir-graph',
                   min_edge_width=2, node_size_scale=1.5,show=False,legend_loc=False)