#!/usr/bin/env python
# coding: utf-8
# # Prediction of absolute developmental potential using CytoTrace2
#
# CytoTRACE 2 is a computational method for predicting cellular potency categories and absolute developmental potential from single-cell RNA-sequencing data.
#
# Potency categories in the context of CytoTRACE 2 classify cells based on their developmental potential, ranging from totipotent and pluripotent cells with broad differentiation potential, to lineage-restricted multipotent, oligopotent and unipotent cells capable of producing varying numbers of downstream cell types, and finally to differentiated cells, ranging from mature to terminally differentiated phenotypes.
#
# We made three improvements in integrating the CytoTrace2 algorithm in OmicVerse:
#
# - No additional packages need to be installed, including R.
# - We fixed a bug in the multi-threaded pools to avoid potential errors.
# - `anndata` is supported natively, so you don't need to export an `input_file` and an `annotation_file`.
#
# If you found this tutorial helpful, please cite CytoTrace2 and OmicVerse:
#
# Kang, M., Armenteros, J. J. A., Gulati, G. S., Gleyzer, R., Avagyan, S., Brown, E. L., Zhang, W., Usmani, A., Earland, N., Wu, Z., Zou, J., Fields, R. C., Chen, D. Y., Chaudhuri, A. A., & Newman, A. M. (2024). Mapping single-cell developmental potential in health and disease with interpretable deep learning. bioRxiv : the preprint server for biology, 2024.03.19.585637. https://doi.org/10.1101/2024.03.19.585637
# In[1]:
import omicverse as ov
ov.plot_set()
# ## Preprocess data
#
# As an example, we apply CytoTRACE 2 to the dentate gyrus neurogenesis dataset from scVelo, which comprises multiple heterogeneous subpopulations.
# In[2]:
import scvelo as scv
adata=scv.datasets.dentategyrus()
adata
# In[4]:
adata = ov.pp.preprocess(adata, mode='shiftlog|pearson', n_HVGs=2000)
adata
# ## Predict cytotrace2
#
# CytoTRACE 2 needs its pre-trained model weights, which come in two sets (`17_models_weights` and `5_models_weights`). Download them from one of the following links:
#
# - Figshare:
# https://figshare.com/ndownloader/files/47258749
#
# - or Github:
# https://github.com/digitalcytometry/cytotrace2/tree/main/cytotrace2_python/cytotrace2_py/resources/17_models_weights
# https://github.com/digitalcytometry/cytotrace2/tree/main/cytotrace2_python/cytotrace2_py/resources/5_models_weights
#
# All parameters are explained as follows:
# - adata: AnnData object containing the scRNA-seq data.
# - use_model_dir: Path to the directory containing the pre-trained model files.
# - species: The species of the input data. Default is "mouse".
# - batch_size: The number of cells to process in each batch. Default is 10000.
# - smooth_batch_size: The number of cells to process in each batch for smoothing. Default is 1000.
# - disable_parallelization: If True, disable parallel processing. Default is False.
# - max_cores: Maximum number of CPU cores to use for parallel processing. If None, all available cores will be used. Default is None.
# - max_pcs: Maximum number of principal components to use. Default is 200.
# - seed: Random seed for reproducibility. Default is 14.
# - output_dir: Directory to save the results. Default is 'cytotrace2_results'.
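#
# Before calling `ov.single.cytotrace2`, it can help to confirm that `use_model_dir` points at a directory that actually contains files. The following is a minimal sketch, not part of OmicVerse: `list_model_files` is a hypothetical helper written for this tutorial, and it makes no assumption about the names of the weight files.
# In[ ]:
from pathlib import Path

def list_model_files(model_dir):
    """Return the sorted names of the files directly inside model_dir (hypothetical helper)."""
    model_dir = Path(model_dir)
    if not model_dir.is_dir():
        raise FileNotFoundError(f"Model directory not found: {model_dir}")
    files = sorted(p.name for p in model_dir.iterdir() if p.is_file())
    if not files:
        raise FileNotFoundError(f"No files found in model directory: {model_dir}")
    return files

# Report a missing or empty directory up front instead of failing inside the prediction step.
try:
    print(list_model_files("cymodels/5_models_weights"))
except FileNotFoundError as err:
    print(err)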
# In[5]:
results = ov.single.cytotrace2(
    adata,
    use_model_dir="cymodels/5_models_weights",
    species="mouse",
    batch_size=10000,
    smooth_batch_size=1000,
    disable_parallelization=False,
    max_cores=None,
    max_pcs=200,
    seed=14,
    output_dir='cytotrace2_results',
)
# ## Visualizing
#
# By visualizing the results, we can directly compare the predicted potency scores with the known developmental stages of the cells and check how closely the predictions follow the known biology.
# In[8]:
ov.utils.embedding(
    adata,
    basis='X_umap',
    color=['clusters', 'CytoTRACE2_Score'],
    frameon='small',
    cmap='Reds',
    wspace=0.55,
)
# - Left: the distribution of the different cell types in UMAP space.
# - Right: the CytoTRACE 2 scores of the same cells; cells with high scores are predicted to have higher developmental potential, that is, to be in a less differentiated state.
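#
# The comparison between cell types and scores can also be made numerically. The following is a minimal sketch, assuming the scores are stored in `adata.obs['CytoTRACE2_Score']` as the plotting call suggests. `summarize_potency` is a hypothetical helper, and the toy values are invented purely to illustrate it; they are not results from this dataset.
# In[ ]:
import pandas as pd

def summarize_potency(obs, group_key="clusters", score_key="CytoTRACE2_Score"):
    """Median score per group, sorted from highest to lowest (hypothetical helper)."""
    return (
        obs.groupby(group_key, observed=True)[score_key]
        .median()
        .sort_values(ascending=False)
    )

# Toy stand-in for adata.obs; the scores are made up for illustration only.
toy_obs = pd.DataFrame({
    "clusters": ["Radial Glia-like", "Radial Glia-like",
                 "Neuroblast", "Neuroblast",
                 "Granule mature", "Granule mature"],
    "CytoTRACE2_Score": [0.42, 0.38, 0.25, 0.21, 0.05, 0.07],
})
cluster_medians = summarize_potency(toy_obs)
print(cluster_medians)

# On the real data, the same call would be: summarize_potency(adata.obs)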
# In[9]:
ov.utils.embedding(
    adata,
    basis='X_umap',
    color=['CytoTRACE2_Potency', 'CytoTRACE2_Relative'],
    frameon='small',
    cmap='Reds',
    wspace=0.55,
)
# - Potency category:
# The UMAP embedding colored by predicted potency category shows the discrete classification of cells, with possible values Differentiated, Unipotent, Oligopotent, Multipotent, Pluripotent, and Totipotent.
# - Relative order:
# The UMAP embedding colored by predicted relative order, which is the absolute predicted potency score normalized to the range 0 (more differentiated) to 1 (less differentiated). It gives the relative ordering of cells by developmental potential.
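#
# The normalization behind the relative order can be sketched as follows. This is a minimal illustration that assumes plain min-max scaling of the scores; the exact procedure used by CytoTRACE 2 may differ, and `relative_order` is a hypothetical helper, not an OmicVerse function.
# In[ ]:
import numpy as np

def relative_order(scores):
    """Rescale scores so the smallest maps to 0 and the largest to 1 (assumed min-max scaling)."""
    scores = np.asarray(scores, dtype=float)
    lo, hi = scores.min(), scores.max()
    if hi == lo:
        # All cells share one score, so there is no ordering to express.
        return np.zeros_like(scores)
    return (scores - lo) / (hi - lo)

# Toy scores, invented for illustration only.
toy_scores = [0.05, 0.25, 0.45]
toy_relative = relative_order(toy_scores)
print(toy_relative)

# On the real data, the input would be: adata.obs['CytoTRACE2_Score'].values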