#!/usr/bin/env python
# coding: utf-8

# # Prediction of absolute developmental potential using CytoTRACE 2
# 
# CytoTRACE 2 is a computational method for predicting cellular potency categories and absolute developmental potential from single-cell RNA-sequencing data.
# 
# Potency categories in the context of CytoTRACE 2 classify cells based on their developmental potential, ranging from totipotent and pluripotent cells with broad differentiation potential to lineage-restricted oligopotent, multipotent and unipotent cells capable of producing varying numbers of downstream cell types, and finally, differentiated cells, ranging from mature to terminally differentiated phenotypes.
# 
# We made three improvements when integrating the CytoTRACE 2 algorithm into OmicVerse:
# 
# - No additional packages to install, including R.
# - We fixed a bug in the multi-threaded pool to avoid potential errors.
# - Native support for `anndata`: you don't need to export an `input_file` and `annotation_file`.
# 
# If you found this tutorial helpful, please cite CytoTrace2 and OmicVerse:
# 
# Kang, M., Armenteros, J. J. A., Gulati, G. S., Gleyzer, R., Avagyan, S., Brown, E. L., Zhang, W., Usmani, A., Earland, N., Wu, Z., Zou, J., Fields, R. C., Chen, D. Y., Chaudhuri, A. A., & Newman, A. M. (2024). Mapping single-cell developmental potential in health and disease with interpretable deep learning. bioRxiv : the preprint server for biology, 2024.03.19.585637. https://doi.org/10.1101/2024.03.19.585637

# In[1]:


import omicverse as ov
ov.plot_set()


# ## Preprocess data
# 
# As an example, we apply CytoTRACE 2 to the dentate gyrus neurogenesis dataset, which comprises multiple heterogeneous subpopulations.

# In[2]:


import scvelo as scv
adata=scv.datasets.dentategyrus()
adata


# In[4]:


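# Preprocess with the 'shiftlog|pearson' mode and keep 2,000 highly variable genes.
# (The notebook ran this cell under the IPython `%%time` magic; plain Python is used here.)
adata = ov.pp.preprocess(adata, mode='shiftlog|pearson', n_HVGs=2000)
adata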


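# ## Predict with CytoTRACE 2
# 
# We first need to download the pre-trained CytoTRACE 2 model weights and point `use_model_dir` at one of the weight directories (`17_models_weights` or `5_models_weights`). Download links for the models: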
# 
# - Figshare:
#   https://figshare.com/ndownloader/files/47258749
# 
# - or Github:
#     https://github.com/digitalcytometry/cytotrace2/tree/main/cytotrace2_python/cytotrace2_py/resources/17_models_weights
#     https://github.com/digitalcytometry/cytotrace2/tree/main/cytotrace2_python/cytotrace2_py/resources/5_models_weights
# 
# All parameters are explained as follows:
# - adata: AnnData object containing the scRNA-seq data.
# - use_model_dir: Path to the directory containing the pre-trained model files.
# - species: The species of the input data. Default is "mouse".
# - batch_size: The number of cells to process in each batch. Default is 10000.
# - smooth_batch_size: The number of cells to process in each batch for smoothing. Default is 1000.
# - disable_parallelization: If True, disable parallel processing. Default is False.
# - max_cores: Maximum number of CPU cores to use for parallel processing. If None, all available cores will be used. Default is None.
# - max_pcs: Maximum number of principal components to use. Default is 200.
# - seed: Random seed for reproducibility. Default is 14.
# - output_dir: Directory to save the results. Default is 'cytotrace2_results'.
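# 
# To illustrate what `batch_size` controls, here is a minimal sketch of splitting cells into consecutive batches. The helper `batch_slices` is hypothetical and is not part of OmicVerse or CytoTRACE 2; it only assumes that cells are processed in contiguous chunks of at most `batch_size` cells, which may differ from the actual implementation.

```python
def batch_slices(n_cells, batch_size):
    """Return (start, stop) index pairs covering n_cells in chunks of batch_size."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    return [(start, min(start + batch_size, n_cells))
            for start in range(0, n_cells, batch_size)]


# Example: 25,000 cells with the default batch_size of 10,000
slices = batch_slices(25000, 10000)
print(slices)  # [(0, 10000), (10000, 20000), (20000, 25000)]
```

# Under this assumption, a dataset smaller than `batch_size` is handled as a single batch.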

# In[5]:


results =  ov.single.cytotrace2(adata,
    use_model_dir="cymodels/5_models_weights",
    species="mouse",
    batch_size = 10000,
    smooth_batch_size = 1000,
    disable_parallelization = False,
    max_cores = None,
    max_pcs = 200,
    seed = 14,
    output_dir = 'cytotrace2_results'
)


# ## Visualizing
# 
# By visualizing the results, we can directly compare the predicted potency scores with the known developmental stage of the cells and check how well the predictions agree with the known biology.

# In[8]:


ov.utils.embedding(adata,basis='X_umap',
                   color=['clusters','CytoTRACE2_Score'],
                   frameon='small',cmap='Reds',wspace=0.55)


# - Left: the distribution of the different cell types in UMAP space.
# - Right: the CytoTRACE 2 score of each cell; cells with higher scores are predicted to be less differentiated, that is, to have higher developmental potential.
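# 
# To complement the visual comparison, the scores can also be summarized per cell type. The sketch below uses a hypothetical helper, `mean_score_by_group`, written with the standard library only. With the AnnData object above, one would presumably pass `adata.obs['clusters']` and `adata.obs['CytoTRACE2_Score']`; the toy labels and values here are made up purely for illustration and are not CytoTRACE 2 output.

```python
from collections import defaultdict


def mean_score_by_group(labels, scores):
    """Average score per label, sorted from highest to lowest mean."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for label, score in zip(labels, scores):
        sums[label] += float(score)
        counts[label] += 1
    means = {label: sums[label] / counts[label] for label in sums}
    return dict(sorted(means.items(), key=lambda item: item[1], reverse=True))


# Toy input (made-up labels and values, for illustration only)
toy_labels = ["A", "A", "B", "B"]
toy_scores = [0.5, 0.25, 0.125, 0.125]
print(mean_score_by_group(toy_labels, toy_scores))  # {'A': 0.375, 'B': 0.125}
```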

# In[9]:


ov.utils.embedding(adata,basis='X_umap',
                   color=['CytoTRACE2_Potency','CytoTRACE2_Relative'],
                   frameon='small',cmap='Reds',wspace=0.55)


# - Potency category:
#     UMAP embedding of the predicted potency category, a discrete classification of each cell as Differentiated, Unipotent, Oligopotent, Multipotent, Pluripotent, or Totipotent.
# - Relative order:
#     UMAP embedding of the predicted relative order, which is based on the absolute predicted potency scores normalized to the range 0 (more differentiated) to 1 (less differentiated). It provides the relative ordering of cells by developmental potential.
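# 
# The two outputs above can be illustrated with a small sketch. It rests on two assumptions that are not confirmed by this tutorial: that potency categories correspond to six equal-width bins of a score in [0, 1], and that the relative order is a plain min-max rescaling of the absolute scores. The helpers `potency_category` and `relative_order` are hypothetical and may differ from what CytoTRACE 2 actually computes.

```python
POTENCY_CATEGORIES = [
    "Differentiated", "Unipotent", "Oligopotent",
    "Multipotent", "Pluripotent", "Totipotent",
]


def potency_category(score):
    """Map a score in [0, 1] to one of six equal-width bins (assumed binning)."""
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must lie in [0, 1]")
    n_bins = len(POTENCY_CATEGORIES)
    # A score of exactly 1.0 falls into the last bin.
    index = min(int(score * n_bins), n_bins - 1)
    return POTENCY_CATEGORIES[index]


def relative_order(scores):
    """Min-max scale scores to [0, 1]; all-equal input maps to 0.0 (assumed scaling)."""
    lowest, highest = min(scores), max(scores)
    span = highest - lowest
    if span == 0:
        return [0.0 for _ in scores]
    return [(s - lowest) / span for s in scores]


# Toy scores (made up for illustration only)
toy_scores = [0.25, 0.5, 0.75]
print([potency_category(s) for s in toy_scores])  # ['Unipotent', 'Multipotent', 'Pluripotent']
print(relative_order(toy_scores))                 # [0.0, 0.5, 1.0]
```

# Note that the relative order depends on the cells present in the dataset, whereas the category depends only on each cell's own score.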