#!/usr/bin/env python
# coding: utf-8

# # Prediction of absolute developmental potential using CytoTRACE 2
#
# CytoTRACE 2 is a computational method for predicting cellular potency categories and absolute developmental potential from single-cell RNA-sequencing data.
#
# Potency categories in CytoTRACE 2 classify cells by their developmental potential. They range from totipotent and pluripotent cells with broad differentiation potential, through lineage-restricted multipotent, oligopotent and unipotent cells capable of producing varying numbers of downstream cell types, to differentiated cells, which range from mature to terminally differentiated phenotypes.
#
# We made three improvements when integrating the CytoTRACE 2 algorithm into OmicVerse:
#
# - No additional packages need to be installed, including R.
# - We fixed a bug in the multi-threaded pools to avoid potential errors.
# - Native support for `anndata`: you don't need to export an `input_file` and an `annotation_file`.
#
# If you find this tutorial helpful, please cite CytoTRACE 2 and OmicVerse:
#
# Kang, M., Armenteros, J. J. A., Gulati, G. S., Gleyzer, R., Avagyan, S., Brown, E. L., Zhang, W., Usmani, A., Earland, N., Wu, Z., Zou, J., Fields, R. C., Chen, D. Y., Chaudhuri, A. A., & Newman, A. M. (2024). Mapping single-cell developmental potential in health and disease with interpretable deep learning. bioRxiv : the preprint server for biology, 2024.03.19.585637. https://doi.org/10.1101/2024.03.19.585637

# In[1]:

import omicverse as ov
ov.plot_set()

# ## Preprocess data
#
# As an example, we use the dentate gyrus neurogenesis dataset, which comprises multiple heterogeneous subpopulations.

# In[2]:

import scvelo as scv
adata = scv.datasets.dentategyrus()
adata

# In[4]:

# Run under the %%time cell magic in the original notebook.
adata = ov.pp.preprocess(adata, mode='shiftlog|pearson', n_HVGs=2000)
adata

# ## Predict CytoTRACE 2
#
# We need the two sets of pre-trained models from CytoTRACE 2. They can be downloaded from:
#
# - Figshare:
#   https://figshare.com/ndownloader/files/47258749
#
# - or GitHub:
#   https://github.com/digitalcytometry/cytotrace2/tree/main/cytotrace2_python/cytotrace2_py/resources/17_models_weights
#   https://github.com/digitalcytometry/cytotrace2/tree/main/cytotrace2_python/cytotrace2_py/resources/5_models_weights
#
# The parameters are:
# - adata: AnnData object containing the scRNA-seq data.
# - use_model_dir: Path to the directory containing the pre-trained model files.
# - species: The species of the input data. Default is "mouse".
# - batch_size: The number of cells to process in each batch. Default is 10000.
# - smooth_batch_size: The number of cells to process in each batch for smoothing. Default is 1000.
# - disable_parallelization: If True, disable parallel processing. Default is False.
# - max_cores: Maximum number of CPU cores to use for parallel processing. If None, all available cores will be used. Default is None.
# - max_pcs: Maximum number of principal components to use. Default is 200.
# - seed: Random seed for reproducibility. Default is 14.
# - output_dir: Directory to save the results. Default is 'cytotrace2_results'.

# In[5]:

results = ov.single.cytotrace2(
    adata,
    use_model_dir="cymodels/5_models_weights",
    species="mouse",
    batch_size=10000,
    smooth_batch_size=1000,
    disable_parallelization=False,
    max_cores=None,
    max_pcs=200,
    seed=14,
    output_dir='cytotrace2_results',
)

# ## Visualizing
#
# By visualizing the results, we can directly compare the predicted potency scores with the known developmental stages of the cells and check how well the predictions agree with the known biology.
# Take a look!

# In[8]:

ov.utils.embedding(adata, basis='X_umap',
                   color=['clusters', 'CytoTRACE2_Score'],
                   frameon='small', cmap='Reds', wspace=0.55)

# - Left: the distribution of the different cell types in UMAP space.
# - Right: the CytoTRACE 2 scores of the cells; cells with high scores are generally considered less differentiated, with higher developmental potential.

# In[9]:

ov.utils.embedding(adata, basis='X_umap',
                   color=['CytoTRACE2_Potency', 'CytoTRACE2_Relative'],
                   frameon='small', cmap='Reds', wspace=0.55)

# - Potency category:
#   The UMAP embedding of the predicted potency category reflects the discrete classification of cells into potency categories, with possible values Differentiated, Unipotent, Oligopotent, Multipotent, Pluripotent, and Totipotent.
# - Relative order:
#   The UMAP embedding of the predicted relative order, which is based on the absolute predicted potency scores normalized to the range 0 (more differentiated) to 1 (less differentiated). It provides the relative ordering of cells by developmental potential.
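
# The relative order described above can be illustrated with a small, self-contained sketch.
# It assumes the normalization is a plain min-max rescaling of the absolute scores, which is
# an assumption made here for illustration and not confirmed by this tutorial. The `obs`
# table, its cluster labels, its score values, and the `relative_order` helper are all
# hypothetical stand-ins for `adata.obs`; only the column names `clusters` and
# `CytoTRACE2_Score` come from the tutorial.

```python
import pandas as pd

# Hypothetical stand-in for adata.obs after running ov.single.cytotrace2.
# Cluster labels and score values are made up for illustration only.
obs = pd.DataFrame({
    "clusters": ["cluster_a", "cluster_a", "cluster_b",
                 "cluster_b", "cluster_c", "cluster_c"],
    "CytoTRACE2_Score": [0.40, 0.36, 0.25, 0.21, 0.10, 0.06],
})


def relative_order(scores: pd.Series) -> pd.Series:
    """Min-max rescale scores to [0, 1] (assumed normalization).

    0 marks the most differentiated cell in the dataset, 1 the least.
    """
    lo, hi = scores.min(), scores.max()
    if hi == lo:
        # All cells share one score, so no ordering can be derived.
        return pd.Series(0.0, index=scores.index)
    return (scores - lo) / (hi - lo)


# Per-cell relative order derived from the absolute scores.
obs["relative_sketch"] = relative_order(obs["CytoTRACE2_Score"])

# Mean absolute score per cluster, highest predicted potential first.
cluster_ranking = (
    obs.groupby("clusters")["CytoTRACE2_Score"]
    .mean()
    .sort_values(ascending=False)
)
print(cluster_ranking)
```

# With a real dataset, the same `groupby` could be applied to `adata.obs` directly to rank
# the annotated clusters by mean `CytoTRACE2_Score`. Because the rescaling depends on the
# minimum and maximum within one dataset, relative values of this kind are not comparable
# across datasets, whereas the absolute scores are intended to be.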