---
license: mit
language:
- en
metrics:
- accuracy
pipeline_tag: graph-ml
tags:
- chemistry
- biology
- medical
---
# PLA-Net

## Model Details

### Model Description

**PLA-Net** is a deep learning model that predicts interactions between small organic molecules (ligands) and any of the 102 target proteins in the Actives as Decoys (AD) dataset. It converts ligand and protein sequences into graph representations and uses Graph Convolutional Networks (GCNs) to predict target-ligand interaction probabilities. It was developed by [BCV-Uniandes](https://github.com/BCV-Uniandes/PLA-Net).

## Key Features

- **Graph-Based Input Representation**
  - **Ligand Module (LM):** Converts SMILES sequences of molecules into graph representations.
  - **Protein Module (PM):** Transforms FASTA sequences of proteins into graph structures.

- **Deep Graph Convolutional Networks**
  - Each module employs a deep GCN followed by an average pooling layer to extract meaningful features from the input graphs.

- **Interaction Prediction**
  - The feature representations from the LM and PM are concatenated.
  - A fully connected layer processes the combined features to predict the interaction probability between the ligand and the target protein.
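
The pooling, concatenation, and prediction steps above can be sketched in plain Python. This is a minimal illustration, not the authors' implementation: the toy node embeddings stand in for the outputs of the two deep GCNs, and the function names, the weights, and the choice of a sigmoid output are all assumptions made for the example.

```python
import math

def mean_pool(node_features):
    """Average per-node feature vectors into one graph-level vector."""
    n = len(node_features)
    dim = len(node_features[0])
    return [sum(node[i] for node in node_features) / n for i in range(dim)]

def predict_interaction(ligand_nodes, protein_nodes, weights, bias):
    """Concatenate pooled LM and PM features, then apply one linear layer.

    Assumption: a sigmoid maps the logit to a probability.
    """
    combined = mean_pool(ligand_nodes) + mean_pool(protein_nodes)  # list concatenation
    logit = sum(w * x for w, x in zip(weights, combined)) + bias
    return 1.0 / (1.0 + math.exp(-logit))

# Toy node embeddings (hypothetical), one row per graph node.
ligand_nodes = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
protein_nodes = [[0.5, 0.5], [1.5, 0.5]]
weights = [0.2, -0.1, 0.4, 0.3]  # illustrative values, not trained parameters

probability = predict_interaction(ligand_nodes, protein_nodes, weights, bias=0.0)
print(f"interaction probability: {probability:.4f}")
```

Average pooling makes the graph-level vector independent of the number of nodes, so ligands and proteins of different sizes produce feature vectors of the same length before concatenation.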

- **Developed by:** [BCV-Uniandes](https://github.com/BCV-Uniandes/PLA-Net).
- **Model type:** Graph Convolutional Networks (GCNs)
- **Language(s) (NLP):** Python
- **License:** MIT

### Model Sources

- **Repository Fork:** [https://github.com/juliocesar-io/PLA-Net](https://github.com/juliocesar-io/PLA-Net)
- **Repository Official:** [https://github.com/BCV-Uniandes/PLA-Net](https://github.com/BCV-Uniandes/PLA-Net)
- **Paper:**  [https://www.nature.com/articles/s41598-022-12180-x](https://www.nature.com/articles/s41598-022-12180-x)
- **Demo:** [https://huggingface.co/spaces/juliocesar-io/PLA-Net](https://huggingface.co/spaces/juliocesar-io/PLA-Net)

## Docker Install

To prevent conflicts with the host machine, it is recommended to run PLA-Net in a Docker container.

First, make sure you have an NVIDIA GPU and the [NVIDIA Container Toolkit](https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/install-guide.html) installed. Then build the image with the following command:

```bash
docker build -t pla-net:latest .
```

### Inference

To run inference, run the following command:

```bash
docker run \
    -it --rm --gpus all \
    -v "$(pwd)":/home/user/output \
    pla-net:latest \
    python /home/user/app/scripts/pla_net_inference.py \
    --use_gpu \
    --target ada \
    --target_list /home/user/app/data/datasets/AD/Targets_Fasta.csv \
    --target_checkpoint_path /home/user/app/pretrained-models/BINARY_ada \
    --input_file_smiles /home/user/app/example/input_smiles.csv \
    --output_file /home/user/output/output_predictions.csv
```


This runs inference for the target protein `ada` on the SMILES in the `input_smiles.csv` file and saves the predictions to the `output_predictions.csv` file.

The prediction file has the following format:

```csv
target,smiles,interaction_probability,interaction_class
ada,Cn4c(CCC(=O)Nc3ccc2ccn(CC[C@H](CO)n1cnc(C(N)=O)c1)c2c3)nc5ccccc45,0.9994347542524338,1
```

Where `interaction_class` is 1 if the interaction probability is greater than 0.5, and 0 otherwise.
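
As a rough sketch of how the prediction file could be consumed, the snippet below parses rows in the format shown above with Python's standard `csv` module and recomputes `interaction_class` from the 0.5 threshold. The function name `load_predictions` is hypothetical, and the sample row is the one from this README, read from memory instead of from disk.

```python
import csv
import io

THRESHOLD = 0.5  # interaction_class is 1 when the probability exceeds this value

def load_predictions(handle, threshold=THRESHOLD):
    """Read prediction rows and recompute the class from the probability."""
    rows = []
    for row in csv.DictReader(handle):
        probability = float(row["interaction_probability"])
        rows.append({
            "target": row["target"],
            "smiles": row["smiles"],
            "interaction_probability": probability,
            "interaction_class": int(probability > threshold),
        })
    return rows

# Sample content copied from the example output above.
sample = io.StringIO(
    "target,smiles,interaction_probability,interaction_class\n"
    "ada,Cn4c(CCC(=O)Nc3ccc2ccn(CC[C@H](CO)n1cnc(C(N)=O)c1)c2c3)nc5ccccc45,"
    "0.9994347542524338,1\n"
)
predictions = load_predictions(sample)
print(predictions[0]["target"], predictions[0]["interaction_class"])
```

To read a real file, pass `open("output_predictions.csv", newline="")` in place of the in-memory sample.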


*Inference arguments:*

- `--use_gpu`: Use the GPU for inference.
- `--target`: Target protein ID from the list of targets. Check the list of available targets in the [data](https://github.com/juliocesar-io/PLA-Net/blob/main/data/datasets/AD/Targets_Fasta.csv) folder.
- `--target_list`: Path to the target list CSV file.
- `--target_checkpoint_path`: Path to the checkpoint for the selected target (e.g. `/workspace/pretrained-models/BINARY_ada`). There is one checkpoint per target.
- `--input_file_smiles`: Path to the input SMILES file.
- `--output_file`: Path to the output predictions file.


### Gradio Server
We provide a simple graphical user interface to run PLA-Net with Gradio. To use it, run the following command:

```bash
docker run \
    -it --rm --gpus all \
    -p 7860:7860 \
    pla-net:latest \
    python app.py
```

Then open your browser and go to `http://localhost:7860/` to access the web interface. 

    
## Local Install

To run inference with PLA-Net locally, first install the dependencies and activate the environment:

```bash
conda env create -f environment.yml
conda activate pla-net
```

Now you can run inference with PLA-Net locally. In the project folder, run the following command:

```bash
python scripts/pla_net_inference.py \
    --use_gpu \
    --target ada \
    --target_list data/datasets/AD/Targets_Fasta.csv \
    --target_checkpoint_path pretrained-models/BINARY_ada \
    --input_file_smiles example/input_smiles.csv \
    --output_file example/output_predictions.csv
```
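
To script the same call for several targets, the command can be assembled in Python. This is only a sketch: `build_inference_command` is a hypothetical helper, it uses just the flags shown above, and it assumes every checkpoint follows the `BINARY_<target>` naming seen for `ada`, which should be checked against the downloaded models.

```python
import shlex

def build_inference_command(target, smiles_file, output_file,
                            target_list="data/datasets/AD/Targets_Fasta.csv",
                            checkpoint_dir="pretrained-models", use_gpu=True):
    """Return the argument list for scripts/pla_net_inference.py."""
    command = ["python", "scripts/pla_net_inference.py"]
    if use_gpu:
        command.append("--use_gpu")
    command += [
        "--target", target,
        "--target_list", target_list,
        # Assumption: checkpoints are named BINARY_<target>, as in the `ada` example.
        "--target_checkpoint_path", f"{checkpoint_dir}/BINARY_{target}",
        "--input_file_smiles", smiles_file,
        "--output_file", output_file,
    ]
    return command

command = build_inference_command(
    "ada", "example/input_smiles.csv", "example/output_predictions.csv"
)
print(shlex.join(command))  # shell-quoted form, ready to copy or log
```

The returned list can be passed directly to `subprocess.run(command, check=True)` from the project folder, which avoids shell quoting issues.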

## Models
You can download the pre-trained models from [Hugging Face](https://huggingface.co/juliocesar-io/PLA-Net).

## Training

To train any of the components of our method (LM, LM+Advs, LMPM, and PLA-Net), refer to the `planet.sh` file and run the desired models.

To evaluate any of these components, run the corresponding bash file in the `inference` folder.


## Citation

**BibTeX:**

```bibtex
@article{ruiz2022predicting,
  title={Predicting target--ligand interactions with graph convolutional networks for interpretable pharmaceutical discovery},
  author={Ruiz Puentes, Paola and Rueda-Gensini, Laura and Valderrama, Natalia and Hern{\'a}ndez, Isabela and Gonz{\'a}lez, Cristina and Daza, Laura and Mu{\~n}oz-Camargo, Carolina and Cruz, Juan C and Arbel{\'a}ez, Pablo},
  journal={Scientific reports},
  volume={12},
  number={1},
  pages={1--17},
  year={2022},
  publisher={Nature Publishing Group}
}
```

## Model Card Authors

- [Julio César](https://juliocesar.io/) / Contact: [email protected]