moshe-raboh committed on
Commit
fc4233c
1 Parent(s): 4c8d3fa

Update README.md

  1. README.md +91 -4
README.md CHANGED
@@ -1,8 +1,95 @@
  ---
  tags:
- - model_hub_mixin
  ---

- This model has been pushed to the Hub using the [PytorchModelHubMixin](https://huggingface.co/docs/huggingface_hub/package_reference/mixins#huggingface_hub.PyTorchModelHubMixin) integration:
- - Library: [More Information Needed]
- - Docs: [More Information Needed]
  ---
  tags:
+ - protein
+ - small-molecule
+ - dti
+ - ibm
+ - mammal
+ - pytorch
+ - transformers
+ library_name: biomed
+ license: apache-2.0
+ base_model:
+ - ibm/biomed.omics.bl.sm.ma-ted-400m
  ---

+ Accurate prediction of drug-target binding affinity is essential in the early stages of drug discovery.
+ This model is an example of fine-tuning `ibm/biomed.omics.bl.sm.ma-ted-400m` for that task.
+ It predicts binding affinity as pKd, the negative base-10 logarithm of the dissociation constant, which reflects the strength of the interaction between a small molecule (drug) and a protein (target).
+ The expected inputs for the model are the amino acid sequence of the target and the SMILES representation of the drug.
+
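To make the pKd scale concrete, here is a minimal sketch of the conversion, assuming Kd is reported in nanomolar. The helper name `kd_nm_to_pkd` is hypothetical and is not part of the MAMMAL codebase.

```python
import math


def kd_nm_to_pkd(kd_nm: float) -> float:
    """Convert a dissociation constant in nanomolar to pKd = -log10(Kd in molar)."""
    if kd_nm <= 0:
        raise ValueError("Kd must be positive")
    kd_molar = kd_nm * 1e-9  # nM -> M
    return -math.log10(kd_molar)


# Stronger binders (smaller Kd) map to larger pKd values.
examples = {kd: round(kd_nm_to_pkd(kd), 2) for kd in (1.0, 100.0, 10_000.0)}
print(examples)
```

A tenfold drop in Kd raises pKd by exactly one unit, which is why the log scale is convenient as a regression target.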
+ The benchmark used for fine-tuning is defined at `https://tdcommons.ai/multi_pred_tasks/dti/`.
+ We also harmonize the values using `data.harmonize_affinities(mode = 'max_affinity')` and transform them to log scale.
+ By default, we use the Drug+Target cold split, as provided by tdcommons.
+
+
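The idea behind a Drug+Target cold split can be sketched in plain Python. This is an illustration under stated assumptions, not the tdcommons implementation: the function name `cold_split` is hypothetical, and dropping records that mix a held-out entity with a training entity is one common convention that the library may handle differently.

```python
import random


def cold_split(pairs, test_frac=0.2, seed=0):
    """Split (drug, target, label) records so test drugs and targets never appear in training."""
    rng = random.Random(seed)
    drugs = sorted({d for d, _, _ in pairs})
    targets = sorted({t for _, t, _ in pairs})
    # Hold out a fraction of the unique drugs and of the unique targets
    test_drugs = set(rng.sample(drugs, max(1, round(len(drugs) * test_frac))))
    test_targets = set(rng.sample(targets, max(1, round(len(targets) * test_frac))))

    train, test = [], []
    for drug, target, label in pairs:
        held_out_drug = drug in test_drugs
        held_out_target = target in test_targets
        if held_out_drug and held_out_target:
            test.append((drug, target, label))
        elif not held_out_drug and not held_out_target:
            train.append((drug, target, label))
        # Mixed records are discarded so the split stays cold on both sides
    return train, test


# Toy data: every drug paired with every target
pairs = [(f"drug{i}", f"target{j}", float(i + j)) for i in range(10) for j in range(10)]
train, test = cold_split(pairs, test_frac=0.3)
print(len(train), len(test))
```

Because the model never sees test drugs or test targets during training, this kind of split measures generalization to new chemistry and new proteins rather than memorization of known pairs.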
+ ## Model Summary
+
+ - **Developers:** IBM Research
+ - **GitHub Repository:** https://github.com/BiomedSciAI/biomed-multi-alignment
+ - **Paper:** TBD
+ - **Release Date:** Oct 28th, 2024
+ - **License:** [Apache 2.0](https://www.apache.org/licenses/LICENSE-2.0)
+
+ ## Usage
+
+ Using `ibm/biomed.omics.bl.sm.ma-ted-400m` requires installing [biomed-multi-alignment](https://github.com/BiomedSciAI/biomed-multi-alignment):
+
+ ```
+ pip install git+https://github.com/BiomedSciAI/biomed-multi-alignment.git
+ ```
+
+ A simple example for a task already supported by `ibm/biomed.omics.bl.sm.ma-ted-400m`:
+ ```python
+ # Import paths follow the biomed-multi-alignment repository layout; verify them against your installed version
+ from fuse.data.tokenizers.modular_tokenizer.op import ModularTokenizerOp
+ from mammal.examples.dti_bindingdb_kd.task import DtiBindingdbKdTask
+ from mammal.model import Mammal
+
+ # Example inputs (illustrative): target amino acid sequence and drug SMILES
+ target_seq = "NLMKRCTRGFRKLGKCTTLEEEKCKTLYPRGQCTCSDSKMNTHSCDCKSC"
+ drug_seq = "CC(=O)NCCC1=CNc2c1cc(OC)cc2"
+
+ # Label normalization statistics from fine-tuning are not given in this README.
+ # These are identity placeholders; replace them with the real mean/std to get pKd values.
+ norm_y_mean = 0.0
+ norm_y_std = 1.0
+
+ # Load Model
+ model = Mammal.from_pretrained("ibm/biomed.omics.bl.sm.ma-ted-400m.dti_bindingdb_pkd")
+ model.eval()
+
+ # Load Tokenizer
+ tokenizer_op = ModularTokenizerOp.from_pretrained("ibm/biomed.omics.bl.sm.ma-ted-400m.dti_bindingdb_pkd")
+
+ # Convert to MAMMAL style
+ sample_dict = {"target_seq": target_seq, "drug_seq": drug_seq}
+ sample_dict = DtiBindingdbKdTask.data_preprocessing(
+     sample_dict=sample_dict,
+     tokenizer_op=tokenizer_op,
+     target_sequence_key="target_seq",
+     drug_sequence_key="drug_seq",
+     norm_y_mean=None,
+     norm_y_std=None,
+     device=model.device,
+ )
+
+ # Forward pass in encoder-only mode, which supports scalar predictions
+ batch_dict = model.forward_encoder_only([sample_dict])
+
+ # Post-process the model's output
+ batch_dict = DtiBindingdbKdTask.process_model_output(
+     batch_dict,
+     scalars_preds_processed_key="model.out.dti_bindingdb_kd",
+     norm_y_mean=norm_y_mean,
+     norm_y_std=norm_y_std,
+ )
+ ans = {
+     "model.out.dti_bindingdb_kd": float(batch_dict["model.out.dti_bindingdb_kd"][0])
+ }
+
+ # Print prediction
+ print(f"{ans=}")
+ ```
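The `norm_y_mean` and `norm_y_std` arguments suggest that the pKd labels were standardized during fine-tuning. A minimal sketch of that round trip follows, assuming plain z-score normalization. The helper names are hypothetical, the statistics are made up for illustration, and the actual `process_model_output` implementation may differ.

```python
def normalize(y: float, mean: float, std: float) -> float:
    """Map a raw label to the standardized scale the model is trained on."""
    return (y - mean) / std


def denormalize(y_norm: float, mean: float, std: float) -> float:
    """Map a standardized prediction back to the original label scale."""
    return y_norm * std + mean


# Illustrative statistics only, not the values used for the released model
mean, std = 6.0, 1.5
pkd = 7.5
z = normalize(pkd, mean, std)
restored = denormalize(z, mean, std)
print(z, restored)
```

If the statistics passed at inference do not match those used during fine-tuning, the predictions will be shifted and scaled, so the values should come from the training configuration.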
+
+ For more advanced usage, see our detailed example at `https://github.com/BiomedSciAI/biomed-multi-alignment`.
+
+
+ ## Citation
+
+ If you found our work useful, please consider giving the repo a star and citing our paper:
+ ```
+ @article{TBD,
+   title={TBD},
+   author={IBM Research Team},
+   journal={arXiv preprint arXiv:TBD},
+   year={2024}
+ }
+ ```