jozhang97 committed on
Commit
247051d
1 Parent(s): fc39571

Create README.md

  1. README.md +51 -0
README.md ADDED
@@ -0,0 +1,51 @@
# ISM

By [Jeffrey Ouyang-Zhang](https://jozhang97.github.io/), [Chengyue Gong](https://sites.google.com/view/chengyue-gong), [Yue Zhao](https://zhaoyue-zephyrus.github.io), [Philipp Krähenbühl](http://www.philkr.net/), [Adam Klivans](https://www.cs.utexas.edu/users/klivans/), [Daniel J. Diaz](http://danny305.github.io)

This repository contains the model presented in the paper [Distilling Structural Representations into Protein Sequence Models](https://www.biorxiv.org/content/10.1101/2024.11.08.622579v1).
The official GitHub repository is at https://github.com/jozhang97/ism.

**TL;DR.** ISM is ESM2 with enriched structural representations.

## Quickstart

This quickstart assumes that the user is already working with ESM2 and is interested in replacing ESM2 with ISM. First, download ISM.
```bash
# recommended
huggingface-cli download jozhang97/ism_t33_650M_uc30pdb --local-dir /path/to/ism_t33_650M_uc30pdb

# alternative
git clone https://huggingface.co/jozhang97/ism_t33_650M_uc30pdb
```

If the user is starting from [fair-esm](https://github.com/facebookresearch/esm), add the following lines of code to load the ISM weights into the ESM2 model.
```python
import torch
import esm

# Build the ESM2 architecture, then overwrite its weights with the ISM checkpoint.
model, alphabet = esm.pretrained.esm2_t33_650M_UR50D()
ckpt = torch.load('/path/to/ism_t33_650M_uc30pdb/checkpoint.pth', map_location='cpu')
model.load_state_dict(ckpt)
```

If the user is starting from [Hugging Face](https://huggingface.co/facebook/esm2_t33_650M_UR50D), replace the model and tokenizer with the following lines of code.
```python
from transformers import AutoTokenizer, AutoModel

# Directory that holds the downloaded ISM files.
config_path = "/path/to/ism_t33_650M_uc30pdb/"
model = AutoModel.from_pretrained(config_path)
tokenizer = AutoTokenizer.from_pretrained(config_path)
```
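
Once the model and tokenizer load, per-residue embeddings can be read from the encoder output. The following is a minimal sketch, assuming the ISM directory loads as an ESM-style encoder through `AutoModel` (as the snippet above suggests). `embed_sequence` is a hypothetical helper name used for illustration and is not part of the ISM repository.

```python
def embed_sequence(sequence, model_dir):
    """Return per-residue embeddings for one protein sequence.

    Hypothetical helper for illustration only; not part of the ISM repository.
    """
    # Imported inside the function so the heavy dependencies are only
    # needed when embeddings are actually computed.
    import torch
    from transformers import AutoModel, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_dir)
    model = AutoModel.from_pretrained(model_dir).eval()

    inputs = tokenizer(sequence, return_tensors="pt")
    with torch.no_grad():  # inference only, no gradients needed
        outputs = model(**inputs)

    # Assumption: as with the ESM2 tokenizer, one special token is added at
    # each end of the sequence, so drop the first and last positions.
    return outputs.last_hidden_state[0, 1:-1]
```

Under these assumptions, `embed_sequence("MKTAYIAKQR", "/path/to/ism_t33_650M_uc30pdb/")` should return a tensor with one row per residue. In real use, load the model once and reuse it across sequences rather than reloading it on every call.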

Replace `/path/to/ism_t33_650M_uc30pdb` with the directory where the model was downloaded.
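
A placeholder path left unchanged is an easy mistake to make, and the resulting loader errors can be hard to read. As a small sketch, the check below fails early if the directory or the checkpoint file is missing. `find_ism_checkpoint` is a hypothetical helper, and the default file name `checkpoint.pth` is taken from the fair-esm snippet above.

```python
from pathlib import Path


def find_ism_checkpoint(model_dir, filename="checkpoint.pth"):
    """Return the checkpoint path inside `model_dir`, or raise a clear error.

    Hypothetical helper for illustration; not part of the ISM repository.
    """
    root = Path(model_dir).expanduser()
    if not root.is_dir():
        raise FileNotFoundError(f"ISM directory not found: {root}")
    ckpt = root / filename
    if not ckpt.is_file():
        raise FileNotFoundError(f"{filename} not found in {root}")
    return ckpt
```

The returned path can be passed to `torch.load` in the fair-esm flow, while the directory itself is what `AutoModel.from_pretrained` expects.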

## Citing ISM
If you find ISM useful in your research, please consider citing:

```bibtex
@article{ouyangzhang2024distilling,
  title={Distilling Structural Representations into Protein Sequence Models},
  author={Ouyang-Zhang, Jeffrey and Gong, Chengyue and Zhao, Yue and Kr{\"a}henb{\"u}hl, Philipp and Klivans, Adam and Diaz, Daniel J},
  journal={bioRxiv},
  doi={10.1101/2024.11.08.622579},
  year={2024},
  publisher={Cold Spring Harbor Laboratory}
}
```