tmbj-aidd/aptagpt-bcma

tmbj-aidd committed on May 20, 2024

Commit b58b9fd · verified · 1 Parent(s): 35ed63c

Update README.md

Files changed (1)

README.md +45 -3

@@ -1,3 +1,45 @@
----
-license: apache-2.0
----

+---
+license: apache-2.0
+library_name: transformers
+pipeline_tag: text-generation
+tags:
+- biology
+- text-generation-inference
+---
+## AptaGPT
+AptaGPT is a generative pre-trained language model for aptamer design. It generates novel aptamer sequences and was pre-trained and fine-tuned on data from the third and sixth rounds of SELEX against B cell maturation antigen (BCMA).
+## Dataset
+AptaGPT was pre-trained on 108,229,900 sequences from the third round of the SELEX process targeting BCMA, a dataset large enough for the model to learn general patterns in aptamer sequences. It was then fine-tuned on 9,350 sequences from the sixth round of SELEX. All sequences used for pre-training and fine-tuning are 35 nucleotides long.
+## Requirements
+Before running the AptaGPT model, install the following Python dependencies:
+```bash
+pip install transformers sentencepiece
+```
+## Usage Examples
+To load the model from Hugging Face:
+```python
+from transformers import pipeline
+model = pipeline('text-generation', model="tmbj-aidd/aptagpt-bcma")
+```
+To generate aptamer sequences:
+```python
+sequences = model("<|endoftext|>",
+                max_length=15,
+                do_sample=True,
+                top_k=700,
+                repetition_penalty=1.2,
+                num_return_sequences=10,
+                )
+print(sequences)
+```
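
The `text-generation` pipeline returns a list of dictionaries, each with a `generated_text` key, so the printed output above is not yet a clean list of aptamers. The following is a minimal post-processing sketch, not part of the commit. It assumes a DNA alphabet (A, C, G, T) and that the decoded text may still contain the `<|endoftext|>` prompt or whitespace; neither is stated in the README. The helper name `clean_sequences` is hypothetical, and the 35-nucleotide length comes from the Dataset section.

```python
import re

# Assumption: aptamers use the DNA alphabet; adjust if the model emits RNA (U).
VALID = re.compile(r"[ACGT]+")
EXPECTED_LENGTH = 35  # sequence length stated in the Dataset section


def clean_sequences(outputs, length=EXPECTED_LENGTH):
    """Return unique, well-formed sequences from pipeline output dicts.

    `outputs` is the list returned by the text-generation pipeline,
    i.e. dictionaries that each carry a "generated_text" string.
    """
    seen = set()
    cleaned = []
    for item in outputs:
        # Drop the prompt token and any whitespace left by decoding.
        text = item["generated_text"].replace("<|endoftext|>", "")
        seq = "".join(text.split()).upper()
        # Keep only full-length sequences over the assumed alphabet.
        if len(seq) == length and VALID.fullmatch(seq) and seq not in seen:
            seen.add(seq)
            cleaned.append(seq)
    return cleaned
```

Calling `clean_sequences(sequences)` on the pipeline output would then yield only distinct 35-nucleotide candidates. If the chosen `max_length` produces shorter or longer sequences, the filter returns an empty list, which is a quick signal that the generation settings need adjusting.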