---
license: apache-2.0
datasets:
- cxllin/medinstructv2
language:
- en
library_name: transformers
pipeline_tag: question-answering
tags:
- medical
---

`StableMed` is a 3 billion parameter decoder-only language model, fine-tuned from [StableLM-3B-4E1T](https://huggingface.co/stabilityai/stablelm-3b-4e1t) on 18k rows of medical questions for 1 epoch.

## Usage

Get started generating text with `StableMed` using the following code snippet:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the fine-tuned StableMed checkpoint and its tokenizer
tokenizer = AutoTokenizer.from_pretrained("cxllin/StableMed-3b")
model = AutoModelForCausalLM.from_pretrained(
    "cxllin/StableMed-3b",
    trust_remote_code=True,
    torch_dtype="auto",
)
model.cuda()

# Encode a medical question and sample a completion on the GPU
inputs = tokenizer("What are the common symptoms of iron-deficiency anemia?", return_tensors="pt").to("cuda")
tokens = model.generate(
    **inputs,
    max_new_tokens=64,
    temperature=0.75,
    top_p=0.95,
    do_sample=True,
)
print(tokenizer.decode(tokens[0], skip_special_tokens=True))
```
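
For quick experiments, the same checkpoint can also be wrapped in a `transformers` text-generation pipeline. The sketch below is a minimal example, assuming a single CUDA GPU and reusing the sampling settings from the snippet above; the medical prompt is illustrative only.

```python
from transformers import pipeline

# Minimal sketch: wrap StableMed in a text-generation pipeline.
# device=0 assumes a single CUDA GPU is available.
generator = pipeline(
    "text-generation",
    model="cxllin/StableMed-3b",
    trust_remote_code=True,
    torch_dtype="auto",
    device=0,
)

result = generator(
    "What are the common symptoms of iron-deficiency anemia?",  # illustrative prompt
    max_new_tokens=64,
    temperature=0.75,
    top_p=0.95,
    do_sample=True,
)
print(result[0]["generated_text"])
```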

### Model Architecture
The model is a decoder-only transformer similar to the LLaMA ([Touvron et al., 2023](https://arxiv.org/abs/2307.09288)) architecture with the following modifications:

| Parameters    | Hidden Size | Layers | Heads | Sequence Length |
|---------------|-------------|--------|-------|-----------------|
| 2,795,443,200 | 2560        | 32     | 32    | 4096            |

* **Position Embeddings**: Rotary Position Embeddings ([Su et al., 2021](https://arxiv.org/abs/2104.09864)) applied to the first 25% of head embedding dimensions for improved throughput, following [Black et al. (2022)](https://arxiv.org/pdf/2204.06745.pdf).
* **Normalization**: LayerNorm ([Ba et al., 2016](https://arxiv.org/abs/1607.06450)) with learned bias terms, as opposed to RMSNorm ([Zhang & Sennrich, 2019](https://arxiv.org/abs/1910.07467)).
* **Tokenizer**: GPT-NeoX ([Black et al., 2022](https://arxiv.org/abs/2204.06745)).
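
As a rough sanity check, the figures above can be read back from the model configuration. The sketch below assumes the checkpoint exposes standard `transformers` config fields (`hidden_size`, `num_hidden_layers`, `num_attention_heads`, `max_position_embeddings`); a `trust_remote_code` model may use different names, hence the defensive `getattr` calls.

```python
from transformers import AutoConfig, AutoModelForCausalLM

# Minimal sketch: read the architecture figures back from the config.
# Field names are assumptions; custom configs may differ.
config = AutoConfig.from_pretrained("cxllin/StableMed-3b", trust_remote_code=True)
print("hidden size:     ", getattr(config, "hidden_size", None))
print("layers:          ", getattr(config, "num_hidden_layers", None))
print("attention heads: ", getattr(config, "num_attention_heads", None))
print("sequence length: ", getattr(config, "max_position_embeddings", None))

# The total parameter count can be checked on the loaded model itself.
model = AutoModelForCausalLM.from_pretrained("cxllin/StableMed-3b", trust_remote_code=True)
print("parameters:      ", sum(p.numel() for p in model.parameters()))
```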