---
library_name: peft
base_model: mistralai/Mistral-7B-v0.1
---

# Model Card for Model ID

Trained with [Ludwig.ai](https://ludwig.ai) and [Predibase](https://predibase.com)!

Given a passage from a news report, this model generates a headline.
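
A minimal inference sketch, assuming the adapter is loaded with `peft` on top of the base model named in this card. The prompt text comes from the Ludwig config in this card; its exact whitespace is reconstructed and may differ slightly from what was used in training. The names `build_prompt`, `generate_headline`, and `adapter_id` are hypothetical (the adapter's repo ID is not stated here), and the 4-bit loading with a float16 compute dtype is an assumption that mirrors `quantization.bits: 4`.

```python
# Prompt template copied from the Ludwig config in this card
# (blank-line layout is a reconstruction, not confirmed by the card).
PROMPT_TEMPLATE = (
    "The following passage is content from a news report. "
    "Please summarize this passage in one sentence or less.\n\n"
    "Passage: {content}\n\n"
    "Summary:"
)

BASE_MODEL = "mistralai/Mistral-7B-v0.1"


def build_prompt(content: str) -> str:
    """Fill the prompt template with a news passage (surrounding whitespace trimmed)."""
    return PROMPT_TEMPLATE.format(content=content.strip())


def generate_headline(content: str, adapter_id: str, max_new_tokens: int = 64) -> str:
    """Generate a headline with the base model plus the LoRA adapter.

    `adapter_id` is a placeholder: pass this adapter's Hub repo ID or a local path.
    """
    # Heavy imports stay local so build_prompt can be used without them.
    import torch
    from peft import PeftModel
    from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

    tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL)
    base = AutoModelForCausalLM.from_pretrained(
        BASE_MODEL,
        # Assumption: 4-bit loading to mirror `quantization.bits: 4` in the config.
        quantization_config=BitsAndBytesConfig(
            load_in_4bit=True, bnb_4bit_compute_dtype=torch.float16
        ),
        device_map="auto",
    )
    model = PeftModel.from_pretrained(base, adapter_id)
    model.eval()

    inputs = tokenizer(build_prompt(content), return_tensors="pt").to(model.device)
    with torch.no_grad():
        output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Drop the prompt tokens and decode only the newly generated ones.
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True).strip()
```

Calling `generate_headline(passage, adapter_id="<adapter repo or local path>")` would return the generated text. The default of 64 new tokens matches `generation.max_new_tokens` in the config.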

Trained on: https://huggingface.co/datasets/JulesBelveze/tldr_news

## Model Details

### Model Description

Ludwig config (v0.9.3):

```yaml
model_type: llm
input_features:
  - name: prompt
    type: text
    preprocessing:
      max_sequence_length: null
    column: prompt
output_features:
  - name: headline
    type: text
    preprocessing:
      max_sequence_length: null
    column: headline
prompt:
  template: >-
    The following passage is content from a news report. Please summarize this
    passage in one sentence or less.


    Passage: {content}


    Summary:
preprocessing:
  split:
    type: fixed
    column: split
  global_max_sequence_length: 2048
adapter:
  type: lora
generation:
  max_new_tokens: 64
trainer:
  type: finetune
  epochs: 3
  optimizer:
    type: paged_adam
  batch_size: 1
  eval_steps: 100
  learning_rate: 0.0002
  eval_batch_size: 2
  steps_per_checkpoint: 1000
  learning_rate_scheduler:
    decay: cosine
    warmup_fraction: 0.03
  gradient_accumulation_steps: 16
  enable_gradient_checkpointing: true
base_model: mistralai/Mistral-7B-v0.1
quantization:
  bits: 4
```
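
To make the trainer settings above more concrete, here is a rough sketch of two quantities they imply: the number of examples per optimizer update and the shape of the learning-rate curve. The helpers `effective_batch_size` and `lr_at` are hypothetical. The schedule is a generic linear warmup followed by cosine decay to zero, which may not match Ludwig's exact implementation, and the arithmetic assumes a single device.

```python
import math

# Values copied from the `trainer` section of the config above.
BATCH_SIZE = 1
GRAD_ACCUM_STEPS = 16
BASE_LR = 2e-4
WARMUP_FRACTION = 0.03


def effective_batch_size(batch_size: int = BATCH_SIZE,
                         accum_steps: int = GRAD_ACCUM_STEPS) -> int:
    """Examples contributing to each optimizer update (single device assumed)."""
    return batch_size * accum_steps


def lr_at(step: int, total_steps: int, base_lr: float = BASE_LR,
          warmup_fraction: float = WARMUP_FRACTION) -> float:
    """Illustrative schedule: linear warmup, then cosine decay to zero."""
    warmup_steps = max(1, round(total_steps * warmup_fraction))
    if step < warmup_steps:
        # Ramp linearly up to base_lr over the warmup steps.
        return base_lr * (step + 1) / warmup_steps
    # Fraction of the post-warmup phase completed, clamped to [0, 1].
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * min(1.0, progress)))


if __name__ == "__main__":
    print("effective batch size:", effective_batch_size())
    for s in (0, 30, 500, 1000):
        print(f"step {s}: lr = {lr_at(s, total_steps=1000):.6f}")
```

With `batch_size: 1` and `gradient_accumulation_steps: 16`, each optimizer update sees 16 examples. Under this sketch, the learning rate would peak at 0.0002 after about 3% of the total steps.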