evan-nexusflow committed
Commit 22df17b
Parent(s): 242c9fd
Update README.md
README.md CHANGED
@@ -1,3 +1,26 @@
+---
+license: other
+language:
+- en
+library_name: transformers
+tags:
+- RLHF
+- Nexusflow
+- Athene
+- Reward Model
+---
+
+# Llama3-Athene-RM-70B
+
+We introduce Llama3-Athene-RM-70B, an open-weights reward model based off Llama-3-70B-Instruct.
+
+- **Developed by:** The Nexusflow Team (Evan Frick\*, Peter Jin\*, Tianle Li\*, Karthik Ganesan, Jian Zhang, Jiantao Jiao and Banghua Zhu).
+- **Model type:** Reward Model
+- **Finetuned from model:** [Llama-3-70B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3-70B-Instruct).
+- **License**: [Nexusflow Research License](https://huggingface.co/Nexusflow/Athene-70B/blob/main/Nexusflow_Research_License.pdf)
+- **Blog**: https://nexusflow.ai/blogs/athene
+
+
 ### Usage
 
 ```python
@@ -92,4 +115,16 @@ messages = [
 print(pipe([messages])) # Print the reward!
 
 
-```
+```
+
+### Citation
+
+```
+@misc{Athene2024,
+title = {Athene-70B: Redefining the Boundaries of Post-Training for Open Models},
+url = {https://nexusflow.ai/blogs/athene},
+author = {Frick, Evan and Jin, Peter and Li, Tianle and Ganesan, Karthik and Zhang, Jian and Jiao, Jiantao and Zhu, Banghua},
+month = {July},
+year = {2024}
+}
+```