Update README.md
README.md CHANGED
@@ -18,10 +18,10 @@ datasets:
 
 cosmosage is a natural-language cosmology assistant that can answer questions about cosmology.
 
-cosmosage-v3 is the latest iteration in the cosmosage series
-model. We started with continued pretraining on thousands of papers and textbooks.
-was fine-tuning on synthetically-generated question-answer pairs. In addition, the
-dataset was used to improve instruction following and general conversational capability.
+cosmosage-v3 is the latest iteration in the cosmosage series. It was trained on top of the
+LLAMA-3-8B base model. We started with continued pretraining on thousands of papers and textbooks.
+The next step was fine-tuning on synthetically-generated question-answer pairs. In addition, the
+OpenHermes 2.5 dataset was used to improve instruction following and general conversational capability.
 
 cosmosage-v3 is a full chat model, though it excels in Q&A mode, where the model gives a single
 answer in response to a single question.
@@ -30,7 +30,8 @@ The code used to generate cosmosage is available at https://github.com/tijmen/co
 
 ## Usage
 
-cosmosage-v3 uses the Llama-3 prompt template. Sampling parameters are up to you, but I like
+cosmosage-v3 uses the Llama-3 prompt template. Sampling parameters are up to you, but I like
+{'temperature': 0.7, 'smoothing_factor': 1, 'smoothing_curve': 1.5, 'repetition_penalty': 1.1}.
 
 ## Comparison to cosmosage_v2
 
@@ -40,7 +41,8 @@ model.
 
 ## Training details
 
-cosmosage-v3 was trained on 4xA100 (40 GB) at Gadi
+cosmosage-v3 was trained on 4xA100 (40 GB) at the Gadi supercomputer, part of NCI, Australia. A big
+thanks goes out to Yuan-Seng Ting for providing these resources.
 
 ## Example output
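For readers who want to try the new Usage recommendation directly, here is a minimal sketch using Hugging Face transformers. The model ID below is an assumption (substitute the actual repository name), and note that smoothing_factor and smoothing_curve are samplers from text-generation-webui rather than transformers, so only temperature and repetition_penalty carry over here. The tokenizer's chat template handles the Llama-3 special tokens (<|start_header_id|>, <|eot_id|>, and so on) for you.

```python
# Minimal sketch (not from the model card): single-question Q&A with
# cosmosage-v3 via Hugging Face transformers.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Tijmen2/cosmosage_v3"  # assumed repo ID; adjust as needed

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

# The Llama-3 prompt template is applied by the tokenizer's chat template.
messages = [{"role": "user", "content": "What is the cosmic microwave background?"}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

# temperature and repetition_penalty follow the README's suggestion;
# smoothing_factor / smoothing_curve are text-generation-webui samplers
# with no direct equivalent in transformers' generate(), so they are omitted.
output = model.generate(
    inputs,
    max_new_tokens=512,
    do_sample=True,
    temperature=0.7,
    repetition_penalty=1.1,
)
print(tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True))
```

A single user message followed by add_generation_prompt=True matches the single-question, single-answer Q&A mode the card says the model excels at.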