Update README.md
README.md CHANGED
@@ -18,10 +18,10 @@ datasets:
 
 cosmosage is a natural-language cosmology assistant that can answer questions about cosmology.
 
-cosmosage-v3 is the latest iteration in the cosmosage series
-model. We started with continued pretraining on thousands of papers and textbooks.
-was fine-tuning on synthetically-generated question-answer pairs. In addition, the
-dataset was used to improve instruction following and general conversational capability.
+cosmosage-v3 is the latest iteration in the cosmosage series. It was trained on top of the
+LLAMA-3-8B base model. We started with continued pretraining on thousands of papers and textbooks.
+The next step was fine-tuning on synthetically-generated question-answer pairs. In addition, the
+OpenHermes 2.5 dataset was used to improve instruction following and general conversational capability.
 
 cosmosage-v3 is a full chat model, though it excels in Q&A mode, where the model gives a single
 answer in response to a single question.
@@ -30,7 +30,8 @@ The code used to generate cosmosage is available at https://github.com/tijmen/co
 
 ## Usage
 
-cosmosage-v3 uses the Llama-3 prompt template. Sampling parameters are up to you, but I like
+cosmosage-v3 uses the Llama-3 prompt template. Sampling parameters are up to you, but I like
+{'temperature': 0.7, 'smoothing_factor': 1, 'smoothing_curve': 1.5, 'repetition_penalty': 1.1}.
 
 ## Comparison to cosmosage_v2
 
@@ -40,7 +41,8 @@ model.
 
 ## Training details
 
-cosmosage-v3 was trained on 4xA100 (40 GB) at Gadi
+cosmosage-v3 was trained on 4xA100 (40 GB) at the Gadi supercomputer, part of NCI, Australia. A big
+thanks goes out to Yuan-Seng Ting for providing these resources.
 
 ## Example output
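For readers who want to try the new Usage recommendation directly, here is a minimal sketch using Hugging Face transformers. The model ID below is an assumption (substitute the actual repository name), and note that smoothing_factor and smoothing_curve are samplers from text-generation-webui rather than transformers, so only temperature and repetition_penalty carry over here. The tokenizer's chat template handles the Llama-3 special tokens (<|start_header_id|>, <|eot_id|>, and so on) for you.

```python
# Minimal sketch (not from the model card): single-question Q&A with
# cosmosage-v3 via Hugging Face transformers.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Tijmen2/cosmosage_v3"  # assumed repo ID; adjust as needed

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

# The Llama-3 prompt template is applied by the tokenizer's chat template.
messages = [{"role": "user", "content": "What is the cosmic microwave background?"}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

# temperature and repetition_penalty follow the README's suggestion;
# smoothing_factor / smoothing_curve are text-generation-webui samplers
# with no direct equivalent in transformers' generate(), so they are omitted.
output = model.generate(
    inputs,
    max_new_tokens=512,
    do_sample=True,
    temperature=0.7,
    repetition_penalty=1.1,
)
print(tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True))
```

A single user message followed by add_generation_prompt=True matches the single-question, single-answer Q&A mode the card says the model excels at.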