Update README.md
README.md CHANGED
@@ -16,7 +16,7 @@ license: mit

# Model Card for Model ID

-Quantum Research Bot is a chat model
+Quantum Research Bot is a chat model fine-tuned on the latest quantum science research data. It includes data from the second half of 2024, making it more accurate and up-to-date than general-purpose models.

## Model Details

@@ -47,12 +47,11 @@ Although this model should be able to generalize well, the quantum science termi

## Bias, Risks, and Limitations

+Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model.
+
The model does hallucinate on certain edge cases (more coming soon).
<!-- This section is meant to convey both technical and sociotechnical limitations. -->

-[More Information Needed]
-
-Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model.

## How to Get Started with the Model

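The "How to Get Started" section defers to the Meta Instruct instructions, so the following is a minimal usage sketch under that assumption: the model follows the standard Llama 3.1 Instruct chat template, and the repository id below is a placeholder, not the actual one.

```python
# Minimal chat sketch, assuming the model uses the standard Llama 3.1
# Instruct chat template. The repo id is a placeholder.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "your-org/quantum-research-bot"  # placeholder repo id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

messages = [{"role": "user", "content": "What is a surface code?"}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=256)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```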
@@ -62,7 +61,7 @@ Please refer to the instructions for the Meta Instruct models; the principle is

### Training Data

-Initially trained on a bit less than 3k entries, it was later expanded t 5k high quality questions and answers to make the best of supervised fine tuning.
+Initially trained on a bit under 3k entries, the dataset was later expanded to 5k high-quality questions and answers to get the most out of supervised fine-tuning. The evaluation set consisted of roughly 200 entries in the final training round.

The dataset was generated by crawling the https://quantum-journal.org/ site and passing the data to the OpenAI gpt-4-turbo model with various prompts to ensure high-quality data generation.

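The card describes this generation pipeline only at a high level, so the sketch below is illustrative rather than the authors' code: the HTML handling, prompt wording, and example inputs are all assumptions.

```python
# Illustrative sketch of the described pipeline: fetch an article from
# quantum-journal.org and ask gpt-4-turbo to draft Q&A pairs from it.
# The prompt and HTML handling are hypothetical.
import requests
from bs4 import BeautifulSoup
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def article_text(url: str) -> str:
    """Fetch a page and strip it down to plain text."""
    html = requests.get(url, timeout=30).text
    return BeautifulSoup(html, "html.parser").get_text(" ", strip=True)

def make_qa_pairs(url: str) -> str:
    """Ask gpt-4-turbo to turn one article into Q&A training entries."""
    response = client.chat.completions.create(
        model="gpt-4-turbo",
        messages=[
            {"role": "system", "content": "Write high-quality question/answer "
             "pairs about the following quantum research article."},
            {"role": "user", "content": article_text(url)[:12000]},  # rough context cap
        ],
    )
    return response.choices[0].message.content
```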
@@ -103,7 +102,7 @@ Following an extensive grid search, supervised fine-tuning of Llama 3.1-8B with
- LR scheduler: cosine
- NEFT enabled: true
- Batch size: 8
-- Number of epochs:
+- Number of epochs: 4


#### Speeds, Sizes, Times [optional]
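As a hedged reconstruction, the hyperparameters listed above map directly onto a TRL `SFTConfig`. The card does not say which training stack was used, so the trainer, the NEFTune alpha, and the dataset path below are assumptions for illustration.

```python
# Sketch of the listed hyperparameters as a TRL SFT run. SFTTrainer, the
# NEFTune alpha, and the dataset path are illustrative assumptions.
from datasets import load_dataset
from trl import SFTConfig, SFTTrainer

# Placeholder for the ~5k Q&A pairs described in Training Data.
dataset = load_dataset("json", data_files="quantum_qa.jsonl", split="train")

config = SFTConfig(
    output_dir="quantum-research-bot-sft",
    lr_scheduler_type="cosine",      # LR scheduler: cosine
    neftune_noise_alpha=5.0,         # NEFT enabled: true (alpha value assumed)
    per_device_train_batch_size=8,   # Batch size: 8
    num_train_epochs=4,              # Number of epochs: 4
)

trainer = SFTTrainer(
    model="meta-llama/Llama-3.1-8B",  # base model named in the card
    args=config,
    train_dataset=dataset,
)
trainer.train()
```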