nenad1002 committed
Commit 29eef8e
1 Parent(s): acf3a5d

Update README.md

Files changed (1)
  1. README.md +5 -6
README.md CHANGED
@@ -16,7 +16,7 @@ license: mit
 
  # Model Card for Model ID
 
- Quantum Research Bot is a chat model fined tuned over the latest research data in quantum science. It contains data from the second half of 2024 making it more accurate than general-purpose models.
+ Quantum Research Bot is a chat model fine-tuned on the latest quantum science research data. It includes data from the second half of 2024, making it more accurate and up-to-date than general-purpose models.
 
  ## Model Details
 
@@ -47,12 +47,11 @@ Although this model should be able to generalize well, the quantum science termi
 
  ## Bias, Risks, and Limitations
 
+ Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model.
+
  The model does hallucinate on certain edge cases (more coming soon).
  <!-- This section is meant to convey both technical and sociotechnical limitations. -->
 
- [More Information Needed]
-
- Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model.
 
  ## How to Get Started with the Model
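The card only points at the usage instructions for the Meta Instruct models, so the snippet below is a minimal sketch of the standard `transformers` chat-template flow rather than the authors' documented example; the repository id is a placeholder, not confirmed by this diff.

```python
# Minimal inference sketch following the usual chat-template flow used by
# Meta's Instruct models. The repo id below is a placeholder.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "nenad1002/quantum-research-bot"  # placeholder; substitute the actual model repo id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

messages = [{"role": "user", "content": "What changed in quantum error correction research in late 2024?"}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(inputs, max_new_tokens=256)
print(tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True))
```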
 
@@ -62,7 +61,7 @@ Please refer to the instructions for the Meta Instruct models; the principle is
 
  ### Training Data
 
- Initially trained on a bit less than 3k entries, it was later expanded t 5k high quality questions and answers to make the best of supervised fine tuning.
+ The model was initially trained on a bit less than 3k entries; the dataset was later expanded to 5k high-quality questions and answers to make the most of supervised fine-tuning. The evaluation set consisted of about 200 entries in the final training round.
 
  The dataset was generated by crawling the https://quantum-journal.org/ site, and passing data into the OpenAI gpt-4-turbo model with various prompts to ensure high quality data generation.
 
@@ -103,7 +102,7 @@ Following an extensive grid search, supervised fine-tuning of Llama 3.1-8B with
  - LR scheduler: cosine
  - NEFT enabled: true
  - Batch size: 8
- - Number of epochs: 3
+ - Number of epochs: 4
 
 
  #### Speeds, Sizes, Times [optional]
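For orientation, the hyperparameters listed in this hunk (cosine LR schedule, NEFTune enabled, batch size 8, and now 4 epochs) would map onto a TRL `SFTTrainer` run roughly as sketched below; the learning rate, NEFTune alpha, dataset path, and exact base-model repo id are placeholders that are not shown in this diff.

```python
# Hedged sketch of a supervised fine-tuning setup matching the listed
# hyperparameters. Learning rate, NEFTune alpha, dataset path, and repo ids
# are placeholders; the actual training script is not part of this commit.
from datasets import load_dataset
from trl import SFTConfig, SFTTrainer

# Placeholder dataset: a JSONL file with a text/messages column of Q&A pairs.
dataset = load_dataset("json", data_files="quantum_qa.jsonl", split="train")

config = SFTConfig(
    output_dir="quantum-research-bot",
    num_train_epochs=4,                 # per the updated model card
    per_device_train_batch_size=8,      # "Batch size: 8"
    lr_scheduler_type="cosine",         # "LR scheduler: cosine"
    neftune_noise_alpha=5.0,            # "NEFT enabled: true"; the alpha value is an assumption
    learning_rate=2e-4,                 # placeholder; the actual value is not shown in this diff
)

trainer = SFTTrainer(
    model="meta-llama/Llama-3.1-8B",    # base model per the card; exact repo id assumed
    args=config,
    train_dataset=dataset,
)
trainer.train()
```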
 