Update README.md
README.md CHANGED
@@ -16,7 +16,7 @@ license: mit

# Model Card for Model ID

-Quantum Research Bot is a chat model
+Quantum Research Bot is a chat model fine-tuned on the latest quantum science research data. It includes data from the second half of 2024, making it more accurate and up-to-date than general-purpose models.

## Model Details

@@ -47,12 +47,11 @@ Although this model should be able to generalize well, the quantum science termi

## Bias, Risks, and Limitations

+Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model.
+
The model does hallucinate on certain edge cases (more coming soon).
<!-- This section is meant to convey both technical and sociotechnical limitations. -->

-[More Information Needed]
-
-Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model.

## How to Get Started with the Model

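The "How to Get Started" section defers to the Meta Instruct instructions, so the following is a minimal usage sketch under that assumption: the model follows the standard Llama 3.1 Instruct chat template, and the repository id below is a placeholder, not the actual one.

```python
# Minimal chat sketch, assuming the model uses the standard Llama 3.1
# Instruct chat template. The repo id is a placeholder.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "your-org/quantum-research-bot"  # placeholder repo id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

messages = [{"role": "user", "content": "What is a surface code?"}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=256)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```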
@@ -62,7 +61,7 @@ Please refer to the instructions for the Meta Instruct models; the principle is

### Training Data

-Initially trained on a bit less than 3k entries, it was later expanded t 5k high quality questions and answers to make the best of supervised fine tuning.
+Initially trained on a bit under 3k entries, the dataset was later expanded to 5k high-quality questions and answers to get the most out of supervised fine-tuning. The evaluation set consisted of roughly 200 entries in the final training round.

The dataset was generated by crawling the https://quantum-journal.org/ site and passing the data to the OpenAI gpt-4-turbo model with various prompts to ensure high-quality data generation.

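The card describes this generation pipeline only at a high level, so the sketch below is illustrative rather than the authors' code: the HTML handling, prompt wording, and example inputs are all assumptions.

```python
# Illustrative sketch of the described pipeline: fetch an article from
# quantum-journal.org and ask gpt-4-turbo to draft Q&A pairs from it.
# The prompt and HTML handling are hypothetical.
import requests
from bs4 import BeautifulSoup
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def article_text(url: str) -> str:
    """Fetch a page and strip it down to plain text."""
    html = requests.get(url, timeout=30).text
    return BeautifulSoup(html, "html.parser").get_text(" ", strip=True)

def make_qa_pairs(url: str) -> str:
    """Ask gpt-4-turbo to turn one article into Q&A training entries."""
    response = client.chat.completions.create(
        model="gpt-4-turbo",
        messages=[
            {"role": "system", "content": "Write high-quality question/answer "
             "pairs about the following quantum research article."},
            {"role": "user", "content": article_text(url)[:12000]},  # rough context cap
        ],
    )
    return response.choices[0].message.content
```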
@@ -103,7 +102,7 @@ Following an extensive grid search, supervised fine-tuning of Llama 3.1-8B with
- LR scheduler: cosine
- NEFT enabled: true
- Batch size: 8
-- Number of epochs:
+- Number of epochs: 4


#### Speeds, Sizes, Times [optional]
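As a hedged reconstruction, the hyperparameters listed above map directly onto a TRL `SFTConfig`. The card does not say which training stack was used, so the trainer, the NEFTune alpha, and the dataset path below are assumptions for illustration.

```python
# Sketch of the listed hyperparameters as a TRL SFT run. SFTTrainer, the
# NEFTune alpha, and the dataset path are illustrative assumptions.
from datasets import load_dataset
from trl import SFTConfig, SFTTrainer

# Placeholder for the ~5k Q&A pairs described in Training Data.
dataset = load_dataset("json", data_files="quantum_qa.jsonl", split="train")

config = SFTConfig(
    output_dir="quantum-research-bot-sft",
    lr_scheduler_type="cosine",      # LR scheduler: cosine
    neftune_noise_alpha=5.0,         # NEFT enabled: true (alpha value assumed)
    per_device_train_batch_size=8,   # Batch size: 8
    num_train_epochs=4,              # Number of epochs: 4
)

trainer = SFTTrainer(
    model="meta-llama/Llama-3.1-8B",  # base model named in the card
    args=config,
    train_dataset=dataset,
)
trainer.train()
```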