Update README.md
README.md
CHANGED
```diff
@@ -72,7 +72,7 @@ The dataset was generated by crawling the https://quantum-journal.org/ site, and
 
 Various training procedures were explored alongside multiple models; however, all of them were parameter-efficient.
 
-Over time, several models and fine-tuning approaches were tested as the base model. The best
+Over time, several models and fine-tuning approaches were tested as the base model. The best accuracy was achieved with [Llama 3.1 70B Instruct](https://huggingface.co/meta-llama/Meta-Llama-3.1-70B-Instruct) and QLoRA, but the training duration was extensive, and optimizing hyperparameters proved highly challenging.
 
 Other base models were also tested: [Mistral 7B v0.1](https://huggingface.co/mistralai/Mistral-7B-v0.1), [Llama 2 7B Chat](https://huggingface.co/meta-llama/Llama-2-7b-chat-hf), and the base model of this experiment.
 
```
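The QLoRA setup mentioned in the diff can be sketched as follows. This is a minimal illustration using the Hugging Face `transformers` and `peft` libraries; the model name is real, but every hyperparameter value below (LoRA rank, alpha, dropout, target modules, quantization settings) is an assumption for illustration, not the configuration actually used in this experiment.

```python
import torch
from transformers import BitsAndBytesConfig
from peft import LoraConfig

# 4-bit NF4 quantization (the "Q" in QLoRA): the frozen base weights are
# stored in 4 bits, while the small LoRA adapters train in higher precision.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)

# LoRA adapter: only these low-rank matrices are trained, which is what
# makes the procedure parameter-efficient. Values here are illustrative.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)

# Loading the 70B model requires a large GPU and gated-model access,
# so that step is shown but not executed here:
# from transformers import AutoModelForCausalLM
# from peft import get_peft_model
# model = AutoModelForCausalLM.from_pretrained(
#     "meta-llama/Meta-Llama-3.1-70B-Instruct",
#     quantization_config=bnb_config,
# )
# model = get_peft_model(model, lora_config)
```

Because only the adapter matrices receive gradients, the trainable parameter count stays a tiny fraction of the 70B total, though, as the README notes, wall-clock training time on a model this size remains substantial.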