jiazhengli committed
Commit 2fb28ff
1 Parent(s): 7d35748

Update README.md

Files changed (1)
  1. README.md +36 -12
README.md CHANGED
@@ -9,31 +9,55 @@ base_model: mistralai/Mixtral-8x7B-Instruct-v0.1
model-index:
- name: sft_trained_woaqa_mixtral
  results: []
+ datasets:
+ - jiazhengli/Rationale_MCTS
+ - jiazhengli/Synthetic_Rationale
+ language:
+ - en
+ metrics:
+ - accuracy
+ - f1
---

- <!-- This model card has been generated automatically according to the information the Trainer had access to. You
- should probably proofread and complete it, then remove this comment. -->
-
- # sft_trained_woaqa_mixtral
-
- This model is a fine-tuned version of [mistralai/Mixtral-8x7B-Instruct-v0.1](https://huggingface.co/mistralai/Mixtral-8x7B-Instruct-v0.1) on the sft_wo_aqa_mistral dataset.
- It achieves the following results on the evaluation set:
- - Loss: 0.8062
-
- ## Model description
-
- More information needed
+ # Mixtral-8x7B-Instruct-v0.1-QLoRA-Assessment-Rationale-sft
+
+ This model is the variant trained without private data from the EMNLP 2024 paper *Calibrating LLMs with Preference Optimization on Thought Trees for Generating Rationale in Science Question Scoring*.
+
+ - **Paper:** [Calibrating LLMs with Preference Optimization on Thought Trees for Generating Rationale in Science Question Scoring](https://arxiv.org/abs/2406.19949) (EMNLP 2024 Findings)
+ - **GitHub Repository:** [Thought Tree Assessment Repository](https://github.com/lijiazheng99/thought_tree_assessment)
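A minimal loading sketch for reference. It assumes the adapter is published under this card's title as `jiazhengli/Mixtral-8x7B-Instruct-v0.1-QLoRA-Assessment-Rationale-sft` (repo id inferred from the title, not confirmed) and uses the standard `peft` adapter layout; the prompt is a placeholder, so check the paper/repo for the actual template:

```python
# Minimal sketch: attach the QLoRA SFT adapter to the Mixtral base model.
# NOTE: the adapter repo id below is inferred from the card title (unconfirmed).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

BASE_ID = "mistralai/Mixtral-8x7B-Instruct-v0.1"
ADAPTER_ID = "jiazhengli/Mixtral-8x7B-Instruct-v0.1-QLoRA-Assessment-Rationale-sft"  # assumed

tokenizer = AutoTokenizer.from_pretrained(BASE_ID)
base = AutoModelForCausalLM.from_pretrained(
    BASE_ID,
    torch_dtype=torch.bfloat16,  # Mixtral is large; quantized loading may be preferable
    device_map="auto",
)
model = PeftModel.from_pretrained(base, ADAPTER_ID)  # attach the SFT adapter

# Placeholder prompt; the real template is defined by the paper/repo.
prompt = "[INST] Assess the following student answer and give a rationale: ... [/INST]"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```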
## Intended uses & limitations

- More information needed
+ This model offers a valuable resource for research on explainable AI in educational technology. Because it was trained on **noisy** response-level rationales, it is **unsuitable** for direct use in high-stakes assessments without additional verification.
## Training and evaluation data

- More information needed
+ We trained and evaluated the model on the [Synthetic Rationale data](https://huggingface.co/datasets/jiazhengli/Synthetic_Rationale), which was generated from the [Rationale MCTS data](https://huggingface.co/datasets/jiazhengli/Rationale_MCTS).
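A minimal sketch for pulling both datasets with the `datasets` library; split and column names are not documented here and should be checked on the dataset cards:

```python
# Minimal sketch: load the training/evaluation data from the Hub.
from datasets import load_dataset

synthetic = load_dataset("jiazhengli/Synthetic_Rationale")  # SFT data used for this model
mcts = load_dataset("jiazhengli/Rationale_MCTS")            # source data it was generated from

print(synthetic)  # inspect the available splits and columns before use
```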
+
+ To extract scores from rationales, please use [jiazhengli/deberta-v3-large-Rationale-to-Score](https://huggingface.co/jiazhengli/deberta-v3-large-Rationale-to-Score).
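A minimal sketch of the score-extraction step, assuming the scorer is a standard `transformers` text-classification checkpoint; the exact task type and label format should be verified on its model card:

```python
# Minimal sketch: map a generated rationale to a score with the DeBERTa scorer.
from transformers import pipeline

scorer = pipeline(
    "text-classification",  # assumed task type; verify on the scorer's model card
    model="jiazhengli/deberta-v3-large-Rationale-to-Score",
)

rationale = "The answer identifies both variables correctly but omits the control."  # example text
print(scorer(rationale))  # e.g. [{'label': ..., 'score': ...}]
```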
+
+ ## Citation
+
+ Please cite the following work if you use this model:
+
+ **BibTeX:**
+
+ ```bibtex
+ @misc{li2024calibratingllmspreferenceoptimization,
+   title={Calibrating LLMs with Preference Optimization on Thought Trees for Generating Rationale in Science Question Scoring},
+   author={Jiazheng Li and Hainiu Xu and Zhaoyue Sun and Yuxiang Zhou and David West and Cesare Aloisi and Yulan He},
+   year={2024},
+   eprint={2406.19949},
+   archivePrefix={arXiv},
+   primaryClass={cs.CL},
+   url={https://arxiv.org/abs/2406.19949},
+ }
+ ```

## Training procedure

+ Please refer to our [paper](https://arxiv.org/abs/2406.19949).
+
### Training hyperparameters

The following hyperparameters were used during training: