FemkeBakker committed
Commit: 2c40a1e
Parent(s): bcbf876

Update README.md

README.md CHANGED
@@ -8,6 +8,10 @@ tags:
 model-index:
 - name: AmsterdamDocClassificationLlama200T1Epochs
   results: []
+datasets:
+- FemkeBakker/AmsterdamBalancedFirst200Tokens
+language:
+- nl
 ---
 
 <!-- This model card has been generated automatically according to the information the Trainer had access to. You
@@ -15,24 +19,24 @@ should probably proofread and complete it, then remove this comment. -->
 
 # AmsterdamDocClassificationLlama200T1Epochs
 
-
+As part of the Assessing Large Language Models for Document Classification project by the Municipality of Amsterdam, we fine-tune Mistral, Llama, and GEITje for document classification.
+The fine-tuning is performed using the [AmsterdamBalancedFirst200Tokens](https://huggingface.co/datasets/FemkeBakker/AmsterdamBalancedFirst200Tokens) dataset, which consists of documents truncated to the first 200 tokens.
+In our research, we evaluate the fine-tuning of these LLMs across one, two, and three epochs.
+This model is a fine-tuned version of [meta-llama/Llama-2-7b-chat-hf](https://huggingface.co/meta-llama/Llama-2-7b-chat-hf) and has been fine-tuned for one epoch.
+
 It achieves the following results on the evaluation set:
 - Loss: 0.8403
 
-## Model description
-
-More information needed
-
-## Intended uses & limitations
-
-More information needed
 
 ## Training and evaluation data
 
-More information needed
+- The training data consists of 9900 documents and their labels, formatted into conversations.
+- The evaluation data consists of 1100 documents and their labels, formatted into conversations.
 
 ## Training procedure
 
+See the [GitHub repository](https://github.com/Amsterdam-Internships/document-classification-using-large-language-models) for specifics about the training and the code.
+
 ### Training hyperparameters
 
 The following hyperparameters were used during training:
@@ -57,6 +61,7 @@ The following hyperparameters were used during training:
 | 0.7408 | 0.7952 | 492 | 0.8413 |
 | 0.996 | 0.9939 | 615 | 0.8403 |
 
+Training time: in total, it took 39 minutes to fine-tune the model for one epoch.
 
 ### Framework versions
 
@@ -64,3 +69,7 @@ The following hyperparameters were used during training:
 - Pytorch 2.3.0+cu121
 - Datasets 2.19.1
 - Tokenizers 0.19.1
+
+
+### Acknowledgements
+This model was trained as part of [insert thesis info] in collaboration with Amsterdam Intelligence for the City of Amsterdam.