lgfunderburk
/

distilbert-truncated

Text Classification

generated_from_keras_callback

Inference Endpoints

lgfunderburk committed on May 17, 2023

Commit

90f5a69

·

1 Parent(s): 673ad75

add tokenizer info

Files changed (1)

README.md +8 -10

@@ -10,24 +10,21 @@ model-index:
 # distilbert-truncated
-This model is a fine-tuned version of [distilbert-base-uncased](https://huggingface.co/distilbert-base-uncased) on the 20 Newsgroups dataset (http://qwone.com/~jason/20Newsgroups/).
 It achieves the following results on the evaluation set:
-## Model description
-More information needed
-## Intended uses & limitations
-More information needed
 ## Training and evaluation data
-More information needed
 ## Training procedure
 ### Training hyperparameters
 The following hyperparameters were used during training:
@@ -41,6 +38,7 @@ batches_per_epoch = 636
 total_train_steps = 1908
 Model accuracy 0.8337758779525757
 Model loss 0.568471074104309
 ### Framework versions

 # distilbert-truncated
+This model is a fine-tuned version of [distilbert-base-uncased](https://huggingface.co/distilbert-base-uncased) on the [20 Newsgroups dataset](http://qwone.com/~jason/20Newsgroups/).
 It achieves the following results on the evaluation set:
 ## Training and evaluation data
+The data was split into training and test sets: the model was trained on 90% of the data and tested on the remaining 10% of the original dataset.
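The commit does not show how the 90/10 split was produced. As a minimal sketch, assuming a simple shuffled split (the function name, the seed, and the toy `docs` list are all hypothetical and not taken from the model card), it could look like this:

```python
import random


def train_test_split_90_10(records, seed=0):
    """Shuffle a copy of `records` and split it 90% train / 10% test.

    The seed is an illustrative assumption; the model card does not say
    whether or how the data was shuffled.
    """
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    cut = len(shuffled) * 9 // 10  # integer arithmetic avoids float rounding
    return shuffled[:cut], shuffled[cut:]


# Toy stand-in for the newsgroup posts (hypothetical data).
docs = [f"post {i}" for i in range(1000)]
train, test = train_test_split_90_10(docs)
print(len(train), len(test))  # 900 100
```

Fixing the seed keeps the split reproducible, which matters when comparing the reported accuracy across runs.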
 ## Training procedure
+DistilBERT has a maximum input length of 512 tokens, so the data was preprocessed as follows:
+1. I used the `distilbert-base-uncased` pretrained model to initialize an `AutoTokenizer`.
+2. With a maximum length of 256 tokens, each entry in the training, testing and validation data was truncated if it exceeded the limit and padded if it fell short of it.
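The truncate-or-pad rule in step 2 can be illustrated without downloading a tokenizer. The sketch below is an assumption-laden stand-in, not the author's code: `pad_or_truncate` is a hypothetical helper, the pad id of 0 is assumed, and special tokens such as `[CLS]` and `[SEP]` are ignored. With a Hugging Face tokenizer, the same effect is usually requested by passing `truncation=True`, `padding="max_length"` and `max_length=256` when calling the tokenizer; the commit does not show the actual call.

```python
MAX_LENGTH = 256  # limit stated in the model card
PAD_ID = 0        # assumed pad token id, for illustration only


def pad_or_truncate(token_ids, max_length=MAX_LENGTH, pad_id=PAD_ID):
    """Return (input_ids, attention_mask), both exactly `max_length` long.

    Sequences longer than `max_length` are cut off on the right;
    shorter ones are right-padded with `pad_id` and masked with 0.
    """
    ids = list(token_ids[:max_length])
    mask = [1] * len(ids)
    n_pad = max_length - len(ids)
    return ids + [pad_id] * n_pad, mask + [0] * n_pad


# A 10-token entry gets padded; a 400-token entry gets truncated.
short_ids, short_mask = pad_or_truncate(range(1, 11))
long_ids, long_mask = pad_or_truncate(range(1, 401))
print(len(short_ids), sum(short_mask))  # 256 10
print(len(long_ids), sum(long_mask))    # 256 256
```

Padding every entry to the same length gives fixed-shape batches, at the cost of discarding everything past token 256 in longer posts.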
 ### Training hyperparameters
 The following hyperparameters were used during training:
 total_train_steps = 1908
 Model accuracy 0.8337758779525757
 Model loss 0.568471074104309
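The two step counts visible in this diff (`batches_per_epoch = 636` in the hunk header and `total_train_steps = 1908`) imply a whole number of passes over the training data. The short check below only does that arithmetic; the epoch count is inferred from the two numbers, not quoted from the model card.

```python
batches_per_epoch = 636    # from the diff context
total_train_steps = 1908   # from the diff context

# If total steps = batches per epoch * epochs, the division should be exact.
epochs, remainder = divmod(total_train_steps, batches_per_epoch)
print(epochs, remainder)  # 3 0
```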
 ### Framework versions