Commit 99a76b8 by pablorodriper (1 parent: 55e85e6)

Update README.md

  1. README.md +28 -1
README.md CHANGED
@@ -18,4 +18,31 @@ This repo contains the model [to this Keras example on Video Vision Transformer]
  This example implements [ViViT: A Video Vision Transformer](https://arxiv.org/abs/2103.15691) by Arnab et al., a pure Transformer-based model for video classification. The authors propose a novel embedding scheme and a number of Transformer variants to model video clips.
 
  ## Datasets
- We use the [MedMNIST v2: A Large-Scale Lightweight Benchmark for 2D and 3D Biomedical Image Classification](https://medmnist.com/) dataset.
+ We use the [MedMNIST v2: A Large-Scale Lightweight Benchmark for 2D and 3D Biomedical Image Classification](https://medmnist.com/) dataset.
+
+ ## Training Parameters
+ ```
+ # DATA
+ DATASET_NAME = "organmnist3d"
+ BATCH_SIZE = 32
+ AUTO = tf.data.AUTOTUNE
+ INPUT_SHAPE = (28, 28, 28, 1)
+ NUM_CLASSES = 11
+
+ # OPTIMIZER
+ LEARNING_RATE = 1e-4
+ WEIGHT_DECAY = 1e-5
+
+ # TRAINING
+ EPOCHS = 80
+
+ # TUBELET EMBEDDING
+ PATCH_SIZE = (8, 8, 8)
+ NUM_PATCHES = (INPUT_SHAPE[0] // PATCH_SIZE[0]) ** 2
+
+ # ViViT ARCHITECTURE
+ LAYER_NORM_EPS = 1e-6
+ PROJECTION_DIM = 128
+ NUM_HEADS = 8
+ NUM_LAYERS = 8
+ ```
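
A note on the `NUM_PATCHES` line added above: the expression squares the per-axis count, which matches a 2D patch grid, while `PATCH_SIZE` has three dimensions. The sketch below works through the arithmetic for both readings. It assumes non-overlapping tubelets with no padding along the three volume axes; the helper names `tubelets_per_axis` and `tubelet_count` are hypothetical and do not come from the README. Whether the model actually consumes `NUM_PATCHES` is not shown in this diff, so this is a consistency check, not a claim about the training code.

```python
# Values copied from the training parameters in the diff above.
INPUT_SHAPE = (28, 28, 28, 1)  # three volume axes plus a channel axis
PATCH_SIZE = (8, 8, 8)


def tubelets_per_axis(input_shape, patch_size):
    """Count non-overlapping tubelets along each volume axis.

    Assumption: no padding, so any remainder at the edge is dropped.
    zip() stops at the shorter tuple, so the channel axis is ignored.
    """
    return tuple(dim // patch for dim, patch in zip(input_shape, patch_size))


def tubelet_count(input_shape, patch_size):
    """Total tubelets under a full 3D tiling (product of per-axis counts)."""
    count = 1
    for n in tubelets_per_axis(input_shape, patch_size):
        count *= n
    return count


# The value produced by the expression as written in the README.
README_NUM_PATCHES = (INPUT_SHAPE[0] // PATCH_SIZE[0]) ** 2

print(tubelets_per_axis(INPUT_SHAPE, PATCH_SIZE))  # per-axis counts
print(tubelet_count(INPUT_SHAPE, PATCH_SIZE))      # full 3D tiling
print(README_NUM_PATCHES)                          # README expression
```

With these settings each axis yields 3 tubelets (28 // 8), so a full 3D tiling gives 27 tokens, whereas the README expression evaluates to 9. If the constant is used to size a positional embedding, the two values are worth reconciling.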