Removes resnet paper link to avoid confusion.
Browse files
README.md
CHANGED
@@ -72,7 +72,7 @@ SuSy is a Spatial-Based Synthetic Image Detection and Recognition Model, designe
|
|
72 |
|
73 |
The model is based on a CNN architecture and is trained using a supervised learning approach. It's design is based on [previous work](https://upcommons.upc.edu/handle/2117/395959), originally intended for video superresolution detection, adapted here for the tasks of synthetic image detection and recognition. The architecture consists of two modules: a feature extractor and a multi-layer perceptron (MLP), as it's quite light weight. SuSy has a total of 12.7M parameters, with the feature extractor accounting for 12.5M parameters and the MLP accounting for the remaining 197K.
|
74 |
|
75 |
-
The CNN feature extractor consists of five stages following a
|
76 |
|
77 |
The outputs of each level of bottlenecks and stage 4 are passed to a 2D adaptative average pooling layer and then concatenated to form the feature map feeding the MLP. The MLP consists of three fully connected layers with 512, 256 and 256 units, respectively. Between each layer, a dropout layer (rate of 0.5) prevents overfitting. The output of the MLP has 6 units, corresponding to the number of classes in the dataset (5 synthetic models and 1 real image class).
|
78 |
|
|
|
72 |
|
73 |
The model is based on a CNN architecture and is trained using a supervised learning approach. It's design is based on [previous work](https://upcommons.upc.edu/handle/2117/395959), originally intended for video superresolution detection, adapted here for the tasks of synthetic image detection and recognition. The architecture consists of two modules: a feature extractor and a multi-layer perceptron (MLP), as it's quite light weight. SuSy has a total of 12.7M parameters, with the feature extractor accounting for 12.5M parameters and the MLP accounting for the remaining 197K.
|
74 |
|
75 |
+
The CNN feature extractor consists of five stages following a ResNet-18 scheme. The output of each of the blocks is used as input for various bottleneck modules that are arranged in a staircase pattern. The bottleneck modules consist of three 2D convolutional layers. Each level of bottlenecks takes input at a later stage than the previous level, and each bottleneck module takes input from the current stage and, except the first bottleneck of each level, from the previous bottleneck module.
|
76 |
|
77 |
The outputs of each level of bottlenecks and stage 4 are passed to a 2D adaptative average pooling layer and then concatenated to form the feature map feeding the MLP. The MLP consists of three fully connected layers with 512, 256 and 256 units, respectively. Between each layer, a dropout layer (rate of 0.5) prevents overfitting. The output of the MLP has 6 units, corresponding to the number of classes in the dataset (5 synthetic models and 1 real image class).
|
78 |
|