Model description
This repo contains a model that showcases the learning capabilities of an LSTM using a simple example. A single-layer LSTM learns to add two numbers that are provided as a string. The model was trained on sums where each of the two numbers has a maximum of 5 digits.
Example: Input: "535+61" Output: "596"
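Because the model reads and writes strings rather than numbers, each character has to be turned into a vector first. The sketch below shows one common way to do this, character-level one-hot encoding with space padding. The vocabulary `CHARS`, the fixed lengths, and the helper names `encode` and `decode` are assumptions made for illustration, not details confirmed by this model card.

```python
# Hypothetical vocabulary: ten digits, the plus sign, and a space for padding.
CHARS = "0123456789+ "
MAX_DIGITS = 5                       # from the model description
QUESTION_LEN = 2 * MAX_DIGITS + 1    # "99999+99999" has 11 characters
ANSWER_LEN = MAX_DIGITS + 1          # 99999 + 99999 = 199998 has 6 digits

char_to_index = {ch: i for i, ch in enumerate(CHARS)}


def encode(text, length):
    """One-hot encode `text`, right-padded with spaces to `length` characters."""
    padded = text.ljust(length)
    return [
        [1 if char_to_index[ch] == i else 0 for i in range(len(CHARS))]
        for ch in padded
    ]


def decode(rows):
    """Map each row back to its most likely character and drop the padding."""
    return "".join(CHARS[row.index(max(row))] for row in rows).strip()


x = encode("535+61", QUESTION_LEN)   # 11 rows of 12 values each
y = encode("596", ANSWER_LEN)        # 6 rows of 12 values each
```

With this layout, every input becomes a fixed-size sequence regardless of how many digits the numbers have, which is what lets a single recurrent model handle both "535+61" and "99999+99999".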
Full credits to Smerity and others for this work.
Intended uses & limitations
More information needed
Training and evaluation data
The data is generated on the fly: each example consists of two random numbers of up to 5 digits each as input and their sum as output. These numbers (and their sum) are encoded and fed as input to the LSTM. The full data creation code is available within the example.
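A minimal sketch of how such question/answer pairs could be generated is shown here. It is not the original data creation code: the function names, the uniform sampling over 0 to 99999, and the fixed seed are all assumptions chosen to keep the example short and reproducible.

```python
import random

MAX_DIGITS = 5  # from the model description


def make_example(rng):
    """Return one (question, answer) pair such as ("535+61", "596")."""
    # Assumption: operands are drawn uniformly from 0 .. 99999.
    a = rng.randint(0, 10**MAX_DIGITS - 1)
    b = rng.randint(0, 10**MAX_DIGITS - 1)
    return f"{a}+{b}", str(a + b)


def make_dataset(n_examples, seed=0):
    """Generate `n_examples` pairs with a seeded generator for reproducibility."""
    rng = random.Random(seed)
    return [make_example(rng) for _ in range(n_examples)]


dataset = make_dataset(1000)
```

Note that uniform sampling produces mostly 5-digit operands. A generator that first picks the number of digits and then the digits themselves would give short numbers such as "61" more weight.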
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 0.001
- train_batch_size: 32
- optimizer: {'name': 'Adam', 'decay': 0.0, 'beta_1': 0.9, 'beta_2': 0.999, 'epsilon': 1e-07, 'amsgrad': False}
- training_precision: float32
- num_epochs: 30
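For readers who want to reproduce a similar setup, the hyperparameters listed above could be mapped onto Keras roughly as follows. This is a sketch under assumptions: the `HPARAMS` dictionary and the `build_optimizer` helper are hypothetical names, TensorFlow is assumed to be installed, and the `decay: 0.0` entry is left out because it is a legacy argument that has no effect at zero.

```python
# Values copied from the hyperparameter list above.
HPARAMS = {
    "learning_rate": 0.001,
    "train_batch_size": 32,
    "num_epochs": 30,
    "beta_1": 0.9,
    "beta_2": 0.999,
    "epsilon": 1e-07,
    "amsgrad": False,
}


def build_optimizer(hparams=HPARAMS):
    """Create an Adam optimizer from the listed settings (requires TensorFlow)."""
    from tensorflow import keras  # imported lazily so the dict is usable without TF

    return keras.optimizers.Adam(
        learning_rate=hparams["learning_rate"],
        beta_1=hparams["beta_1"],
        beta_2=hparams["beta_2"],
        epsilon=hparams["epsilon"],
        amsgrad=hparams["amsgrad"],
    )
```

The batch size and epoch count would then be passed to `model.fit(..., batch_size=HPARAMS["train_batch_size"], epochs=HPARAMS["num_epochs"])` on whatever model is being trained.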
Training Metrics
Epoch | Train Loss | Train Accuracy | Validation Loss | Validation Accuracy |
---|---|---|---|---|
1 | 0.071 | 0.977 | 0.12 | 0.957 |
2 | 0.063 | 0.98 | 0.102 | 0.963 |
3 | 0.061 | 0.98 | 0.094 | 0.967 |
4 | 0.051 | 0.983 | 0.091 | 0.969 |
5 | 0.052 | 0.983 | 0.093 | 0.964 |
6 | 0.05 | 0.983 | 0.073 | 0.975 |
7 | 0.039 | 0.988 | 0.069 | 0.976 |
8 | 0.054 | 0.983 | 0.103 | 0.965 |
9 | 0.04 | 0.987 | 0.063 | 0.977 |
10 | 0.033 | 0.99 | 0.141 | 0.953 |
11 | 0.037 | 0.989 | 0.083 | 0.971 |
12 | 0.04 | 0.987 | 0.069 | 0.976 |
13 | 0.027 | 0.992 | 0.053 | 0.98 |
14 | 0.03 | 0.991 | 0.071 | 0.974 |
15 | 0.03 | 0.991 | 0.061 | 0.979 |
16 | 0.029 | 0.991 | 0.048 | 0.982 |
17 | 0.037 | 0.989 | 0.091 | 0.97 |
18 | 0.023 | 0.993 | 0.039 | 0.987 |
19 | 0.028 | 0.991 | 0.058 | 0.981 |
20 | 0.022 | 0.994 | 0.057 | 0.98 |
21 | 0.023 | 0.993 | 0.038 | 0.987 |
22 | 0.034 | 0.99 | 0.054 | 0.982 |
23 | 0.026 | 0.993 | 0.12 | 0.959 |
24 | 0.027 | 0.992 | 0.034 | 0.989 |
25 | 0.022 | 0.993 | 0.047 | 0.984 |
26 | 0.02 | 0.994 | 0.062 | 0.978 |
27 | 0.024 | 0.993 | 0.043 | 0.985 |
28 | 0.019 | 0.994 | 0.057 | 0.979 |
29 | 0.017 | 0.995 | 0.054 | 0.982 |
30 | 0.021 | 0.994 | 0.033 | 0.989 |
Model Plot