Hannes Kuchelmeister committed on
Commit · b72a776
Parent(s): 3d5c288
add model to repository
Browse files
- model/LICENSE +21 -0
- model/README.md +378 -0
- model/base/__init__.py +3 -0
- model/base/base_data_loader.py +61 -0
- model/base/base_model.py +25 -0
- model/base/base_trainer.py +151 -0
- model/config.json +50 -0
- model/data_loader/data_loaders.py +16 -0
- model/logger/__init__.py +2 -0
- model/logger/logger.py +22 -0
- model/logger/logger_config.json +32 -0
- model/logger/visualization.py +73 -0
- model/model/loss.py +5 -0
- model/model/metric.py +20 -0
- model/model/model.py +22 -0
- model/new_project.py +18 -0
- model/parse_config.py +157 -0
- model/requirements.txt +5 -0
- model/test.py +81 -0
- model/train.py +73 -0
- model/trainer/__init__.py +1 -0
- model/trainer/trainer.py +110 -0
- model/utils/__init__.py +1 -0
- model/utils/util.py +67 -0
model/LICENSE
ADDED
@@ -0,0 +1,21 @@
MIT License

Copyright (c) 2018 Victor Huang

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
model/README.md
ADDED
@@ -0,0 +1,378 @@
# PyTorch Template Project
PyTorch deep learning project made easy.

<!-- @import "[TOC]" {cmd="toc" depthFrom=1 depthTo=6 orderedList=false} -->

<!-- code_chunk_output -->

* [PyTorch Template Project](#pytorch-template-project)
    * [Requirements](#requirements)
    * [Features](#features)
    * [Folder Structure](#folder-structure)
    * [Usage](#usage)
        * [Config file format](#config-file-format)
        * [Using config files](#using-config-files)
        * [Resuming from checkpoints](#resuming-from-checkpoints)
        * [Using Multiple GPU](#using-multiple-gpu)
    * [Customization](#customization)
        * [Custom CLI options](#custom-cli-options)
        * [Data Loader](#data-loader)
        * [Trainer](#trainer)
        * [Model](#model)
        * [Loss](#loss)
        * [Metrics](#metrics)
        * [Additional logging](#additional-logging)
        * [Validation data](#validation-data)
        * [Checkpoints](#checkpoints)
        * [Tensorboard Visualization](#tensorboard-visualization)
    * [Contribution](#contribution)
    * [TODOs](#todos)
    * [License](#license)
    * [Acknowledgements](#acknowledgements)

<!-- /code_chunk_output -->

## Requirements
* Python >= 3.5 (3.6 recommended)
* PyTorch >= 0.4 (1.2 recommended)
* tqdm (optional for `test.py`)
* tensorboard >= 1.14 (see [Tensorboard Visualization](#tensorboard-visualization))

## Features
* Clear folder structure which is suitable for many deep learning projects.
* `.json` config file support for convenient parameter tuning.
* Customizable command line options for more convenient parameter tuning.
* Checkpoint saving and resuming.
* Abstract base classes for faster development:
  * `BaseTrainer` handles checkpoint saving/resuming, training process logging, and more.
  * `BaseDataLoader` handles batch generation, data shuffling, and validation data splitting.
  * `BaseModel` provides basic model summary.

## Folder Structure
```
pytorch-template/
│
├── train.py - main script to start training
├── test.py - evaluation of trained model
│
├── config.json - holds configuration for training
├── parse_config.py - class to handle config file and cli options
│
├── new_project.py - initialize new project with template files
│
├── base/ - abstract base classes
│   ├── base_data_loader.py
│   ├── base_model.py
│   └── base_trainer.py
│
├── data_loader/ - anything about data loading goes here
│   └── data_loaders.py
│
├── data/ - default directory for storing input data
│
├── model/ - models, losses, and metrics
│   ├── model.py
│   ├── metric.py
│   └── loss.py
│
├── saved/
│   ├── models/ - trained models are saved here
│   └── log/ - default logdir for tensorboard and logging output
│
├── trainer/ - trainers
│   └── trainer.py
│
├── logger/ - module for tensorboard visualization and logging
│   ├── visualization.py
│   ├── logger.py
│   └── logger_config.json
│
└── utils/ - small utility functions
    ├── util.py
    └── ...
```

## Usage
The code in this repo is an MNIST example of the template.
Try `python train.py -c config.json` to run the code.

### Config file format
Config files are in `.json` format:
```javascript
{
  "name": "Mnist_LeNet",        // training session name
  "n_gpu": 1,                   // number of GPUs to use for training.

  "arch": {
    "type": "MnistModel",       // name of model architecture to train
    "args": {

    }
  },
  "data_loader": {
    "type": "MnistDataLoader",  // selecting data loader
    "args": {
      "data_dir": "data/",      // dataset path
      "batch_size": 64,         // batch size
      "shuffle": true,          // shuffle training data before splitting
      "validation_split": 0.1,  // size of validation dataset. float(portion) or int(number of samples)
      "num_workers": 2          // number of cpu processes to be used for data loading
    }
  },
  "optimizer": {
    "type": "Adam",
    "args": {
      "lr": 0.001,              // learning rate
      "weight_decay": 0,        // (optional) weight decay
      "amsgrad": true
    }
  },
  "loss": "nll_loss",           // loss
  "metrics": [
    "accuracy", "top_k_acc"     // list of metrics to evaluate
  ],
  "lr_scheduler": {
    "type": "StepLR",           // learning rate scheduler
    "args": {
      "step_size": 50,
      "gamma": 0.1
    }
  },
  "trainer": {
    "epochs": 100,              // number of training epochs
    "save_dir": "saved/",       // checkpoints are saved in save_dir/models/name
    "save_period": 1,           // save checkpoints every save_period epochs
    "verbosity": 2,             // 0: quiet, 1: per epoch, 2: full

    "monitor": "min val_loss",  // mode and metric for model performance monitoring. set 'off' to disable.
    "early_stop": 10,           // number of epochs to wait before early stop. set 0 to disable.

    "tensorboard": true         // enable tensorboard visualization
  }
}
```

Add additional configurations if you need them.

### Using config files
Modify the configurations in `.json` config files, then run:

```
python train.py --config config.json
```

### Resuming from checkpoints
You can resume from a previously saved checkpoint by:

```
python train.py --resume path/to/checkpoint
```

### Using Multiple GPU
You can enable multi-GPU training by setting the `n_gpu` argument of the config file to a larger number.
If it is configured to use fewer GPUs than are available, the first n devices will be used by default.
Specify the indices of available GPUs with the CUDA environment variable.
```
python train.py --device 2,3 -c config.json
```
This is equivalent to
```
CUDA_VISIBLE_DEVICES=2,3 python train.py -c config.json
```

## Customization

### Project initialization
Use the `new_project.py` script to make your new project directory with template files.
Run `python new_project.py ../NewProject` and a new project folder named 'NewProject' will be made.
This script will filter out unnecessary files like caches, git files, or the readme file.

### Custom CLI options

Changing the values in the config file is a clean, safe, and easy way of tuning hyperparameters. However, sometimes
it is better to have command-line options if some values need to be changed often or quickly.

This template uses the configurations stored in the json file by default, but by registering custom options as follows
you can change some of them using CLI flags.

```python
# simple class-like object having 3 attributes, `flags`, `type`, `target`.
CustomArgs = collections.namedtuple('CustomArgs', 'flags type target')
options = [
    CustomArgs(['--lr', '--learning_rate'], type=float, target=('optimizer', 'args', 'lr')),
    CustomArgs(['--bs', '--batch_size'], type=int, target=('data_loader', 'args', 'batch_size'))
    # options added here can be modified by command line flags.
]
```
The `target` argument should be a sequence of keys, which are used to access that option in the config dict. In this example, `target`
for the learning rate option is `('optimizer', 'args', 'lr')` because `config['optimizer']['args']['lr']` points to the learning rate.
`python train.py -c config.json --bs 256` runs training with the options given in `config.json`, except for the batch size,
which is increased to 256 by the command-line option.


### Data Loader
* **Writing your own data loader**

1. **Inherit ```BaseDataLoader```**

    `BaseDataLoader` is a subclass of `torch.utils.data.DataLoader`; you can use either of them.

    `BaseDataLoader` handles:
    * Generating next batch
    * Data shuffling
    * Generating validation data loader by calling
    `BaseDataLoader.split_validation()`

* **DataLoader Usage**

  `BaseDataLoader` is an iterator; to iterate through batches:
  ```python
  for batch_idx, (x_batch, y_batch) in enumerate(data_loader):
      pass
  ```
* **Example**

  Please refer to `data_loader/data_loaders.py` for an MNIST data loading example.

### Trainer
* **Writing your own trainer**

1. **Inherit ```BaseTrainer```**

    `BaseTrainer` handles:
    * Training process logging
    * Checkpoint saving
    * Checkpoint resuming
    * Reconfigurable performance monitoring for saving the current best model, and early stopping of training.
      * If config `monitor` is set to `max val_accuracy`, the trainer will save a checkpoint `model_best.pth` whenever the epoch's validation accuracy exceeds the current maximum.
      * If config `early_stop` is set, training will be automatically terminated when model performance does not improve for the given number of epochs. This feature can be turned off by passing 0 to the `early_stop` option, or by deleting that line from the config.

2. **Implementing abstract methods**

    You need to implement `_train_epoch()` for your training process; if you need validation, implement `_valid_epoch()` as in `trainer/trainer.py`.

* **Example**

  Please refer to `trainer/trainer.py` for MNIST training.

* **Iteration-based training**

  `Trainer.__init__` takes an optional argument `len_epoch`, which controls the number of batches (steps) in each epoch.

### Model
* **Writing your own model**

1. **Inherit `BaseModel`**

    `BaseModel` handles:
    * Inherited from `torch.nn.Module`
    * `__str__`: modifies the native `print` output to also show the number of trainable parameters.

2. **Implementing abstract methods**

    Implement the forward pass method `forward()`.

* **Example**

  Please refer to `model/model.py` for a LeNet example.

### Loss
Custom loss functions can be implemented in `model/loss.py`. Use them by changing the name given under "loss" in the config file to the corresponding function name.

### Metrics
Metric functions are located in `model/metric.py`.

You can monitor multiple metrics by providing a list in the configuration file, e.g.:
```json
"metrics": ["accuracy", "top_k_acc"],
```

### Additional logging
If you have additional information to be logged, merge it with `log` in `_train_epoch()` of your trainer class as shown below before returning:

```python
additional_log = {"gradient_norm": g, "sensitivity": s}
log.update(additional_log)
return log
```

### Testing
You can test a trained model by running `test.py` and passing the path to the trained checkpoint with the `--resume` argument.

### Validation data
To split validation data from a data loader, call `BaseDataLoader.split_validation()`; it will return a data loader for validation, with the size specified in your config file.
The `validation_split` can be a ratio of the validation set to the total data (0.0 <= float < 1.0), or the number of samples (0 <= int < `n_total_samples`).

**Note**: the `split_validation()` method will modify the original data loader.
**Note**: `split_validation()` will return `None` if `"validation_split"` is set to `0`.

### Checkpoints
You can specify the name of the training session in config files:
```json
"name": "MNIST_LeNet",
```

The checkpoints will be saved in `save_dir/name/timestamp/checkpoint_epoch_n`, with the timestamp in mmdd_HHMMSS format.

A copy of the config file will be saved in the same folder.

**Note**: checkpoints contain:
```python
{
  'arch': arch,
  'epoch': epoch,
  'state_dict': self.model.state_dict(),
  'optimizer': self.optimizer.state_dict(),
  'monitor_best': self.mnt_best,
  'config': self.config
}
```

### Tensorboard Visualization
This template supports Tensorboard visualization by using either `torch.utils.tensorboard` or [TensorboardX](https://github.com/lanpa/tensorboardX).

1. **Install**

    If you are using pytorch 1.1 or higher, install tensorboard with `pip install tensorboard>=1.14.0`.

    Otherwise, you should install tensorboardX. Follow the installation guide in [TensorboardX](https://github.com/lanpa/tensorboardX).

2. **Run training**

    Make sure that the `tensorboard` option in the config file is turned on.

    ```
    "tensorboard" : true
    ```

3. **Open Tensorboard server**

    Type `tensorboard --logdir saved/log/` at the project root, then the server will open at `http://localhost:6006`.

By default, values of loss and metrics specified in the config file, input images, and histograms of model parameters will be logged.
If you need more visualizations, use `add_scalar('tag', data)`, `add_image('tag', image)`, etc. in the `trainer._train_epoch` method.
The `add_something()` methods in this template are basically wrappers for those of the `tensorboardX.SummaryWriter` and `torch.utils.tensorboard.SummaryWriter` modules.

**Note**: You don't have to specify current steps, since the `TensorboardWriter` class defined in `logger/visualization.py` will track the current step.

## Contribution
Feel free to contribute any kind of function or enhancement; the coding style here follows PEP8.

Code should pass the [Flake8](http://flake8.pycqa.org/en/latest/) check before committing.

## TODOs

- [ ] Multiple optimizers
- [ ] Support more tensorboard functions
- [x] Using fixed random seed
- [x] Support pytorch native tensorboard
- [x] `tensorboardX` logger support
- [x] Configurable logging layout, checkpoint naming
- [x] Iteration-based training (instead of epoch-based)
- [x] Adding command line option for fine-tuning

## License
This project is licensed under the MIT License. See LICENSE for more details.

## Acknowledgements
This project is inspired by the project [Tensorflow-Project-Template](https://github.com/MrGemy95/Tensorflow-Project-Template) by [Mahmoud Gemy](https://github.com/MrGemy95)
model/base/__init__.py
ADDED
@@ -0,0 +1,3 @@
from .base_data_loader import *
from .base_model import *
from .base_trainer import *
model/base/base_data_loader.py
ADDED
@@ -0,0 +1,61 @@
import numpy as np
from torch.utils.data import DataLoader
from torch.utils.data.dataloader import default_collate
from torch.utils.data.sampler import SubsetRandomSampler


class BaseDataLoader(DataLoader):
    """
    Base class for all data loaders
    """
    def __init__(self, dataset, batch_size, shuffle, validation_split, num_workers, collate_fn=default_collate):
        self.validation_split = validation_split
        self.shuffle = shuffle

        self.batch_idx = 0
        self.n_samples = len(dataset)

        self.sampler, self.valid_sampler = self._split_sampler(self.validation_split)

        self.init_kwargs = {
            'dataset': dataset,
            'batch_size': batch_size,
            'shuffle': self.shuffle,
            'collate_fn': collate_fn,
            'num_workers': num_workers
        }
        super().__init__(sampler=self.sampler, **self.init_kwargs)

    def _split_sampler(self, split):
        if split == 0.0:
            return None, None

        idx_full = np.arange(self.n_samples)

        np.random.seed(0)
        np.random.shuffle(idx_full)

        if isinstance(split, int):
            assert split > 0
            assert split < self.n_samples, "validation set size is configured to be larger than entire dataset."
            len_valid = split
        else:
            len_valid = int(self.n_samples * split)

        valid_idx = idx_full[0:len_valid]
        train_idx = np.delete(idx_full, np.arange(0, len_valid))

        train_sampler = SubsetRandomSampler(train_idx)
        valid_sampler = SubsetRandomSampler(valid_idx)

        # turn off shuffle option which is mutually exclusive with sampler
        self.shuffle = False
        self.n_samples = len(train_idx)

        return train_sampler, valid_sampler

    def split_validation(self):
        if self.valid_sampler is None:
            return None
        else:
            return DataLoader(sampler=self.valid_sampler, **self.init_kwargs)
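`BaseDataLoader` can be reused for datasets other than MNIST by passing any `torch.utils.data.Dataset` to its constructor. Below is a minimal sketch, not part of this commit, with a hypothetical class name and random stand-in data; it assumes the `base` package above is on the import path.

```python
import torch
from torch.utils.data import TensorDataset

from base import BaseDataLoader


class RandomTensorDataLoader(BaseDataLoader):
    """Hypothetical loader over random tensors; swap in a real Dataset."""
    def __init__(self, batch_size, shuffle=True, validation_split=0.1, num_workers=1):
        x = torch.randn(1000, 1, 28, 28)           # stand-in images
        y = torch.randint(0, 10, (1000,))          # stand-in labels
        super().__init__(TensorDataset(x, y), batch_size, shuffle, validation_split, num_workers)


loader = RandomTensorDataLoader(batch_size=32)     # training batches
valid_loader = loader.split_validation()           # held-out 10% as a separate DataLoader
```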
model/base/base_model.py
ADDED
@@ -0,0 +1,25 @@
import torch.nn as nn
import numpy as np
from abc import abstractmethod


class BaseModel(nn.Module):
    """
    Base class for all models
    """
    @abstractmethod
    def forward(self, *inputs):
        """
        Forward pass logic

        :return: Model output
        """
        raise NotImplementedError

    def __str__(self):
        """
        Model prints with number of trainable parameters
        """
        model_parameters = filter(lambda p: p.requires_grad, self.parameters())
        params = sum([np.prod(p.size()) for p in model_parameters])
        return super().__str__() + '\nTrainable parameters: {}'.format(params)
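A model only needs to subclass `BaseModel` and implement `forward()`; the inherited `__str__` then reports the trainable parameter count. A minimal, hypothetical sketch (not part of this commit):

```python
import torch.nn as nn
import torch.nn.functional as F

from base import BaseModel


class TinyMLP(BaseModel):
    """Hypothetical two-layer MLP for flattened 28x28 inputs."""
    def __init__(self, num_classes=10):
        super().__init__()
        self.fc1 = nn.Linear(28 * 28, 64)
        self.fc2 = nn.Linear(64, num_classes)

    def forward(self, x):
        x = x.view(x.size(0), -1)                  # flatten to (batch, 784)
        return F.log_softmax(self.fc2(F.relu(self.fc1(x))), dim=1)


print(TinyMLP())   # summary from BaseModel includes 'Trainable parameters: ...'
```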
model/base/base_trainer.py
ADDED
@@ -0,0 +1,151 @@
import torch
from abc import abstractmethod
from numpy import inf
from logger import TensorboardWriter


class BaseTrainer:
    """
    Base class for all trainers
    """
    def __init__(self, model, criterion, metric_ftns, optimizer, config):
        self.config = config
        self.logger = config.get_logger('trainer', config['trainer']['verbosity'])

        self.model = model
        self.criterion = criterion
        self.metric_ftns = metric_ftns
        self.optimizer = optimizer

        cfg_trainer = config['trainer']
        self.epochs = cfg_trainer['epochs']
        self.save_period = cfg_trainer['save_period']
        self.monitor = cfg_trainer.get('monitor', 'off')

        # configuration to monitor model performance and save best
        if self.monitor == 'off':
            self.mnt_mode = 'off'
            self.mnt_best = 0
        else:
            self.mnt_mode, self.mnt_metric = self.monitor.split()
            assert self.mnt_mode in ['min', 'max']

            self.mnt_best = inf if self.mnt_mode == 'min' else -inf
            self.early_stop = cfg_trainer.get('early_stop', inf)
            if self.early_stop <= 0:
                self.early_stop = inf

        self.start_epoch = 1

        self.checkpoint_dir = config.save_dir

        # setup visualization writer instance
        self.writer = TensorboardWriter(config.log_dir, self.logger, cfg_trainer['tensorboard'])

        if config.resume is not None:
            self._resume_checkpoint(config.resume)

    @abstractmethod
    def _train_epoch(self, epoch):
        """
        Training logic for an epoch

        :param epoch: Current epoch number
        """
        raise NotImplementedError

    def train(self):
        """
        Full training logic
        """
        not_improved_count = 0
        for epoch in range(self.start_epoch, self.epochs + 1):
            result = self._train_epoch(epoch)

            # save logged informations into log dict
            log = {'epoch': epoch}
            log.update(result)

            # print logged informations to the screen
            for key, value in log.items():
                self.logger.info('    {:15s}: {}'.format(str(key), value))

            # evaluate model performance according to configured metric, save best checkpoint as model_best
            best = False
            if self.mnt_mode != 'off':
                try:
                    # check whether model performance improved or not, according to specified metric(mnt_metric)
                    improved = (self.mnt_mode == 'min' and log[self.mnt_metric] <= self.mnt_best) or \
                               (self.mnt_mode == 'max' and log[self.mnt_metric] >= self.mnt_best)
                except KeyError:
                    self.logger.warning("Warning: Metric '{}' is not found. "
                                        "Model performance monitoring is disabled.".format(self.mnt_metric))
                    self.mnt_mode = 'off'
                    improved = False

                if improved:
                    self.mnt_best = log[self.mnt_metric]
                    not_improved_count = 0
                    best = True
                else:
                    not_improved_count += 1

                if not_improved_count > self.early_stop:
                    self.logger.info("Validation performance didn\'t improve for {} epochs. "
                                     "Training stops.".format(self.early_stop))
                    break

            if epoch % self.save_period == 0:
                self._save_checkpoint(epoch, save_best=best)

    def _save_checkpoint(self, epoch, save_best=False):
        """
        Saving checkpoints

        :param epoch: current epoch number
        :param log: logging information of the epoch
        :param save_best: if True, rename the saved checkpoint to 'model_best.pth'
        """
        arch = type(self.model).__name__
        state = {
            'arch': arch,
            'epoch': epoch,
            'state_dict': self.model.state_dict(),
            'optimizer': self.optimizer.state_dict(),
            'monitor_best': self.mnt_best,
            'config': self.config
        }
        filename = str(self.checkpoint_dir / 'checkpoint-epoch{}.pth'.format(epoch))
        torch.save(state, filename)
        self.logger.info("Saving checkpoint: {} ...".format(filename))
        if save_best:
            best_path = str(self.checkpoint_dir / 'model_best.pth')
            torch.save(state, best_path)
            self.logger.info("Saving current best: model_best.pth ...")

    def _resume_checkpoint(self, resume_path):
        """
        Resume from saved checkpoints

        :param resume_path: Checkpoint path to be resumed
        """
        resume_path = str(resume_path)
        self.logger.info("Loading checkpoint: {} ...".format(resume_path))
        checkpoint = torch.load(resume_path)
        self.start_epoch = checkpoint['epoch'] + 1
        self.mnt_best = checkpoint['monitor_best']

        # load architecture params from checkpoint.
        if checkpoint['config']['arch'] != self.config['arch']:
            self.logger.warning("Warning: Architecture configuration given in config file is different from that of "
                                "checkpoint. This may yield an exception while state_dict is being loaded.")
        self.model.load_state_dict(checkpoint['state_dict'])

        # load optimizer state from checkpoint only when optimizer type is not changed.
        if checkpoint['config']['optimizer']['type'] != self.config['optimizer']['type']:
            self.logger.warning("Warning: Optimizer type given in config file is different from that of checkpoint. "
                                "Optimizer parameters not being resumed.")
        else:
            self.optimizer.load_state_dict(checkpoint['optimizer'])

        self.logger.info("Checkpoint loaded. Resume training from epoch {}".format(self.start_epoch))
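`BaseTrainer.train()` only relies on `_train_epoch(epoch)` returning a dict of values; the keys of that dict are what the `"monitor"` entry in `config.json` (e.g. `min val_loss`) can refer to. The sketch below is a hypothetical, stripped-down subclass, not this commit's `Trainer`, kept only to show that contract.

```python
from base import BaseTrainer


class MinimalTrainer(BaseTrainer):
    """Hypothetical trainer showing only the _train_epoch contract."""
    def __init__(self, model, criterion, metric_ftns, optimizer, config, device, data_loader):
        super().__init__(model, criterion, metric_ftns, optimizer, config)
        self.device = device
        self.data_loader = data_loader

    def _train_epoch(self, epoch):
        self.model.train()
        running_loss = 0.0
        for data, target in self.data_loader:
            data, target = data.to(self.device), target.to(self.device)
            self.optimizer.zero_grad()
            loss = self.criterion(self.model(data), target)
            loss.backward()
            self.optimizer.step()
            running_loss += loss.item()
        # keys returned here become the entries BaseTrainer logs and monitors
        return {'loss': running_loss / len(self.data_loader)}
```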
model/config.json
ADDED
@@ -0,0 +1,50 @@
{
    "name": "Mnist_LeNet",
    "n_gpu": 1,

    "arch": {
        "type": "MnistModel",
        "args": {}
    },
    "data_loader": {
        "type": "MnistDataLoader",
        "args":{
            "data_dir": "data/",
            "batch_size": 128,
            "shuffle": true,
            "validation_split": 0.1,
            "num_workers": 2
        }
    },
    "optimizer": {
        "type": "Adam",
        "args":{
            "lr": 0.001,
            "weight_decay": 0,
            "amsgrad": true
        }
    },
    "loss": "nll_loss",
    "metrics": [
        "accuracy", "top_k_acc"
    ],
    "lr_scheduler": {
        "type": "StepLR",
        "args": {
            "step_size": 50,
            "gamma": 0.1
        }
    },
    "trainer": {
        "epochs": 100,

        "save_dir": "saved/",
        "save_period": 1,
        "verbosity": 2,

        "monitor": "min val_loss",
        "early_stop": 10,

        "tensorboard": true
    }
}
model/data_loader/data_loaders.py
ADDED
@@ -0,0 +1,16 @@
from torchvision import datasets, transforms
from base import BaseDataLoader


class MnistDataLoader(BaseDataLoader):
    """
    MNIST data loading demo using BaseDataLoader
    """
    def __init__(self, data_dir, batch_size, shuffle=True, validation_split=0.0, num_workers=1, training=True):
        trsfm = transforms.Compose([
            transforms.ToTensor(),
            transforms.Normalize((0.1307,), (0.3081,))
        ])
        self.data_dir = data_dir
        self.dataset = datasets.MNIST(self.data_dir, train=training, download=True, transform=trsfm)
        super().__init__(self.dataset, batch_size, shuffle, validation_split, num_workers)
model/logger/__init__.py
ADDED
@@ -0,0 +1,2 @@
from .logger import *
from .visualization import *
model/logger/logger.py
ADDED
@@ -0,0 +1,22 @@
import logging
import logging.config
from pathlib import Path
from utils import read_json


def setup_logging(save_dir, log_config='logger/logger_config.json', default_level=logging.INFO):
    """
    Setup logging configuration
    """
    log_config = Path(log_config)
    if log_config.is_file():
        config = read_json(log_config)
        # modify logging paths based on run config
        for _, handler in config['handlers'].items():
            if 'filename' in handler:
                handler['filename'] = str(save_dir / handler['filename'])

        logging.config.dictConfig(config)
    else:
        print("Warning: logging configuration file is not found in {}.".format(log_config))
        logging.basicConfig(level=default_level)
model/logger/logger_config.json
ADDED
@@ -0,0 +1,32 @@
{
    "version": 1,
    "disable_existing_loggers": false,
    "formatters": {
        "simple": {"format": "%(message)s"},
        "datetime": {"format": "%(asctime)s - %(name)s - %(levelname)s - %(message)s"}
    },
    "handlers": {
        "console": {
            "class": "logging.StreamHandler",
            "level": "DEBUG",
            "formatter": "simple",
            "stream": "ext://sys.stdout"
        },
        "info_file_handler": {
            "class": "logging.handlers.RotatingFileHandler",
            "level": "INFO",
            "formatter": "datetime",
            "filename": "info.log",
            "maxBytes": 10485760,
            "backupCount": 20, "encoding": "utf8"
        }
    },
    "root": {
        "level": "INFO",
        "handlers": [
            "console",
            "info_file_handler"
        ]
    }
}
model/logger/visualization.py
ADDED
@@ -0,0 +1,73 @@
import importlib
from datetime import datetime


class TensorboardWriter():
    def __init__(self, log_dir, logger, enabled):
        self.writer = None
        self.selected_module = ""

        if enabled:
            log_dir = str(log_dir)

            # Retrieve visualization writer.
            succeeded = False
            for module in ["torch.utils.tensorboard", "tensorboardX"]:
                try:
                    self.writer = importlib.import_module(module).SummaryWriter(log_dir)
                    succeeded = True
                    break
                except ImportError:
                    succeeded = False
                self.selected_module = module

            if not succeeded:
                message = "Warning: visualization (Tensorboard) is configured to use, but currently not installed on " \
                    "this machine. Please install TensorboardX with 'pip install tensorboardx', upgrade PyTorch to " \
                    "version >= 1.1 to use 'torch.utils.tensorboard' or turn off the option in the 'config.json' file."
                logger.warning(message)

        self.step = 0
        self.mode = ''

        self.tb_writer_ftns = {
            'add_scalar', 'add_scalars', 'add_image', 'add_images', 'add_audio',
            'add_text', 'add_histogram', 'add_pr_curve', 'add_embedding'
        }
        self.tag_mode_exceptions = {'add_histogram', 'add_embedding'}
        self.timer = datetime.now()

    def set_step(self, step, mode='train'):
        self.mode = mode
        self.step = step
        if step == 0:
            self.timer = datetime.now()
        else:
            duration = datetime.now() - self.timer
            self.add_scalar('steps_per_sec', 1 / duration.total_seconds())
            self.timer = datetime.now()

    def __getattr__(self, name):
        """
        If visualization is configured to use:
            return add_data() methods of tensorboard with additional information (step, tag) added.
        Otherwise:
            return a blank function handle that does nothing
        """
        if name in self.tb_writer_ftns:
            add_data = getattr(self.writer, name, None)

            def wrapper(tag, data, *args, **kwargs):
                if add_data is not None:
                    # add mode(train/valid) tag
                    if name not in self.tag_mode_exceptions:
                        tag = '{}/{}'.format(tag, self.mode)
                    add_data(tag, data, self.step, *args, **kwargs)
            return wrapper
        else:
            # default action for returning methods defined in this class, set_step() for instance.
            try:
                attr = object.__getattr__(name)
            except AttributeError:
                raise AttributeError("type object '{}' has no attribute '{}'".format(self.selected_module, name))
            return attr
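A short usage sketch of `TensorboardWriter` with illustrative values: `set_step()` fixes the global step and mode, and later `add_*` calls are forwarded to the underlying `SummaryWriter` at that step, with a `/train` or `/valid` suffix appended to the tag (except for histograms and embeddings).

```python
import logging

import torch
from logger import TensorboardWriter

logger = logging.getLogger('demo')                         # any logging.Logger works here
writer = TensorboardWriter('saved/log/demo', logger, enabled=True)

writer.set_step(10, mode='train')                          # later add_* calls log at step 10
writer.add_scalar('loss', 0.42)                            # forwarded as 'loss/train' at step 10
writer.add_histogram('weights', torch.randn(100))          # histograms skip the mode suffix
```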
model/model/loss.py
ADDED
@@ -0,0 +1,5 @@
import torch.nn.functional as F


def nll_loss(output, target):
    return F.nll_loss(output, target)
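Adding another loss is a matter of defining a function with the same `(output, target)` signature in this file; `train.py` resolves it by name via `getattr(module_loss, config['loss'])`, so pointing `"loss"` in `config.json` at the new name is enough. A hypothetical example, not part of this commit:

```python
import torch.nn.functional as F


def cross_entropy_loss(output, target):
    # expects raw logits, unlike nll_loss which expects log-probabilities
    return F.cross_entropy(output, target)
```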
model/model/metric.py
ADDED
@@ -0,0 +1,20 @@
import torch


def accuracy(output, target):
    with torch.no_grad():
        pred = torch.argmax(output, dim=1)
        assert pred.shape[0] == len(target)
        correct = 0
        correct += torch.sum(pred == target).item()
    return correct / len(target)


def top_k_acc(output, target, k=3):
    with torch.no_grad():
        pred = torch.topk(output, k, dim=1)[1]
        assert pred.shape[0] == len(target)
        correct = 0
        for i in range(k):
            correct += torch.sum(pred[:, i] == target).item()
    return correct / len(target)
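Custom metrics follow the same pattern: a plain function taking `(output, target)` and returning a scalar, listed by name under `"metrics"` in `config.json`. A hypothetical addition, not part of this commit:

```python
import torch


def error_rate(output, target):
    with torch.no_grad():
        pred = torch.argmax(output, dim=1)
        wrong = torch.sum(pred != target).item()
    return wrong / len(target)
```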
model/model/model.py
ADDED
@@ -0,0 +1,22 @@
import torch.nn as nn
import torch.nn.functional as F
from base import BaseModel


class MnistModel(BaseModel):
    def __init__(self, num_classes=10):
        super().__init__()
        self.conv1 = nn.Conv2d(1, 10, kernel_size=5)
        self.conv2 = nn.Conv2d(10, 20, kernel_size=5)
        self.conv2_drop = nn.Dropout2d()
        self.fc1 = nn.Linear(320, 50)
        self.fc2 = nn.Linear(50, num_classes)

    def forward(self, x):
        x = F.relu(F.max_pool2d(self.conv1(x), 2))
        x = F.relu(F.max_pool2d(self.conv2_drop(self.conv2(x)), 2))
        x = x.view(-1, 320)
        x = F.relu(self.fc1(x))
        x = F.dropout(x, training=self.training)
        x = self.fc2(x)
        return F.log_softmax(x, dim=1)
model/new_project.py
ADDED
@@ -0,0 +1,18 @@
import sys
from pathlib import Path
from shutil import copytree, ignore_patterns


# This script initializes a new pytorch project with the template files.
# Run `python3 new_project.py ../MyNewProject` and a new project named
# MyNewProject will be made.
current_dir = Path()
assert (current_dir / 'new_project.py').is_file(), 'Script should be executed in the pytorch-template directory'
assert len(sys.argv) == 2, 'Specify a name for the new project. Example: python3 new_project.py MyNewProject'

project_name = Path(sys.argv[1])
target_dir = current_dir / project_name

ignore = [".git", "data", "saved", "new_project.py", "LICENSE", ".flake8", "README.md", "__pycache__"]
copytree(current_dir, target_dir, ignore=ignore_patterns(*ignore))
print('New project initialized at', target_dir.absolute().resolve())
model/parse_config.py
ADDED
@@ -0,0 +1,157 @@
import os
import logging
from pathlib import Path
from functools import reduce, partial
from operator import getitem
from datetime import datetime
from logger import setup_logging
from utils import read_json, write_json


class ConfigParser:
    def __init__(self, config, resume=None, modification=None, run_id=None):
        """
        class to parse configuration json file. Handles hyperparameters for training, initializations of modules, checkpoint saving
        and logging module.
        :param config: Dict containing configurations, hyperparameters for training. contents of `config.json` file for example.
        :param resume: String, path to the checkpoint being loaded.
        :param modification: Dict keychain:value, specifying position values to be replaced from config dict.
        :param run_id: Unique Identifier for training processes. Used to save checkpoints and training log. Timestamp is being used as default
        """
        # load config file and apply modification
        self._config = _update_config(config, modification)
        self.resume = resume

        # set save_dir where trained model and log will be saved.
        save_dir = Path(self.config['trainer']['save_dir'])

        exper_name = self.config['name']
        if run_id is None: # use timestamp as default run-id
            run_id = datetime.now().strftime(r'%m%d_%H%M%S')
        self._save_dir = save_dir / 'models' / exper_name / run_id
        self._log_dir = save_dir / 'log' / exper_name / run_id

        # make directory for saving checkpoints and log.
        exist_ok = run_id == ''
        self.save_dir.mkdir(parents=True, exist_ok=exist_ok)
        self.log_dir.mkdir(parents=True, exist_ok=exist_ok)

        # save updated config file to the checkpoint dir
        write_json(self.config, self.save_dir / 'config.json')

        # configure logging module
        setup_logging(self.log_dir)
        self.log_levels = {
            0: logging.WARNING,
            1: logging.INFO,
            2: logging.DEBUG
        }

    @classmethod
    def from_args(cls, args, options=''):
        """
        Initialize this class from some cli arguments. Used in train, test.
        """
        for opt in options:
            args.add_argument(*opt.flags, default=None, type=opt.type)
        if not isinstance(args, tuple):
            args = args.parse_args()

        if args.device is not None:
            os.environ["CUDA_VISIBLE_DEVICES"] = args.device
        if args.resume is not None:
            resume = Path(args.resume)
            cfg_fname = resume.parent / 'config.json'
        else:
            msg_no_cfg = "Configuration file need to be specified. Add '-c config.json', for example."
            assert args.config is not None, msg_no_cfg
            resume = None
            cfg_fname = Path(args.config)

        config = read_json(cfg_fname)
        if args.config and resume:
            # update new config for fine-tuning
            config.update(read_json(args.config))

        # parse custom cli options into dictionary
        modification = {opt.target : getattr(args, _get_opt_name(opt.flags)) for opt in options}
        return cls(config, resume, modification)

    def init_obj(self, name, module, *args, **kwargs):
        """
        Finds a function handle with the name given as 'type' in config, and returns the
        instance initialized with corresponding arguments given.

        `object = config.init_obj('name', module, a, b=1)`
        is equivalent to
        `object = module.name(a, b=1)`
        """
        module_name = self[name]['type']
        module_args = dict(self[name]['args'])
        assert all([k not in module_args for k in kwargs]), 'Overwriting kwargs given in config file is not allowed'
        module_args.update(kwargs)
        return getattr(module, module_name)(*args, **module_args)

    def init_ftn(self, name, module, *args, **kwargs):
        """
        Finds a function handle with the name given as 'type' in config, and returns the
        function with given arguments fixed with functools.partial.

        `function = config.init_ftn('name', module, a, b=1)`
        is equivalent to
        `function = lambda *args, **kwargs: module.name(a, *args, b=1, **kwargs)`.
        """
        module_name = self[name]['type']
        module_args = dict(self[name]['args'])
        assert all([k not in module_args for k in kwargs]), 'Overwriting kwargs given in config file is not allowed'
        module_args.update(kwargs)
        return partial(getattr(module, module_name), *args, **module_args)

    def __getitem__(self, name):
        """Access items like ordinary dict."""
        return self.config[name]

    def get_logger(self, name, verbosity=2):
        msg_verbosity = 'verbosity option {} is invalid. Valid options are {}.'.format(verbosity, self.log_levels.keys())
        assert verbosity in self.log_levels, msg_verbosity
        logger = logging.getLogger(name)
        logger.setLevel(self.log_levels[verbosity])
        return logger

    # setting read-only attributes
    @property
    def config(self):
        return self._config

    @property
    def save_dir(self):
        return self._save_dir

    @property
    def log_dir(self):
        return self._log_dir

# helper functions to update config dict with custom cli options
def _update_config(config, modification):
    if modification is None:
        return config

    for k, v in modification.items():
        if v is not None:
            _set_by_path(config, k, v)
    return config

def _get_opt_name(flags):
    for flg in flags:
        if flg.startswith('--'):
            return flg.replace('--', '')
    return flags[0].replace('--', '')

def _set_by_path(tree, keys, value):
    """Set a value in a nested object in tree by sequence of keys."""
    keys = keys.split(';')
    _get_by_path(tree, keys[:-1])[keys[-1]] = value

def _get_by_path(tree, keys):
    """Access a nested object in tree by sequence of keys."""
    return reduce(getitem, keys, tree)
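A hedged sketch of using `ConfigParser` outside of `train.py`/`test.py`; it assumes running from the project root with the `config.json` above (constructing it creates the `saved/` directories), and the parameter list is only a stand-in for real model parameters.

```python
import torch

from parse_config import ConfigParser
from utils import read_json

config = ConfigParser(read_json('config.json'))            # creates saved/models/... and saved/log/...
params = [torch.nn.Parameter(torch.zeros(1))]              # stand-in for model.parameters()

optimizer = config.init_obj('optimizer', torch.optim, params)
# per the docstring, equivalent to:
#   torch.optim.Adam(params, lr=0.001, weight_decay=0, amsgrad=True)

logger = config.get_logger('demo', verbosity=2)            # 0 -> WARNING, 1 -> INFO, 2 -> DEBUG
logger.debug('built optimizer %s', type(optimizer).__name__)
```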
model/requirements.txt
ADDED
@@ -0,0 +1,5 @@
torch>=1.1
torchvision
numpy
tqdm
tensorboard>=1.14
model/test.py
ADDED
@@ -0,0 +1,81 @@
import argparse
import torch
from tqdm import tqdm
import data_loader.data_loaders as module_data
import model.loss as module_loss
import model.metric as module_metric
import model.model as module_arch
from parse_config import ConfigParser


def main(config):
    logger = config.get_logger('test')

    # setup data_loader instances
    data_loader = getattr(module_data, config['data_loader']['type'])(
        config['data_loader']['args']['data_dir'],
        batch_size=512,
        shuffle=False,
        validation_split=0.0,
        training=False,
        num_workers=2
    )

    # build model architecture
    model = config.init_obj('arch', module_arch)
    logger.info(model)

    # get function handles of loss and metrics
    loss_fn = getattr(module_loss, config['loss'])
    metric_fns = [getattr(module_metric, met) for met in config['metrics']]

    logger.info('Loading checkpoint: {} ...'.format(config.resume))
    checkpoint = torch.load(config.resume)
    state_dict = checkpoint['state_dict']
    if config['n_gpu'] > 1:
        model = torch.nn.DataParallel(model)
    model.load_state_dict(state_dict)

    # prepare model for testing
    device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
    model = model.to(device)
    model.eval()

    total_loss = 0.0
    total_metrics = torch.zeros(len(metric_fns))

    with torch.no_grad():
        for i, (data, target) in enumerate(tqdm(data_loader)):
            data, target = data.to(device), target.to(device)
            output = model(data)

            #
            # save sample images, or do something with output here
            #

            # computing loss, metrics on test set
            loss = loss_fn(output, target)
            batch_size = data.shape[0]
            total_loss += loss.item() * batch_size
            for i, metric in enumerate(metric_fns):
                total_metrics[i] += metric(output, target) * batch_size

    n_samples = len(data_loader.sampler)
    log = {'loss': total_loss / n_samples}
    log.update({
        met.__name__: total_metrics[i].item() / n_samples for i, met in enumerate(metric_fns)
    })
    logger.info(log)


if __name__ == '__main__':
    args = argparse.ArgumentParser(description='PyTorch Template')
    args.add_argument('-c', '--config', default=None, type=str,
                      help='config file path (default: None)')
    args.add_argument('-r', '--resume', default=None, type=str,
                      help='path to latest checkpoint (default: None)')
    args.add_argument('-d', '--device', default=None, type=str,
                      help='indices of GPUs to enable (default: all)')

    config = ConfigParser.from_args(args)
    main(config)
model/train.py
ADDED
@@ -0,0 +1,73 @@
import argparse
import collections
import torch
import numpy as np
import data_loader.data_loaders as module_data
import model.loss as module_loss
import model.metric as module_metric
import model.model as module_arch
from parse_config import ConfigParser
from trainer import Trainer
from utils import prepare_device


# fix random seeds for reproducibility
SEED = 123
torch.manual_seed(SEED)
torch.backends.cudnn.deterministic = True
torch.backends.cudnn.benchmark = False
np.random.seed(SEED)

def main(config):
    logger = config.get_logger('train')

    # setup data_loader instances
    data_loader = config.init_obj('data_loader', module_data)
    valid_data_loader = data_loader.split_validation()

    # build model architecture, then print to console
    model = config.init_obj('arch', module_arch)
    logger.info(model)

    # prepare for (multi-device) GPU training
    device, device_ids = prepare_device(config['n_gpu'])
    model = model.to(device)
    if len(device_ids) > 1:
        model = torch.nn.DataParallel(model, device_ids=device_ids)

    # get function handles of loss and metrics
    criterion = getattr(module_loss, config['loss'])
    metrics = [getattr(module_metric, met) for met in config['metrics']]

    # build optimizer, learning rate scheduler. delete every line containing lr_scheduler to disable the scheduler
    trainable_params = filter(lambda p: p.requires_grad, model.parameters())
    optimizer = config.init_obj('optimizer', torch.optim, trainable_params)
    lr_scheduler = config.init_obj('lr_scheduler', torch.optim.lr_scheduler, optimizer)

    trainer = Trainer(model, criterion, metrics, optimizer,
                      config=config,
                      device=device,
                      data_loader=data_loader,
                      valid_data_loader=valid_data_loader,
                      lr_scheduler=lr_scheduler)

    trainer.train()


if __name__ == '__main__':
    args = argparse.ArgumentParser(description='PyTorch Template')
    args.add_argument('-c', '--config', default=None, type=str,
                      help='config file path (default: None)')
    args.add_argument('-r', '--resume', default=None, type=str,
                      help='path to latest checkpoint (default: None)')
    args.add_argument('-d', '--device', default=None, type=str,
                      help='indices of GPUs to enable (default: all)')

    # custom cli options to modify configuration from default values given in json file.
    CustomArgs = collections.namedtuple('CustomArgs', 'flags type target')
    options = [
        CustomArgs(['--lr', '--learning_rate'], type=float, target='optimizer;args;lr'),
        CustomArgs(['--bs', '--batch_size'], type=int, target='data_loader;args;batch_size')
    ]
    config = ConfigParser.from_args(args, options)
    main(config)
model/trainer/__init__.py
ADDED
@@ -0,0 +1 @@
from .trainer import *
model/trainer/trainer.py
ADDED
@@ -0,0 +1,110 @@
import numpy as np
import torch
from torchvision.utils import make_grid
from base import BaseTrainer
from utils import inf_loop, MetricTracker


class Trainer(BaseTrainer):
    """
    Trainer class
    """
    def __init__(self, model, criterion, metric_ftns, optimizer, config, device,
                 data_loader, valid_data_loader=None, lr_scheduler=None, len_epoch=None):
        super().__init__(model, criterion, metric_ftns, optimizer, config)
        self.config = config
        self.device = device
        self.data_loader = data_loader
        if len_epoch is None:
            # epoch-based training
            self.len_epoch = len(self.data_loader)
        else:
            # iteration-based training
            self.data_loader = inf_loop(data_loader)
            self.len_epoch = len_epoch
        self.valid_data_loader = valid_data_loader
        self.do_validation = self.valid_data_loader is not None
        self.lr_scheduler = lr_scheduler
        self.log_step = int(np.sqrt(data_loader.batch_size))

        self.train_metrics = MetricTracker('loss', *[m.__name__ for m in self.metric_ftns], writer=self.writer)
        self.valid_metrics = MetricTracker('loss', *[m.__name__ for m in self.metric_ftns], writer=self.writer)

    def _train_epoch(self, epoch):
        """
        Training logic for an epoch

        :param epoch: Integer, current training epoch.
        :return: A log that contains average loss and metric in this epoch.
        """
        self.model.train()
        self.train_metrics.reset()
        for batch_idx, (data, target) in enumerate(self.data_loader):
            data, target = data.to(self.device), target.to(self.device)

            self.optimizer.zero_grad()
            output = self.model(data)
            loss = self.criterion(output, target)
            loss.backward()
            self.optimizer.step()

            self.writer.set_step((epoch - 1) * self.len_epoch + batch_idx)
            self.train_metrics.update('loss', loss.item())
            for met in self.metric_ftns:
                self.train_metrics.update(met.__name__, met(output, target))

            if batch_idx % self.log_step == 0:
                self.logger.debug('Train Epoch: {} {} Loss: {:.6f}'.format(
                    epoch,
                    self._progress(batch_idx),
                    loss.item()))
                self.writer.add_image('input', make_grid(data.cpu(), nrow=8, normalize=True))

            if batch_idx == self.len_epoch:
                break
        log = self.train_metrics.result()

        if self.do_validation:
            val_log = self._valid_epoch(epoch)
            log.update(**{'val_'+k : v for k, v in val_log.items()})

        if self.lr_scheduler is not None:
            self.lr_scheduler.step()
        return log

    def _valid_epoch(self, epoch):
        """
        Validate after training an epoch

        :param epoch: Integer, current training epoch.
        :return: A log that contains information about validation
        """
        self.model.eval()
        self.valid_metrics.reset()
        with torch.no_grad():
            for batch_idx, (data, target) in enumerate(self.valid_data_loader):
                data, target = data.to(self.device), target.to(self.device)

                output = self.model(data)
                loss = self.criterion(output, target)

                self.writer.set_step((epoch - 1) * len(self.valid_data_loader) + batch_idx, 'valid')
                self.valid_metrics.update('loss', loss.item())
                for met in self.metric_ftns:
                    self.valid_metrics.update(met.__name__, met(output, target))
                self.writer.add_image('input', make_grid(data.cpu(), nrow=8, normalize=True))

        # add histogram of model parameters to the tensorboard
        for name, p in self.model.named_parameters():
            self.writer.add_histogram(name, p, bins='auto')
        return self.valid_metrics.result()

    def _progress(self, batch_idx):
        base = '[{}/{} ({:.0f}%)]'
        if hasattr(self.data_loader, 'n_samples'):
            current = batch_idx * self.data_loader.batch_size
            total = self.data_loader.n_samples
        else:
            current = batch_idx
            total = self.len_epoch
        return base.format(current, total, 100.0 * current / total)
model/utils/__init__.py
ADDED
@@ -0,0 +1 @@
from .util import *
model/utils/util.py
ADDED
@@ -0,0 +1,67 @@
import json
import torch
import pandas as pd
from pathlib import Path
from itertools import repeat
from collections import OrderedDict


def ensure_dir(dirname):
    dirname = Path(dirname)
    if not dirname.is_dir():
        dirname.mkdir(parents=True, exist_ok=False)

def read_json(fname):
    fname = Path(fname)
    with fname.open('rt') as handle:
        return json.load(handle, object_hook=OrderedDict)

def write_json(content, fname):
    fname = Path(fname)
    with fname.open('wt') as handle:
        json.dump(content, handle, indent=4, sort_keys=False)

def inf_loop(data_loader):
    ''' wrapper function for endless data loader. '''
    for loader in repeat(data_loader):
        yield from loader

def prepare_device(n_gpu_use):
    """
    setup GPU device if available. get gpu device indices which are used for DataParallel
    """
    n_gpu = torch.cuda.device_count()
    if n_gpu_use > 0 and n_gpu == 0:
        print("Warning: There\'s no GPU available on this machine,"
              "training will be performed on CPU.")
        n_gpu_use = 0
    if n_gpu_use > n_gpu:
        print(f"Warning: The number of GPU\'s configured to use is {n_gpu_use}, but only {n_gpu} are "
              "available on this machine.")
        n_gpu_use = n_gpu
    device = torch.device('cuda:0' if n_gpu_use > 0 else 'cpu')
    list_ids = list(range(n_gpu_use))
    return device, list_ids

class MetricTracker:
    def __init__(self, *keys, writer=None):
        self.writer = writer
        self._data = pd.DataFrame(index=keys, columns=['total', 'counts', 'average'])
        self.reset()

    def reset(self):
        for col in self._data.columns:
            self._data[col].values[:] = 0

    def update(self, key, value, n=1):
        if self.writer is not None:
            self.writer.add_scalar(key, value)
        self._data.total[key] += value * n
        self._data.counts[key] += n
        self._data.average[key] = self._data.total[key] / self._data.counts[key]

    def avg(self, key):
        return self._data.average[key]

    def result(self):
        return dict(self._data.average)