Classifier radtorch.classifier

ImageClassifier

Image classifier class. It acts as a wrapper that trains a selected model (either a pytorch neural network or an sklearn classifier) on a dataset, which can be either a radtorch ImageDataset or VolumeDataset.

Optionally, specific training and validation pytorch dataloaders may be supplied manually instead of a radtorch dataset object.

Training a PyTorch Neural Network

If the model to train is a pytorch neural network, in addition to the model, ImageClassifier expects a pytorch criterion/loss function, a pytorch optimizer and an optional pytorch scheduler.

Training an sklearn classifier

If the model to be trained is an sklearn classifier, ImageClassifier performs feature extraction followed by training the model. Accordingly, ImageClassifier expects a model architecture for the feature extraction process.

Creating multiple classifier objects using the same model/neural network object

To ensure consistent results, a new instance of the pytorch model/neural network object MUST be instantiated for every classifier object.

For example, do this:

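import radtorch

# ds is assumed to be an existing radtorch ImageDataset or VolumeDataset.
# The first classifier gets its own freshly created model.
M = radtorch.model.Model(model_arch='vgg16', in_channels=1, out_classes=2)
clf = radtorch.classifier.ImageClassifier(model=M, dataset=ds)
clf.fit(epochs=3)

# The second classifier gets a new model instance, not the one trained above.
M = radtorch.model.Model(model_arch='vgg16', in_channels=1, out_classes=2)
clf2 = radtorch.classifier.ImageClassifier(model=M, dataset=ds)
clf2.fit(epochs=3)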

and do NOT do this:

M = radtorch.model.Model(model_arch='vgg16', in_channels=1, out_classes=2)

clf = radtorch.classifier.ImageClassifier(model=M, dataset=ds)
clf.fit(epochs=3)

# Wrong: M has already been trained by clf, so clf2 does not start from a fresh model.
clf2 = radtorch.classifier.ImageClassifier(model=M, dataset=ds)
clf2.fit(epochs=3)
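
The reason for this rule can be illustrated without radtorch. The sketch below uses a hypothetical TinyModel class and train function (neither is part of radtorch) to show that training mutates a model object in place, so a second run on the same object starts from already-trained weights rather than from a fresh initialization.

```python
class TinyModel:
    """Hypothetical stand-in for a neural network: a single weight."""

    def __init__(self):
        self.weight = 0.0  # "fresh" initial state


def train(model, epochs, lr=0.5, target=1.0):
    """Toy training loop that moves the weight toward `target` in place."""
    for _ in range(epochs):
        model.weight += lr * (target - model.weight)
    return model.weight


# Shared object: the second run starts from the first run's weights.
shared = TinyModel()
first_shared = train(shared, epochs=3)
second_shared = train(shared, epochs=3)

# Fresh objects: both runs start from the same initial state.
first_fresh = train(TinyModel(), epochs=3)
second_fresh = train(TinyModel(), epochs=3)

print(first_shared, second_shared)  # results differ between the two runs
print(first_fresh, second_fresh)    # results are identical
```

With fresh instances the two runs are directly comparable; with a shared instance the second run is effectively a continuation of the first.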

Parameters:

Name Type Description Default
name str

Name to be given to the image classifier. If none is provided, the current date and time are used to create a generic classifier name.

None
model pytorch neural network or sklearn classifier

Model to be trained.

None
dataset ImageDataset or VolumeDataset

ImageDataset or VolumeDataset to be used for training.

None
dataloader_train pytorch dataloader

Optional Training pytorch DataLoader

None
dataloader_valid pytorch dataloader

Optional Validation pytorch DataLoader

None
device str

Device to be used for training.

'auto'
feature_extractor_arch str

Architecture of the model to be used for feature extraction when training an sklearn classifier. See https://pytorch.org/vision/0.8/models.html#classification

'vgg16'
criterion pytorch loss function

Loss function to be used during training a pytorch neural network.

None
optimizer pytorch optimizer

Optimizer to be used during training a pytorch neural network.

None
scheduler pytorch scheduler

Scheduler to be used during training a pytorch neural network.

None
scheduler_metric str

When using the ReduceLROnPlateau pytorch scheduler, a target loss or accuracy to monitor must be provided. Options: 'train_loss', 'train_accuracy', 'valid_loss', 'valid_accuracy'.

None
use_checkpoint bool or str

Path (str) to a saved checkpoint from which to continue training. When a checkpoint is used, training resumes from the saved checkpoint and runs to the new/specified epoch number.

False
random_seed int

Random seed (default=100)

0

Using manual pytorch dataloaders

If manually created dataloaders are used, set dataset to None.

Selecting device for training

Auto mode automatically detects whether a GPU is available and, if so, uses it for training.
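
The 'auto' behaviour can be pictured with the following minimal sketch. The helper name resolve_device is hypothetical and this is not radtorch's actual implementation; it only assumes that 'auto' means "use CUDA when pytorch reports it as available, otherwise fall back to the CPU". torch.cuda.is_available() is the standard pytorch availability check.

```python
def resolve_device(device="auto"):
    """Return a concrete device string for a requested device (hypothetical helper)."""
    if device != "auto":
        return device  # an explicit choice such as 'cpu' or 'cuda' is kept as-is
    try:
        import torch
    except ImportError:
        return "cpu"  # without pytorch there is no GPU backend to query
    return "cuda" if torch.cuda.is_available() else "cpu"


print(resolve_device("auto"))
```

Passing an explicit device string bypasses detection entirely, which is useful when a GPU is present but training should stay on the CPU.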

Attributes:

Name Type Description
type str

Type of the classifier model to be trained.

train_losses list

List of train losses recorded. Length = Number of epochs.

valid_losses list

List of validation losses recorded. Length = Number of epochs.

train_acc list

List of train accuracies recorded. Length = Number of epochs.

valid_acc list

List of validation accuracies recorded. Length = Number of epochs.

valid_loss_min float

Minimum Validation Loss to save checkpoint.

best_model pytorch neural network or sklearn classifier

Best trained model with lowest Validation Loss in case of pytorch neural networks or the trained classifier for sklearn classifiers.

train_logs pandas dataframe

Table/Dataframe with all train/validation losses.

Methods

fit(self, **kwargs)

Trains the ImageClassifier object.

Training a Model

All of the following arguments, except auto_save_ckpt and random_seed, apply only when training a pytorch neural network model. Training an sklearn classifier does not need them.

Parameters:

Name Type Description Default
epochs int

Number of training epochs (default: 20).

required
valid bool

True to perform validation after each train step. False to only train on training dataset without validation. (default: True)

required
print_every int

Number of epochs between printed results. (default: 1)

required
target_valid_loss float / string

Validation loss target that triggers automatic saving of the trained model. If 'lowest' is used, then at every epoch where the validation loss is lower than the minimum so far, the new best model is saved in a checkpoint. A manually specified float minimum loss is also accepted. (default: 'lowest')

required
auto_save_ckpt bool

Automatically save checkpoints. If True, a checkpoint file is saved. Please see below. (default: False)

required
random_seed int

Random seed. (default: 100)

required
verbose int

Verbose level during training. Options: 0, 1, 2. (default: 2)

required
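
How target_valid_loss might drive saving can be sketched as follows. This is an illustration under stated assumptions, not radtorch's code: the function name should_save is hypothetical, and it assumes that a float target means "save once the validation loss drops below that value".

```python
def should_save(valid_loss, valid_loss_min, target_valid_loss="lowest"):
    """Decide whether the current epoch's model should be saved (hypothetical helper).

    valid_loss:        validation loss of the current epoch
    valid_loss_min:    lowest validation loss seen so far
    target_valid_loss: 'lowest' or a manually specified float
    """
    if target_valid_loss == "lowest":
        # Save only when this epoch beats the best loss so far.
        return valid_loss < valid_loss_min
    # Assumption: a float acts as a fixed threshold.
    return valid_loss < float(target_valid_loss)


valid_loss_min = float("inf")
saved_epochs = []
for epoch, loss in enumerate([0.9, 0.7, 0.8, 0.6], start=1):
    if should_save(loss, valid_loss_min):
        saved_epochs.append(epoch)
        valid_loss_min = loss  # track the new best loss

print(saved_epochs)  # [1, 2, 4]
```

In 'lowest' mode the third epoch is skipped because its loss (0.8) does not improve on the best loss so far (0.7).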

Using auto_save_ckpt

If auto_save_ckpt is True, a new checkpoint is saved whenever the training target is achieved.

The checkpoint file name = ImageClassifier.name+'_epoch_'+str(current_epoch)+'.checkpoint'

e.g. If the checkpoint is saved at epoch 10 for an ImageClassifier named clf, the checkpoint file will be named: clf_epoch_10.checkpoint
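
The naming rule amounts to a simple string concatenation. A minimal sketch follows; the helper name checkpoint_filename is hypothetical, and the underscore separators are taken from the clf example file name.

```python
def checkpoint_filename(name, current_epoch):
    """Build a checkpoint file name from a classifier name and an epoch number."""
    return name + "_epoch_" + str(current_epoch) + ".checkpoint"


print(checkpoint_filename("clf", 10))  # clf_epoch_10.checkpoint
```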

Resuming training using a saved checkpoint file

When using a saved checkpoint to resume training, new instances of both the Model/pytorch model and the ImageClassifier should be created.

For example:

# Initial training

M = radtorch.model.Model(model_arch='vgg16', in_channels=1, out_classes=2)
clf = radtorch.classifier.ImageClassifier(M, dataset)
clf.fit(auto_save_ckpt=True, epochs=5, verbose=3) # Saves the best checkpoint automatically

# Resume training with new Model and ImageClassifier instances

M = radtorch.model.Model(model_arch='vgg16', in_channels=1, out_classes=2)
clf2 = radtorch.classifier.ImageClassifier(M, dataset, use_checkpoint='saved_ckpt.checkpoint')
clf2.fit(auto_save_ckpt=False, epochs=5, verbose=3)

Checkpoint Files

A checkpoint file is a dictionary of:

  1. timestamp: Timestamp when saving the checkpoint.

  2. type: ImageClassifier type.

  3. classifier: ImageClassifier object.

  4. epochs: Total epochs specified on initial training.

  5. current_epoch: Current epoch when checkpoint was saved.

  6. optimizer_state_dict: Current state of Optimizer.

  7. train_losses: List of train losses recorded.

  8. valid_losses: List of validation losses recorded.

  9. valid_loss_min: Min Validation loss - See above.
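
The structure listed above can be pictured as a plain Python dictionary. The sketch below is illustrative only: all values are placeholders, the classifier entry is a string rather than a real ImageClassifier object, and pickle is used purely to keep the example self-contained (the serialization that radtorch actually uses is not stated here).

```python
import io
import pickle
from datetime import datetime

# Placeholder checkpoint with the nine documented keys.
checkpoint = {
    "timestamp": datetime.now(),
    "type": "placeholder-type",            # ImageClassifier type (placeholder value)
    "classifier": "placeholder-object",    # would be the ImageClassifier object
    "epochs": 5,                           # total epochs specified on initial training
    "current_epoch": 3,                    # epoch at which the checkpoint was saved
    "optimizer_state_dict": {},            # would hold the optimizer state
    "train_losses": [0.9, 0.7, 0.6],
    "valid_losses": [1.0, 0.8, 0.75],
    "valid_loss_min": 0.75,
}

# Round-trip through an in-memory buffer instead of a file on disk.
buffer = io.BytesIO()
pickle.dump(checkpoint, buffer)
buffer.seek(0)
restored = pickle.load(buffer)

print(sorted(restored))
```

Inspecting current_epoch and valid_loss_min in a loaded checkpoint is a quick way to confirm which training state is about to be resumed.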

info(self)

Displays all information about the ImageClassifier object.
