Feature Extractor radtorch.extractor

FeatureExtractor

FeatureExtractor extracts features from images using a PyTorch model pretrained on ImageNet. After running the extraction with the .run() method, the features can be accessed through the attributes listed below.

Parameters:

model_arch (str, required): Model architecture to be used for feature extraction.

dataset (ImageDataset, required): ImageDataset to extract features from.

subset (str, default 'train'): The subset of the dataset to extract features from. Options: 'train', 'valid', 'test'.

device (str, default 'auto'): Device to be used for feature extraction. 'auto' automatically detects whether a GPU is present and uses it if so. Options: 'auto', 'cuda', 'cpu'.
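The behaviour of device='auto' can be pictured with the sketch below. It is an illustration only, not radtorch's own code: the helper resolve_device and its cuda_available argument are hypothetical names introduced here, and the only real library call is torch.cuda.is_available(), which is used when PyTorch is installed.

```python
def resolve_device(device="auto", cuda_available=None):
    """Map 'auto', 'cuda' or 'cpu' to a concrete device string.

    Hypothetical helper for illustration; not part of radtorch.
    """
    if device not in ("auto", "cuda", "cpu"):
        raise ValueError(f"Unknown device option: {device!r}")

    # An explicit choice is passed through unchanged.
    if device != "auto":
        return device

    # For 'auto', ask PyTorch whether a GPU is usable, unless the caller
    # already supplied the answer (handy for testing without a GPU).
    if cuda_available is None:
        try:
            import torch
            cuda_available = torch.cuda.is_available()
        except ImportError:
            cuda_available = False

    return "cuda" if cuda_available else "cpu"
```

Passing device='cuda' or device='cpu' explicitly skips the detection step, which can be useful when a GPU is present but should be left free for other work.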

Attributes:

loader (pytorch dataloader object): Training pytorch DataLoader.

table (pandas dataframe): The table of images to be used for feature extraction.

model (pytorch neural network): Instance of the pytorch model to be used for feature extraction.

features (pandas dataframe): Table of extracted features.

feature_table (pandas dataframe): Table of extracted features and the corresponding uid for each image instance.

feature_names (list): Names of the extracted features.

Examples:

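import radtorch
import albumentations as A

# Build the dataset, resizing training images to 64x64
ds = radtorch.data.ImageDataset('data/PROTOTYPE/FILE/', transforms={'train': A.Compose([A.Resize(64,64)])})

# Extract features from the default 'train' subset with a pretrained VGG16
ext = radtorch.extractor.FeatureExtractor('vgg16', dataset=ds)
ext.run()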

Methods

hybrid_table(self, sklearn_ready=False, label_id=True)

Use this method to create pandas dataframes of features and labels that can be used directly for training with scikit-learn.

Parameters:

sklearn_ready (bool, default False): True returns a tuple of the extracted features dataframe and the labels dataframe. False returns a table of features, uid, path, label and label_id.

label_id (bool, default True): True returns the label ids (integers) instead of the label strings.
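To make the two return shapes concrete, the sketch below builds a small stand-in for the full table (the sklearn_ready=False form) and splits it into a features dataframe and a labels dataframe, roughly what sklearn_ready=True is described as returning. Everything here is assumed for illustration: the data values, the feature column names f_0 and f_1, and the helper split_for_sklearn are hypothetical and do not come from radtorch.

```python
import pandas as pd

# Hypothetical stand-in for the table of features, uid, path, label and label_id.
# The feature column names ("f_0", "f_1") and all values are made up.
hybrid = pd.DataFrame({
    "uid": ["a1", "b2", "c3"],
    "path": ["img/a1.png", "img/b2.png", "img/c3.png"],
    "label": ["normal", "abnormal", "normal"],
    "label_id": [1, 0, 1],
    "f_0": [0.12, 0.98, 0.05],
    "f_1": [1.40, 0.33, 1.21],
})
feature_names = ["f_0", "f_1"]


def split_for_sklearn(table, feature_names, label_id=True):
    """Split a full table into (features, labels) dataframes.

    label_id=True selects the integer label ids; False selects the label strings.
    """
    label_col = "label_id" if label_id else "label"
    X = table[feature_names]   # features dataframe
    y = table[[label_col]]     # single-column labels dataframe
    return X, y


X, y = split_for_sklearn(hybrid, feature_names)
X_str, y_str = split_for_sklearn(hybrid, feature_names, label_id=False)
```

The resulting X and y can be handed to a scikit-learn estimator's fit method; most estimators expect a one-dimensional target, so a single-column labels dataframe may need flattening first, for example with y.values.ravel().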

num_features(self)

Returns the expected number of features to be extracted.

run(self)

Runs the feature extraction process.
