Feature Extractor radtorch.extractor
FeatureExtractor
Feature Extractor
Performs feature extraction from images using a PyTorch model pretrained on ImageNet.
After running the extraction with the .run() method, the features can be accessed through the attributes listed below.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
model_arch | str | Model architecture to be used for feature extraction. | required |
dataset | ImageDataset | | required |
subset | str | The subset of the dataset to extract features from. Options: 'train', 'valid', 'test'. | 'train' |
device | str | Device to be used for feature extraction. Options: 'auto', 'cuda', 'cpu'. 'auto' automatically detects whether a GPU is present and uses it. | 'auto' |
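
The `device` options above can be read as a small resolution rule. The sketch below is not radtorch's implementation: `resolve_device` and its `cuda_available` argument are hypothetical names used only for illustration, and in practice the flag would presumably come from something like `torch.cuda.is_available()`.

```python
def resolve_device(device: str = "auto", cuda_available: bool = False) -> str:
    """Map the documented `device` options to a concrete device name.

    Hypothetical helper for illustration; not part of radtorch.
    """
    options = ("auto", "cuda", "cpu")
    if device not in options:
        raise ValueError(f"device must be one of {options}, got {device!r}")
    if device == "auto":
        # 'auto' prefers the GPU when one is present, otherwise the CPU.
        return "cuda" if cuda_available else "cpu"
    # An explicit 'cuda' or 'cpu' is returned unchanged.
    return device


print(resolve_device("auto", cuda_available=True))   # cuda
print(resolve_device("auto", cuda_available=False))  # cpu
print(resolve_device("cpu", cuda_available=True))    # cpu
```
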
Attributes:
Name | Type | Description |
---|---|---|
loader | pytorch dataloader object | Training PyTorch DataLoader. |
table | pandas dataframe | Table of images to be used for feature extraction. |
model | pytorch neural network | Instance of the PyTorch model to be used for feature extraction. |
features | pandas dataframe | Table of extracted features. |
feature_table | pandas dataframe | Table of extracted features and the corresponding uid for each image instance. |
feature_names | list | Names of the extracted features. |
Examples:
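import radtorch
import albumentations as A

# Build the dataset; images in the 'train' subset are resized to 64x64
ds = radtorch.data.ImageDataset('data/PROTOTYPE/FILE/', transforms={'train': A.Compose([A.Resize(64,64)])})
# Set up a VGG16-based extractor on the default 'train' subset, then run it
ext = radtorch.extractor.FeatureExtractor('vgg16', dataset=ds)
ext.run()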
Methods
hybrid_table(self, sklearn_ready=False, label_id=True)
Use this method to create pandas dataframes of features and labels that can be used directly for training with scikit-learn.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
sklearn_ready | bool | True returns a tuple of an extracted-features dataframe and a labels dataframe. False returns a table of features, uid, path, label and label_id. | False |
label_id | bool | True returns the label ids (integer) instead of the label strings. | True |
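
To make the two return shapes concrete, here is a minimal sketch that mimics the described behaviour on a stand-in table. It does not call radtorch: `split_for_sklearn`, the feature column names (`vgg16_0`, `vgg16_1`) and all values are hypothetical, and the real method may differ in its details.

```python
import pandas as pd

# Stand-in for the table described for sklearn_ready=False:
# feature columns plus uid, path, label and label_id (all values are made up).
table = pd.DataFrame({
    "uid": ["a1", "b2", "c3", "d4"],
    "path": ["img/a1.png", "img/b2.png", "img/c3.png", "img/d4.png"],
    "label": ["normal", "abnormal", "normal", "abnormal"],
    "label_id": [0, 1, 0, 1],
    "vgg16_0": [0.12, 0.80, 0.05, 0.91],
    "vgg16_1": [0.33, 0.10, 0.41, 0.07],
})


def split_for_sklearn(table: pd.DataFrame, label_id: bool = True):
    """Return (features, labels), mimicking the documented sklearn_ready=True output."""
    meta_cols = ["uid", "path", "label", "label_id"]
    # Everything that is not metadata is treated as a feature column.
    features = table.drop(columns=meta_cols)
    # label_id=True selects the integer ids, otherwise the label strings.
    labels = table[["label_id"]] if label_id else table[["label"]]
    return features, labels


X, y = split_for_sklearn(table)
print(X.shape, y.shape)  # (4, 2) (4, 1)
```

With a real extractor, the pair would presumably come from `ext.hybrid_table(sklearn_ready=True)` and could then be passed to a scikit-learn estimator's `fit` method, flattening the labels first (for example with `y.values.ravel()`) if the estimator expects a 1-D target.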
num_features(self)
Returns the expected number of features to be extracted.
run(self)
Runs the feature extraction process.