Feature Extractor radtorch.extractor

`FeatureExtractor`

Feature Extractor performs feature extraction from images using a PyTorch model pretrained on ImageNet. After the extraction process has been run through the `.run()` method, the features can be accessed using the attributes below.

Parameters:

Name	Type	Description	Default
`model_arch`	`str`	Model architecture to be used for feature extraction.	required
`dataset`	`ImageDataset`	`ImageDataset` to be used for feature extraction.	required
`subset`	`str`	The subset of the dataset to extract features from. Options: 'train', 'valid', 'test'.	`'train'`
`device`	`str`	Device to be used for feature extraction. 'auto' automatically detects whether a GPU is present and uses it. Options: 'auto', 'cuda', 'cpu'.	`'auto'`
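
As a rough illustration of the `'auto'` option, the sketch below shows one way a device string could be resolved. `resolve_device` and its `cuda_available` argument are hypothetical names used only for this example and are not part of radtorch; the library's actual logic may differ.

```python
def resolve_device(device="auto", cuda_available=None):
    """Map 'auto', 'cuda', or 'cpu' to a concrete device string (hypothetical helper)."""
    if device not in ("auto", "cuda", "cpu"):
        raise ValueError(f"Unsupported device: {device!r}")
    if device != "auto":
        return device  # an explicit choice is passed through unchanged
    if cuda_available is None:
        # Ask PyTorch whether a GPU is visible; fall back to CPU if torch is missing.
        try:
            import torch
            cuda_available = torch.cuda.is_available()
        except ImportError:
            cuda_available = False
    return "cuda" if cuda_available else "cpu"


print(resolve_device("auto"))
```

The `cuda_available` override exists only so the behaviour can be checked on machines without a GPU.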

Attributes:

Name	Type	Description
`loader`	`pytorch dataloader object`	PyTorch DataLoader used for feature extraction.
`table`	`pandas dataframe`	Table of images to be used for feature extraction.
`model`	`pytorch neural network`	Instance of the PyTorch model to be used for feature extraction.
`features`	`pandas dataframe`	Table of extracted features.
`feature_table`	`pandas dataframe`	Table of extracted features and the corresponding uid for each image instance.
`feature_names`	`list`	Names of the extracted features.
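
To make the relationship between `features`, `feature_table` and `feature_names` concrete, here is a small sketch built from synthetic pandas data. The feature naming scheme (`f_0`, `f_1`, ...), the `uid` column name and the id values are assumptions made for illustration; the real tables produced by radtorch may be laid out differently.

```python
import numpy as np
import pandas as pd

# Stand-in data: 3 images with 4 features each (a real extractor yields many more columns).
rng = np.random.default_rng(0)
feature_names = [f"f_{i}" for i in range(4)]  # assumed naming scheme
features = pd.DataFrame(rng.random((3, 4)), columns=feature_names)

# feature_table: the same features plus an identifier for each image instance.
feature_table = features.copy()
feature_table.insert(0, "uid", ["img_a", "img_b", "img_c"])  # assumed column name and ids

print(features.shape)
print(feature_table.columns.tolist())
```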

Examples:

import radtorch
import albumentations as A

ds = radtorch.data.ImageDataset('data/PROTOTYPE/FILE/', transforms={'train': A.Compose([A.Resize(64,64)])})

ext = radtorch.extractor.FeatureExtractor('vgg16', dataset=ds)
ext.run()

Methods

`hybrid_table(self, sklearn_ready=False, label_id=True)`

Use this method to create pandas dataframes of features and labels that can be used directly for training with scikit-learn.

Parameters:

Name	Type	Description	Default
`sklearn_ready`	`bool`	If True, returns a tuple of the extracted features dataframe and the labels dataframe. If False, returns a table of features, uid, path, label and label_id.	`False`
`label_id`	`bool`	If True, returns the label ids (integers) instead of the label strings.	`True`
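
The sketch below shows how a features/labels pair of the kind described for `sklearn_ready=True` might be passed to a scikit-learn estimator. The dataframes are synthetic stand-ins, and the column names (`f_0` ... `f_4`, `label_id`) and the choice of `LogisticRegression` are assumptions for illustration only.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression

# Stand-ins for the (features, labels) pair; all values are synthetic.
rng = np.random.default_rng(0)
X = pd.DataFrame(rng.random((20, 5)), columns=[f"f_{i}" for i in range(5)])
y = pd.DataFrame({"label_id": [0, 1] * 10})  # integer labels, as with label_id=True

# scikit-learn estimators expect a 1-D target, so pass the single label column.
clf = LogisticRegression(max_iter=1000)
clf.fit(X, y["label_id"])
preds = clf.predict(X)
print(preds.shape)
```

With a real extractor, the stand-ins would presumably be replaced by the tuple returned from `hybrid_table(sklearn_ready=True)`.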

`num_features(self)`

Returns the expected number of features to be extracted.

`run(self)`

Runs the feature extraction process.
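
For readers unfamiliar with the general technique, the following is a minimal sketch of feature extraction with a PyTorch model: the classification head is replaced with an identity layer and batches are passed through the network without gradients. It is not radtorch's implementation; the tiny untrained network and random images are stand-ins for a pretrained architecture and a real dataset.

```python
import pandas as pd
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset

# Tiny stand-in network; a real run would use an ImageNet-pretrained architecture.
model = nn.Sequential(
    nn.Flatten(),
    nn.Linear(3 * 8 * 8, 16),  # "backbone" producing 16 features
    nn.ReLU(),
    nn.Linear(16, 2),          # classification head
)
model[-1] = nn.Identity()      # drop the head so the model outputs features
model.eval()

# Synthetic images standing in for the dataset's DataLoader.
images = torch.rand(10, 3, 8, 8)
loader = DataLoader(TensorDataset(images), batch_size=4)

chunks = []
with torch.no_grad():          # inference only, no gradients needed
    for (batch,) in loader:
        chunks.append(model(batch))

# One row per image, one column per extracted feature.
features = pd.DataFrame(torch.cat(chunks).numpy())
print(features.shape)
```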