Deep Neural Network with Caffe models

Load Caffe framework models.

In this tutorial you will learn how to use DNN module for image classification by using GoogLeNet trained network from Caffe model zoo.

Sources:

Contents

BVLC GoogLeNet

First, we download GoogLeNet model files:

dirDNN = fullfile(mexopencv.root(), 'test', 'dnn', 'GoogLeNet');
modelLabels = fullfile(dirDNN, 'synset_words.txt');
modelTxt = fullfile(dirDNN, 'deploy.prototxt');
modelBin = fullfile(dirDNN, 'bvlc_googlenet.caffemodel');  % 51 MB file
files = {modelLabels, modelTxt, modelBin};
urls = {
    'https://cdn.rawgit.com/opencv/opencv/3.4.0/samples/data/dnn/synset_words.txt';
    'https://cdn.rawgit.com/opencv/opencv/3.4.0/samples/data/dnn/bvlc_googlenet.prototxt';
    'http://dl.caffe.berkeleyvision.org/bvlc_googlenet.caffemodel';
};
if ~isdir(dirDNN), mkdir(dirDNN); end
for i=1:numel(files)
    if exist(files{i}, 'file') ~= 2
        disp('Downloading...')
        urlwrite(urls{i}, files{i});
    end
end

Load class labels

if ~mexopencv.isOctave()
    fid = fopen(modelLabels, 'rt');
    C = textscan(fid, '%*s %[^\n]');
    fclose(fid);
    labels = C{1};
else
    %HACK: textscan is buggy and unreliable in Octave!
    labels = textread(modelLabels, '%s', 'Delimiter','\n');
    labels = regexprep(labels, '^\w+\s*', '', 'once');
end
fprintf('%d classes\n', numel(labels));
1000 classes

Create and initialize network from Caffe model

net = cv.Net('Caffe', modelTxt, modelBin);
assert(~net.empty(), 'Cant load network');
if false
    net.setPreferableTarget('OpenCL');
end

Prepare blob from input image

Read input image (BGR channel order)

img = cv.imread(fullfile(mexopencv.root(), 'test', 'space_shuttle.jpg'), ...
    'Color',true, 'FlipChannels',false);

we resize and convert the image to 4-dimensional blob (so-called batch) with 1x3x224x224 shape, because GoogLeNet accepts only 224x224 BGR-images. we also subtract the mean pixel value of the training dataset ILSVRC_2012 (B: 104.0069879317889, G: 116.66876761696767, R: 122.6789143406786)

if true
    blob = cv.Net.blobFromImages(img, ...
        'Size',[224 224], 'Mean',[104 117 123], 'SwapRB',false);
else
    % NOTE: blobFromImages does crop/resize to preserve aspect ratio of image
    blob = cv.resize(img, [224 224]);                % Size
    blob = single(blob);                             % CV_32F
    blob = bsxfun(@minus, blob, cat(3,104,117,123)); % Mean (BGR)
    blob = permute(blob, [4 3 1 2]);                 % HWCN -> NCHW
end

Set the network input

In deploy.prototxt the network input blob named as "data". Other blobs labeled as "name_of_layer.name_of_layer_output".

net.setInput(blob, 'data');

Make forward pass and compute output

During the forward pass output of each network layer is computed, but in this example we need output from "prob" layer only.

tic
p = net.forward('prob');  % 1x1000 vector
toc
Elapsed time is 0.360979 seconds.

Gather output of "prob" layer

We determine the best class by taking the output of "prob" layer, which contain probabilities for each of 1000 ILSVRC2012 image classes, and finding the index of element with maximal value in this one. This index correspond to the class of the image.

[~,ord] = sort(p, 'descend');  % ordered by maximum probability

Show predictions

if ~mexopencv.isOctave()
    %HACK: TABLE not implemented in Octave
    t = table(labels(:), p(:), 'VariableNames',{'Label','Probability'});
    t = sortrows(t, 'Probability', 'descend');
    disp('Top 5 predictions');
    disp(t(1:5,:));
end

subplot(3,1,1:2), imshow(flip(img,3))
title('space_shuttle.jpg', 'Interpreter','none')
subplot(3,1,3), barh(1:5, p(ord(1:5))*100)
set(gca, 'YTickLabel',labels(ord(1:5)), 'YDir','reverse')
axis([0 100 0 6]), grid on
title('Top-5 Predictions'), xlabel('Probability (%)')
Top 5 predictions
                  Label                  Probability
    _________________________________    ___________
    'space shuttle'                         0.99993 
    'airliner'                           2.3053e-05 
    'submarine, pigboat, sub, U-boat'    2.1932e-05 
    'bullet train, bullet'               1.0001e-05 
    'missile'                             4.962e-06