加载Caffe框架模型
介绍
在本教程中,您将学习如何使用OpenCv_dnn模块进行图像分类,方法是使用来自Caffe模型动物园的 GoogLeNet训练网络。
我们将在下面的图片中展示这个例子的结果。
布兰航天飞机
源代码
我们将使用示例应用程序的片段,可以在这里下载。
#include <opencv2/dnn.hpp>
#include <opencv2/imgproc.hpp>
#include <opencv2/highgui.hpp>
#include <opencv2/core/utils/trace.hpp>
using namespace cv;
using namespace cv::dnn;
#include <fstream>
#include <iostream>
#include <cstdlib>
using namespace std;
/* Find best class for the blob (i. e. class with maximal probability) */
static void getMaxClass(const Mat &probBlob, int *classId, double *classProb)
{
Mat probMat = probBlob.reshape(1, 1); //reshape the blob to 1x1000 matrix
Point classNumber;
minMaxLoc(probMat, NULL, classProb, NULL, &classNumber);
*classId = classNumber.x;
}
static std::vector<String> readClassNames(const char *filename = "synset_words.txt")
{
std::vector<String> classNames;
std::ifstream fp(filename);
if (!fp.is_open())
{
std::cerr << "File with classes labels not found: " << filename << std::endl;
exit(-1);
}
std::string name;
while (!fp.eof())
{
std::getline(fp, name);
if (name.length())
classNames.push_back( name.substr(name.find(' ')+1) );
}
fp.close();
return classNames;
}
int main(int argc, char **argv)
{
CV_TRACE_FUNCTION();
String modelTxt = "bvlc_googlenet.prototxt";
String modelBin = "bvlc_googlenet.caffemodel";
String imageFile = (argc > 1) ? argv[1] : "space_shuttle.jpg";
Net net;
try {
net = dnn::readNetFromCaffe(modelTxt, modelBin);
}
catch (cv::Exception& e) {
std::cerr << "Exception: " << e.what() << std::endl;
if (net.empty())
{
std::cerr << "Can't load network by using the following files: " << std::endl;
std::cerr << "prototxt: " << modelTxt << std::endl;
std::cerr << "caffemodel: " << modelBin << std::endl;
std::cerr << "bvlc_googlenet.caffemodel can be downloaded here:" << std::endl;
std::cerr << "http://dl.caffe.berkeleyvision.org/bvlc_googlenet.caffemodel" << std::endl;
exit(-1);
}
}
Mat img = imread(imageFile);
if (img.empty())
{
std::cerr << "Can't read image from the file: " << imageFile << std::endl;
exit(-1);
}
//GoogLeNet accepts only 224x224 BGR-images
Mat inputBlob = blobFromImage(img, 1.0f, Size(224, 224),
Scalar(104, 117, 123), false); //Convert Mat to batch of images
Mat prob;
cv::TickMeter t;
for (int i = 0; i < 10; i++)
{
CV_TRACE_REGION("forward");
net.setInput(inputBlob, "data"); //set the network input
t.start();
prob = net.forward("prob"); //compute output
t.stop();
}
int classId;
double classProb;
getMaxClass(prob, &classId, &classProb);//find the best class
std::vector<String> classNames = readClassNames();
std::cout << "Best class: #" << classId << " '" << classNames.at(classId) << "'" << std::endl;
std::cout << "Probability: " << classProb * 100 << "%" << std::endl;
std::cout << "Time: " << (double)t.getTimeMilli() / t.getCounter() << " ms (average from " << t.getCounter() << " iterations)" << std::endl;
return 0;
} //main
说明
- 首先,下载GoogLeNet模型文件:bvlc_googlenet.prototxt和bvlc_googlenet.caffemodel还需要使用名称为ILSVRC2012类的文件:synset_words.txt。将这些文件放入此程序示例的工作目录中。
- 使用.prototxt和.caffemodel文件的路径读取并初始化网络
net = dnn :: readNetFromCaffe(modelTxt,modelBin);
- 检查网络是否已成功读取
if (net.empty())
{
std::cerr << "Can't load network by using the following files: " << std::endl;
std::cerr << "prototxt: " << modelTxt << std::endl;
std::cerr << "caffemodel: " << modelBin << std::endl;
std::cerr << "bvlc_googlenet.caffemodel can be downloaded here:" << std::endl;
std::cerr << "http://dl.caffe.berkeleyvision.org/bvlc_googlenet.caffemodel" << std::endl;
exit(-1);
}
- 读取输入图像并转换为Blob,可由GoogleNet接受
Mat img = imread(imageFile);
if (img.empty())
{
std::cerr << "Can't read image from the file: " << imageFile << std::endl;
exit(-1);
}
//GoogLeNet accepts only 224x224 BGR-images
Mat inputBlob = blobFromImage(img, 1.0f, Size(224, 224),
Scalar(104, 117, 123), false); //Convert Mat to batch of images
首先,我们调整图像的大小并改变其频道序列顺序。
现在图像实际上是一个具有224x224x3形状的三维数组。
接下来,我们通过使用特殊的cv :: dnn :: blobFromImages构造函数将图像转换为具有1x3x224x224形状的4维blob(所谓批处理)。
- 将blob传递到网络
net.setInput(inputBlob, "data"); //set the network input
在bvlc_googlenet.prototxt中,网络输入blob命名为“data”,因此这个blob在opencv_dnn API中标记为“.data”。
其他标记为“name_of_layer.name_of_layer_output”的blob。
- Make forward pass
prob = net.forward("prob"); //compute output
在计算每个网络层的正向传输输出期间,但在本例中,我们仅需要“prob”层的输出。
- 确定最好的class
int classId;
double classProb;
getMaxClass(prob, &classId, &classProb);//find the best class
我们把包含1000个ILSVRC2012图像类别的概率的“prob”层的输出放到prob
blob上。并在此找到具有最大值的元素的索引。该索引对应于图像的类。
- 打印结果
std::vector<String> classNames = readClassNames();
std::cout << "Best class: #" << classId << " '" << classNames.at(classId) << "'" << std::endl;
std::cout << "Probability: " << classProb * 100 << "%" << std::endl;
对于我们的形象我们得到:
Best class: #812 'space shuttle'
Probability: 99.6378%