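attempts: the number of times the K-Means algorithm is run; each run can give a different result, and the labelling with the best compactness is the one returned;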
centers: a cv::Mat that receives the final cluster centers (mean points); mat.cols = feature length, mat.rows=
The test code is as follows:
#include "opencv.hpp"
#include <cstdio>
#include <string>
#include <vector>
#include <opencv2/opencv.hpp>
#include "common.hpp"

////////////////////////// K-Means //////////////////////////
int test_opencv_kmeans()
{
    const std::string image_path{ "E:/GitCode/NN_Test/data/images/digit/handwriting_0_and_1/" };
    // Load one image as grayscale to learn the image size.
    cv::Mat tmp = cv::imread(image_path + "0_1.jpg", 0);
    CHECK(tmp.data != nullptr && tmp.channels() == 1);
    const int samples_number{ 80 }, every_class_number{ 20 }, categories_number{ samples_number / every_class_number };

    // One row per sample, one column per pixel.
    cv::Mat samples_data(samples_number, tmp.rows * tmp.cols, CV_32FC1);
    // True digit of each sample; kept for reference, not passed to cv::kmeans below.
    cv::Mat labels(samples_number, 1, CV_32FC1);
    float* p1 = reinterpret_cast<float*>(labels.data);

    for (int i = 1; i <= every_class_number; ++i) {
        static const std::vector<std::string> digit{ "0_", "1_", "2_", "3_" };
        CHECK(static_cast<int>(digit.size()) == categories_number);
        static const std::string suffix{ ".jpg" };

        for (int j = 0; j < categories_number; ++j) {
            std::string image_name = image_path + digit[j] + std::to_string(i) + suffix;
            cv::Mat image = cv::imread(image_name, 0);
            CHECK(!image.empty() && image.channels() == 1);

            // Convert to float and flatten the image into a single row.
            image.convertTo(image, CV_32FC1);
            image = image.reshape(0, 1);

            // Copy the flattened image into its row of the sample matrix.
            tmp = samples_data.rowRange((i - 1) * categories_number + j, (i - 1) * categories_number + j + 1);
            image.copyTo(tmp);

            p1[(i - 1) * categories_number + j] = static_cast<float>(j);
        }
    }

    const int K{ 4 }, attempts{ 100 };
    // Stop after 100 iterations or once the requested accuracy (0.01) is reached.
    const cv::TermCriteria term_criteria = cv::TermCriteria(cv::TermCriteria::EPS + cv::TermCriteria::COUNT, 100, 0.01);
    cv::Mat labels_, centers_;
    double value = cv::kmeans(samples_data, K, labels_, term_criteria, attempts, cv::KMEANS_RANDOM_CENTERS, centers_);
    fprintf(stdout, "K = %d, attempts = %d, iter count = %d, compactness measure = %f\n",
        K, attempts, term_criteria.maxCount, value);

    // Print the cluster index of every sample, one row per group of categories_number samples.
    CHECK(labels_.rows == samples_number);
    int* p2 = reinterpret_cast<int*>(labels_.data);
    for (int i = 1; i <= every_class_number; ++i) {
        for (int j = 0; j < categories_number; ++j) {
            fprintf(stdout, " %d ", *p2++);
        }
        fprintf(stdout, "\n");
    }

    return 0;
}
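The test fills a matrix with the true digit of every sample but only prints the cluster indices. Cluster indices are arbitrary, so they cannot be compared with the true digits directly. One possible check is cluster purity, where each cluster votes for its majority class. The following is a minimal sketch under assumptions: `cluster_purity` is a hypothetical helper, not an OpenCV function, and it expects both label sets copied into plain `std::vector<int>`.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: fraction of samples that belong to the majority class
// of the cluster they were assigned to (1.0 means every cluster is pure).
// truth[i] must lie in [0, classes) and cluster[i] in [0, K).
double cluster_purity(const std::vector<int>& truth, const std::vector<int>& cluster,
                      int classes, int K)
{
    assert(truth.size() == cluster.size() && !truth.empty());
    // votes[k][c] = number of samples of class c that landed in cluster k.
    std::vector<std::vector<int>> votes(K, std::vector<int>(classes, 0));
    for (std::size_t i = 0; i < truth.size(); ++i)
        ++votes[cluster[i]][truth[i]];

    int matched = 0;
    for (const auto& v : votes)
        matched += *std::max_element(v.begin(), v.end());
    return static_cast<double>(matched) / truth.size();
}
```

A purity close to 1.0 would suggest that the clusters line up with the digit classes; it says nothing about which cluster index corresponds to which digit.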
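In general, the larger attempts and the iteration count in TermCriteria are, the better the clustering result tends to be, at the cost of a longer running time.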
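To illustrate why more attempts can only help the reported compactness, here is a minimal sketch of the "best of several attempts" idea on 1-D data. It is not OpenCV's implementation: `kmeans_once` and `kmeans_best_of` are hypothetical names, the initial centers are drawn at random from the samples, and a fixed iteration count stands in for the termination criteria.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <random>
#include <vector>

// One Lloyd-style K-Means run on 1-D data with random initial centers.
// Returns the compactness measured in the last assignment step:
// the sum of squared distances from each sample to its assigned center.
double kmeans_once(const std::vector<double>& data, int K, int max_iter,
                   std::mt19937& rng, std::vector<int>& labels)
{
    assert(K > 0 && max_iter > 0 && data.size() >= static_cast<std::size_t>(K));
    std::uniform_int_distribution<std::size_t> pick(0, data.size() - 1);
    std::vector<double> centers(K);
    for (auto& c : centers) c = data[pick(rng)];

    labels.assign(data.size(), 0);
    double compactness = 0.0;
    for (int iter = 0; iter < max_iter; ++iter) {
        // Assignment step: attach each sample to its nearest center.
        compactness = 0.0;
        for (std::size_t i = 0; i < data.size(); ++i) {
            double best = std::numeric_limits<double>::max();
            for (int k = 0; k < K; ++k) {
                const double d = (data[i] - centers[k]) * (data[i] - centers[k]);
                if (d < best) { best = d; labels[i] = k; }
            }
            compactness += best;
        }
        // Update step: move each non-empty center to the mean of its samples.
        std::vector<double> sum(K, 0.0);
        std::vector<int> count(K, 0);
        for (std::size_t i = 0; i < data.size(); ++i) {
            sum[labels[i]] += data[i];
            ++count[labels[i]];
        }
        for (int k = 0; k < K; ++k)
            if (count[k] > 0) centers[k] = sum[k] / count[k];
    }
    return compactness;
}

// Runs kmeans_once `attempts` times and keeps the labelling with the lowest compactness.
double kmeans_best_of(const std::vector<double>& data, int K, int max_iter,
                      int attempts, unsigned seed, std::vector<int>& best_labels)
{
    assert(attempts > 0);
    std::mt19937 rng(seed);
    double best = std::numeric_limits<double>::max();
    std::vector<int> labels;
    for (int a = 0; a < attempts; ++a) {
        const double c = kmeans_once(data, K, max_iter, rng, labels);
        if (c < best) { best = c; best_labels = labels; }
    }
    return best;
}
```

With the same seed, the first attempt is identical whatever the value of `attempts`, so the result for a larger `attempts` is never worse than for a smaller one. The price is a running time that grows linearly with the number of attempts.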
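bestLabels: a cv::Mat of shape (number of samples, 1), i.e. mat.cols = 1 and mat.rows = number of samples; it is the output of the K-Means algorithm and gives the label (cluster index) that each sample is assigned to;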
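flags: the method used to initialize the centers; three are currently supported: KMEANS_RANDOM_CENTERS, KMEANS_PP_CENTERS and KMEANS_USE_INITIAL_LABELS;
data: a cv::Mat in which each row is one sample (its feature vector), i.e. mat.cols = feature length and mat.rows = number of samples; only the float data type is supported;
criteria: a TermCriteria object giving the condition that stops the iteration; it can specify a maximum number of iterations, a desired accuracy, or both at once;
K: the number of clusters to split the data into.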
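Of the three initialization flags listed above, KMEANS_PP_CENTERS selects the k-means++ seeding of Arthur and Vassilvitskii. The sketch below shows the general idea of that seeding on 1-D data: the first center is drawn uniformly, and each later center is drawn with probability proportional to its squared distance from the nearest center chosen so far. It is an illustration under assumptions, not OpenCV's code: `kmeanspp_seed` is a hypothetical name, and the data is assumed to contain at least K distinct values.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <limits>
#include <random>
#include <vector>

// Hypothetical helper: returns the indices of K samples to use as initial centers,
// chosen in the k-means++ manner. Assumes data holds at least K distinct values.
std::vector<std::size_t> kmeanspp_seed(const std::vector<double>& data, int K, std::mt19937& rng)
{
    assert(K > 0 && data.size() >= static_cast<std::size_t>(K));
    std::vector<std::size_t> chosen;
    std::uniform_int_distribution<std::size_t> first(0, data.size() - 1);
    chosen.push_back(first(rng)); // first center: uniformly at random

    // dist2[i] = squared distance from sample i to its nearest chosen center.
    std::vector<double> dist2(data.size(), std::numeric_limits<double>::max());
    while (chosen.size() < static_cast<std::size_t>(K)) {
        const double c = data[chosen.back()];
        for (std::size_t i = 0; i < data.size(); ++i)
            dist2[i] = std::min(dist2[i], (data[i] - c) * (data[i] - c));
        // Next center: sampled with probability proportional to dist2,
        // so samples far from every chosen center are favoured.
        std::discrete_distribution<std::size_t> next(dist2.begin(), dist2.end());
        chosen.push_back(next(rng));
    }
    return chosen;
}
```

Because an already chosen sample has a squared distance of zero, it carries zero weight and is not drawn again, which tends to spread the initial centers across the data compared with purely random selection.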