A distributed load clustering algorithm based on quantile radius dynamic K-means

LIU Ji’ang; LIU Youbo; CHENG Mingchang; YU Lina

Cite this article: LIU Ji’ang, LIU Youbo, CHENG Mingchang, et al. A distributed load clustering algorithm based on quantile radius dynamic K-means[J]. Power System Protection and Control, 2019, 47(24): 15-22.

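LIU Ji’ang¹, LIU Youbo¹, CHENG Mingchang², YU Lina³
(1. School of Electrical Engineering and Information, Sichuan University, Chengdu 610065, Sichuan, China; 2. School of Statistics, Southwestern University of Finance and Economics, Chengdu 611130, Sichuan, China; 3. China Three Gorges New Energy Limited Company Southwest Branch, Chengdu 610023, Sichuan, China)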

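Abstract:

When applied to the clustering of power load curves, the traditional K-means algorithm is sensitive to its initial values and requires the number of clusters to be specified in advance. To address these defects, a dynamic K-means algorithm based on the quantile radius is applied to the cluster analysis of daily load curves and is adapted to a distributed setting to improve computational efficiency. The algorithm combines two ideas: local clustering and global clustering, taken from distributed clustering, and the practice in hierarchical K-means of representing a cluster by the center points obtained from repeated K-means runs at a fixed k. The repeated K-means runs are assigned to different sub-sites, and k is varied from run to run. Then, starting from the geometric characteristics of a cluster, the concept of the quantile radius is introduced: a quantile of the distances between the sample points and the center of their cluster is taken as that cluster's radius. At the main site, the distances between cluster centers are compared with the cluster radii, and the clusters are filtered and merged to obtain new clusters, so that the number of clusters is identified relatively quickly and new initial centers and clustering results are obtained. Finally, the daily load data of 606 users in a certain region over one month are used to verify the effectiveness of the algorithm for the cluster analysis of power load curves.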
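The pipeline described in the abstract can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the nearest-rank quantile, the default `q=0.9`, the use of Euclidean distance, and the merge rule (two clusters merge when the distance between their centers is no larger than the smaller of their two quantile radii) are all hypothetical choices, since the abstract does not give them. The paper's filtering step is not reproduced here.

```python
import math
import random


def mean(points):
    """Coordinate-wise mean of a non-empty list of equal-length tuples."""
    n = len(points)
    return tuple(sum(col) / n for col in zip(*points))


def kmeans(points, k, rng, iters=50):
    """Plain Lloyd's K-means; returns (center, members) pairs for non-empty clusters."""
    centers = rng.sample(points, k)
    for _ in range(iters):
        groups = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)), key=lambda i: math.dist(p, centers[i]))
            groups[nearest].append(p)
        new_centers = [mean(g) if g else c for g, c in zip(groups, centers)]
        if new_centers == centers:
            break
        centers = new_centers
    return [(c, g) for c, g in zip(centers, groups) if g]


def quantile_radius(center, members, q=0.9):
    """Nearest-rank q-quantile of the member-to-center distances (assumed definition)."""
    d = sorted(math.dist(center, p) for p in members)
    idx = min(len(d) - 1, max(0, math.ceil(q * len(d)) - 1))
    return d[idx]


def local_clustering(points, k, q=0.9, seed=0):
    """Sub-site step: run K-means with this site's k, return (center, radius, size) summaries."""
    rng = random.Random(seed)
    return [(c, quantile_radius(c, g, q), len(g)) for c, g in kmeans(points, k, rng)]


def merge_at_main_site(summaries):
    """Main-site step: merge clusters whose center distance <= min of their radii (assumed rule)."""
    parent = list(range(len(summaries)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i, (ci, ri, _) in enumerate(summaries):
        for j in range(i):
            cj, rj, _ = summaries[j]
            if math.dist(ci, cj) <= min(ri, rj):
                parent[find(i)] = find(j)

    groups = {}
    for i, (c, _, n) in enumerate(summaries):
        groups.setdefault(find(i), []).append((c, n))

    merged = []
    for group in groups.values():
        total = sum(n for _, n in group)
        dim = len(group[0][0])
        # Size-weighted average of the merged centers.
        merged.append(tuple(sum(c[d] * n for c, n in group) / total for d in range(dim)))
    return merged


if __name__ == "__main__":
    # Synthetic 2-D stand-in for load curves: three blobs, sharded over three sub-sites.
    rng = random.Random(0)
    data = [(rng.gauss(cx, 0.3), rng.gauss(cy, 0.3))
            for cx, cy in [(0, 0), (5, 5), (0, 5)] for _ in range(60)]
    summaries = []
    for site, k in enumerate([3, 4, 5]):  # each sub-site uses a different k
        summaries += local_clustering(data[site::3], k, seed=site)
    centers = merge_at_main_site(summaries)
    print(len(summaries), "local clusters merged into", len(centers))
```

The number of merged centers serves as the estimate of the cluster count, and the centers themselves can seed a final K-means run. Real daily load curves would be longer vectors (for example, one value per sampling interval), which the code handles unchanged because it only relies on `math.dist`.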

Keywords: power big data; cluster analysis; load curve clustering; quantile radius; distributed clustering

DOI: 10.19783/j.cnki.pspc.190144

Funding: Supported by the National Key Research and Development Program of China (2017YFE0112600)

A distributed load clustering algorithm based on quantile radius dynamic K-means

LIU Ji’ang¹,LIU Youbo¹,CHENG Mingchang²,YU Lina³

(1. School of Electrical Engineering and Information, Sichuan University, Chengdu 610065, China; 2. School of Statistics, Southwestern University of Finance and Economics, Chengdu 611130, China; 3. China Three Gorges New Energy Limited Company Southwest Branch, Chengdu 610023, China)

Abstract:

The traditional K-means algorithm has two defects when used to cluster power load curves: it is sensitive to initial values, and the number of clusters must be given in advance. To address them, a dynamic K-means algorithm based on the quantile radius is applied to the cluster analysis of daily load curves, and a distributed improvement is made to optimize computational efficiency. The algorithm combines two ideas: local clustering and global clustering in distributed clustering, and, from hierarchical K-means, representing a cluster by the center points obtained from multiple K-means runs with k set to a fixed value. The multiple K-means runs are assigned to different sub-sites, and the value of k is changed for each run. Then, based on the geometric characteristics of the clusters, the concept of the quantile radius is introduced: a quantile of the distances between the sample points and the cluster center represents the radius of the cluster. At the main site, the distances between cluster centers are compared with the quantile radii of the clusters, and the clusters are filtered and merged to obtain new clusters, so that the number of clusters can be identified quickly and good initial centers and clustering results are obtained. Finally, the daily load data of 606 users in a certain area over a certain month are taken as the research object, and the effectiveness of the algorithm in the cluster analysis of power load curves is verified. This work is supported by the National Key Research and Development Program of China (No. 2017YFE0112600).
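The quantile radius can also be written compactly. The notation here is an assumption introduced for illustration: $C_j$ is the set of load curves assigned to cluster $j$, $c_j$ is its center, $Q_q$ is the $q$-quantile of a finite set of numbers, and the Euclidean norm is assumed as the distance. The merge test in the second line is one plausible reading of "comparing the distance between centers with the radius"; the abstract does not state the exact rule.

```latex
\begin{align}
  r_j(q) &= Q_q\bigl(\{\, \lVert x - c_j \rVert_2 : x \in C_j \,\}\bigr),
  \qquad 0 < q \le 1, \\
  \text{merge } C_i \text{ and } C_j \quad &\text{if} \quad
  \lVert c_i - c_j \rVert_2 \le \min\bigl(r_i(q),\, r_j(q)\bigr).
\end{align}
```

Taking a quantile rather than the maximum distance makes the radius less sensitive to a few outlying curves in a cluster.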

Key words: power big data; cluster analysis; load curves clustering; quantile radius; distributed clustering
