A Modeling Approach Integrating Data Envelopment Analysis and the K-means Algorithm for Performance Evaluation and Clustering of Decision Making Units

Authors

  • Mohsen Rostamy-Malkhalifeh, Science and Research Branch, Islamic Azad University, Tehran, Iran
  • Reza Ghasempour Feremi

Abstract

In increasingly competitive and resource-constrained environments, performance evaluation plays a vital role in improving organizational efficiency and supporting data-driven decision-making. Traditional Data Envelopment Analysis (DEA), although widely adopted, often fails to account for structural heterogeneity among Decision-Making Units (DMUs), which leads to biased comparisons and limited managerial insight. This study addresses these limitations with a hybrid analytical framework that integrates K-Means clustering, DEA modeling, and performance gap analysis to improve the accuracy, fairness, and practicality of efficiency assessments. The methodology first applies the unsupervised K-Means algorithm to segment DMUs into operationally homogeneous groups based on their input and output characteristics. DEA is then applied independently within each cluster using the input-oriented CCR model under constant returns to scale (CRS). To improve discriminatory power, super-efficiency evaluation is used to rank the efficient units and cross-efficiency evaluation is used to assess how each unit performs under the weights chosen by its peers. Finally, a performance gap analysis quantifies input-reduction and output-enhancement targets for inefficient units, offering concrete guidelines for improvement at both the unit and system levels. The empirical results demonstrate that clustering prior to DEA significantly improves the validity of benchmarking by restricting peer comparisons to structurally similar groups. Super-efficiency scores reveal meaningful variation among efficient DMUs, while cross-efficiency results provide a more consensus-based view of performance. Moreover, the gap analysis indicates the potential for a 14% reduction in workforce, an 11% reduction in operational costs, and up to an 18% increase in total outputs through more effective resource utilization. These findings underscore the managerial relevance of the proposed model and its potential application across sectors where performance and resource optimization are critical.
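The abstract names its models without stating them. For orientation, the block below gives the standard textbook forms of the input-oriented CCR model, its super-efficiency variant, the cross-efficiency score, and the projection targets used in gap analysis, each restricted to one K-Means cluster. All notation is an assumption introduced here rather than taken from the paper: $C_k$ is the set of DMUs in cluster $k$, $x_{ij}$ and $y_{rj}$ are input $i$ and output $r$ of DMU $j$, $o$ is the unit under evaluation, and $s_i^{-}$, $s_r^{+}$ are slacks. The authors' exact formulation, including their choice of secondary goal for cross-efficiency, may differ.

```latex
% Input-oriented CCR model (envelopment form, CRS) for DMU o within cluster C_k
\begin{align*}
\theta_o^{*} = \min_{\theta,\,\lambda}\quad & \theta \\
\text{s.t.}\quad
  & \sum_{j \in C_k} \lambda_j x_{ij} \le \theta\, x_{io}, && i = 1,\dots,m, \\
  & \sum_{j \in C_k} \lambda_j y_{rj} \ge y_{ro},          && r = 1,\dots,s, \\
  & \lambda_j \ge 0,                                       && j \in C_k .
\end{align*}

% Super-efficiency: the same program with DMU o removed from its own reference set,
% so that efficient units can obtain scores above 1 and be ranked
\begin{align*}
\theta_o^{\mathrm{SE}} = \min_{\theta,\,\lambda}\quad & \theta \\
\text{s.t.}\quad
  & \sum_{j \in C_k \setminus \{o\}} \lambda_j x_{ij} \le \theta\, x_{io}, && i = 1,\dots,m, \\
  & \sum_{j \in C_k \setminus \{o\}} \lambda_j y_{rj} \ge y_{ro},          && r = 1,\dots,s, \\
  & \lambda_j \ge 0,                                                       && j \in C_k \setminus \{o\} .
\end{align*}

% Cross-efficiency: DMU j rated with the optimal multipliers (u^d, v^d) of DMU d,
% then averaged over all raters d in the cluster
\begin{align*}
E_{dj} = \frac{\sum_{r=1}^{s} u_r^{d}\, y_{rj}}{\sum_{i=1}^{m} v_i^{d}\, x_{ij}},
\qquad
\bar{E}_j = \frac{1}{|C_k|} \sum_{d \in C_k} E_{dj} .
\end{align*}

% Gap analysis: projection of an inefficient DMU o onto the cluster frontier,
% using the optimal score and the optimal slacks
\begin{align*}
\hat{x}_{io} = \theta_o^{*}\, x_{io} - s_i^{-*},
\qquad
\hat{y}_{ro} = y_{ro} + s_r^{+*} .
\end{align*}
```

Under these definitions, the gap for input $i$ of unit $o$ is $x_{io} - \hat{x}_{io}$ and the gap for output $r$ is $\hat{y}_{ro} - y_{ro}$. Aggregate figures such as the percentages reported above would then presumably be obtained by summing these gaps over the inefficient units.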

Published

2026-02-13

Issue

Vol. 20, No. 2 (2026)