Summary
In classification, we often have to aggregate the results of several experiments. While summarizing by averaging performance indicators is a valuable effort to standardize the evaluation procedure, it has no theoretical justification and it breaks the intrinsic relationships between the summarized indicators, which leads to interpretation inconsistencies. In this paper, we present a theoretical approach to summarizing the performances of multiple experiments that preserves the relationships between performance indicators. In addition, we give formulas and an algorithm to calculate the summarized performances. Our first showcase was that of
CDNET (a video database for testing change detection algorithms) (see 3 hereafter).
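To make the inconsistency concrete, the following minimal sketch averages precision, recall, and F1 over two experiments and checks whether the averaged F1 is still the harmonic mean of the averaged precision and recall. The two (precision, recall) pairs are hypothetical values chosen for illustration, not results from CDNET, and the code only illustrates the problem with arithmetic averaging; it is not the summarization method proposed in this paper.

```python
def f1(precision, recall):
    """F1 score: harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


def mean(values):
    """Plain arithmetic mean."""
    return sum(values) / len(values)


# Hypothetical (precision, recall) pairs for two experiments (illustrative only).
experiments = [(0.9, 0.1), (0.5, 0.5)]

# Summarize each indicator independently by arithmetic averaging.
mean_precision = mean([p for p, _ in experiments])
mean_recall = mean([r for _, r in experiments])
mean_f1 = mean([f1(p, r) for p, r in experiments])

# F1 recomputed from the averaged precision and recall.
f1_of_means = f1(mean_precision, mean_recall)

print(f"averaged F1:           {mean_f1:.2f}")      # 0.34
print(f"F1 of averaged P, R:   {f1_of_means:.2f}")  # 0.42
```

The two values differ, so the averaged indicators no longer satisfy the relationship that holds within each individual experiment.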