The name of the server, StAR, stands for Statistical Analysis of receiver operating characteristic (ROC) Curves. It was developed to meet a frequent need in our laboratory: assessing the statistical significance of the observed difference between two binary classifiers. After an exhaustive search on this topic, we found that few tools were freely available. We started by using the software RockIt, which has two main drawbacks: 1) it runs only under Windows, and 2) when many classifiers need to be compared in a pairwise fashion, each parameter file has to be generated independently, which is tedious. The statistical package R was another alternative, but although it is powerful and flexible, it is a general-purpose environment with many options and capabilities unrelated to this task. In addition, it is not easy to use for inexperienced users without a computer programming background, which is the case for many undergraduate students.
The StAR server computes ROC curves and several related statistics to assess the significance of the differences in performance between classifiers. The server and its standalone version for Linux are focused on this single problem and are easy to use.
The server relies on a non-parametric test for the difference between the areas under the ROC curves (AUCs) that accounts for the correlation between the ROC curves [1]. This test takes advantage of the equality between the Mann-Whitney U-statistic for comparing distributions and the AUC when computed by the trapezoidal rule. A chi-square statistic is built and used to compute a p-value for the measured difference between the AUCs. In addition, a 95% confidence interval is computed for the difference between each pair of AUCs. If you use this tool in your research, please acknowledge it by citing the following article:
Vergara, I.A., Norambuena, T., Ferrada, E., Slater, A.W. and Melo, F. (2008) StAR: a simple tool for the statistical comparison of ROC curves. BMC Bioinformatics 9, 265.
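As a rough illustration of the non-parametric AUC comparison described above, the following Python sketch compares two classifiers scored on the same samples. It is not the StAR source code: the function name delong_compare, its arguments and the toy data are assumptions made for this example, and it assumes that higher scores indicate the positive class and that each class has at least two samples. The AUC is computed in its Mann-Whitney form, and the chi-square statistic has one degree of freedom because a single pair of curves is compared.

```python
import math
import numpy as np


def delong_compare(labels, scores_a, scores_b):
    """Compare two correlated AUCs (hypothetical helper, not StAR's code)."""
    labels = np.asarray(labels).astype(bool)
    scores = np.vstack([scores_a, scores_b]).astype(float)  # shape (2, N)
    pos = scores[:, labels]    # scores of positive samples, shape (2, m)
    neg = scores[:, ~labels]   # scores of negative samples, shape (2, n)
    m, n = pos.shape[1], neg.shape[1]
    if m < 2 or n < 2:
        raise ValueError("need at least two positives and two negatives")

    # Mann-Whitney kernel: 1 if positive > negative, 0.5 on ties, 0 otherwise.
    diff = pos[:, :, None] - neg[:, None, :]          # shape (2, m, n)
    psi = (diff > 0) + 0.5 * (diff == 0)

    v10 = psi.mean(axis=2)     # per-positive components, shape (2, m)
    v01 = psi.mean(axis=1)     # per-negative components, shape (2, n)
    aucs = v10.mean(axis=1)    # AUC of each classifier

    # Covariance matrix of the two AUC estimates (np.cov uses N - 1).
    cov = np.cov(v10) / m + np.cov(v01) / n
    delta = aucs[0] - aucs[1]
    var = cov[0, 0] + cov[1, 1] - 2.0 * cov[0, 1]
    if var <= 0:
        raise ValueError("zero variance: the AUC difference cannot be tested")

    chi2 = delta ** 2 / var
    # Survival function of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    half_width = 1.96 * math.sqrt(var)   # normal approximation, 95% level
    return {
        "auc_a": float(aucs[0]),
        "auc_b": float(aucs[1]),
        "delta": float(delta),
        "chi2": float(chi2),
        "p_value": p_value,
        "ci95": (float(delta - half_width), float(delta + half_width)),
    }


# Toy data (made up for illustration): 5 positives followed by 5 negatives.
example_labels = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
example_a = [0.9, 0.8, 0.7, 0.6, 0.3, 0.5, 0.4, 0.2, 0.1, 0.05]
example_b = [0.6, 0.9, 0.4, 0.3, 0.5, 0.7, 0.2, 0.8, 0.1, 0.35]
result = delong_compare(example_labels, example_a, example_b)
print(result)
```

For this toy data the two AUCs are 0.92 and 0.64. With so few samples the p-value and confidence interval are only indicative, since the test relies on a large-sample approximation.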
We expect that the StAR server and its standalone version will be useful to a wide range of scientists working with (i.e. developing and/or testing) binary classifiers in many different areas.
1. DeLong, E.R., DeLong, D.M. and Clarke-Pearson, D.L. (1988) Comparing the Areas Under Two or More Correlated Receiver Operating Characteristic Curves: A Nonparametric Approach. Biometrics 44, 837-845.