In PROPER, the scoring output of a ranking classifier is translated into binary class decisions by applying a spectrum of cutoffs. Usually no single cutoff can optimally satisfy all possible performance criteria, so cutoff choice involves a trade-off between different measures. Typically, the trade-off between a pair of measures (e.g. precision versus recall) is visualized as a cutoff-parameterized curve in the plane spanned by the two measures. Many machine learning and statistical learning packages are available, e.g. Weka [3] and SLEP [1], but none of them offers standardized, comprehensive optimization, comparison, and performance evaluation of biological classifiers. Because no cutoff is optimal according to all possible performance criteria, PROPER allows plotting cutoff-parameterized performance curves for any pair of its more than 13 performance measures, and can also combine three measures in a three-dimensional graph in which each facet represents a standard performance curve (Fig. 2).
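The idea of a cutoff-parameterized curve can be sketched as follows. This is an illustrative Python fragment, not PROPER's actual implementation: the function name `pr_curve` and its interface are assumptions, chosen only to show how sweeping a cutoff over classifier scores traces a precision/recall curve.

```python
def pr_curve(scores, labels):
    """Sweep every observed score as a cutoff and return
    (cutoff, precision, recall) points (illustrative sketch,
    not PROPER's internals).

    scores -- ranking-classifier outputs (higher = more positive)
    labels -- true classes, 1 for positive, 0 for negative
    """
    points = []
    for cutoff in sorted(set(scores)):
        # Predict positive when the score reaches the cutoff.
        tp = sum(1 for s, y in zip(scores, labels) if s >= cutoff and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= cutoff and y == 0)
        fn = sum(1 for s, y in zip(scores, labels) if s < cutoff and y == 1)
        precision = tp / (tp + fp) if tp + fp else 1.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        points.append((cutoff, precision, recall))
    return points
```

Plotting the precision and recall columns of the returned points against each other yields the standard cutoff-parameterized trade-off curve described above; substituting any other pair of measures gives the corresponding curve.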
Calculated performance measures used to comprehensively evaluate the performance of classifiers include:
$$ T = True\ Predictions = True\ Positives + True\ Negatives $$
$$ F = False\ Predictions = False\ Positives + False\ Negatives $$
$$ TPR = Sensitivity = Recall = \frac{True\ Positives}{True\ Positives + False\ Negatives} $$
$$ FNR = False\ Negative\ Rate = \frac{False\ Negatives}{True\ Positives + False\ Negatives} $$
$$ FPR = False\ Positive\ Rate = Fallout = \frac{False\ Positives}{True\ Negatives + False\ Positives} $$
$$ TNR = True\ Negative\ Rate = Specificity = \frac{True\ Negatives}{True\ Negatives + False\ Positives} $$
$$ PPV = Positive\ Predictive\ Value = \frac{True\ Positives}{True\ Positives + False\ Positives} $$
$$ NPV = Negative\ Predictive\ Value = \frac{True\ Negatives}{True\ Negatives + False\ Negatives} $$
$$ RPP = Rate\ of\ Positive\ Prediction = \frac{True\ Positives + False\ Positives}{True\ Predictions + False\ Predictions} $$
$$ RNP = Rate\ of\ Negative\ Prediction = \frac{True\ Negatives + False\ Negatives}{True\ Predictions + False\ Predictions} $$
$$ ACC = Accuracy\ of\ Classifier = \frac{True\ Positives + True\ Negatives}{True\ Predictions + False\ Predictions} $$
$$ MCC = \frac{\left( True\ Positives \times True\ Negatives\right) - \left( False\ Positives \times False\ Negatives\right)}{\sqrt{\left( True\ Positives + False\ Positives\right)\left( True\ Positives + False\ Negatives\right)\left( True\ Negatives + False\ Positives\right)\left( True\ Negatives + False\ Negatives\right)}} $$
$$ F\ Measure = F_{score} = 2 \times \frac{Precision \times Recall}{Precision + Recall} $$
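The measures above all derive from the four confusion-matrix counts. The following Python sketch shows each formula as code; the function name and return structure are illustrative assumptions, not PROPER's internals.

```python
def performance_measures(tp, tn, fp, fn):
    """Compute the listed measures from confusion-matrix counts
    (illustrative sketch, not PROPER's implementation).

    tp, tn, fp, fn -- true/false positive/negative counts
    """
    t = tp + tn  # T, true predictions
    f = fp + fn  # F, false predictions
    return {
        "TPR": tp / (tp + fn),        # sensitivity, recall
        "FNR": fn / (tp + fn),        # false negative rate
        "FPR": fp / (tn + fp),        # fallout
        "TNR": tn / (tn + fp),        # specificity
        "PPV": tp / (tp + fp),        # precision
        "NPV": tn / (tn + fn),
        "RPP": (tp + fp) / (t + f),   # rate of positive prediction
        "RNP": (tn + fn) / (t + f),   # rate of negative prediction
        "ACC": t / (t + f),           # accuracy
        "MCC": (tp * tn - fp * fn)
               / ((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)) ** 0.5,
        # 2*PPV*TPR / (PPV + TPR) simplifies to the form below:
        "F": 2 * tp / (2 * tp + fp + fn),
    }
```

Note that the F-score here uses the algebraically equivalent form 2TP / (2TP + FP + FN), which avoids computing precision and recall separately.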
Several illustrative examples below demonstrate different features of PROPER. The example presented in Fig. 2 illustrates PROPER's functions, i.e. optimization, comparison, and visualization, applied to independent training and testing sets of data from a study on the prediction of protein sequence crystallizability [4]. In this study, we used a dataset of 5691 protein sequences in the negative set and 4924 protein sequences in the positive set. For each protein sequence, 48 different features were calculated and fed into machine learning methods. These data are available at http://ffas.burnham.org/XtalPred/help.html. After loading the data, optimization of the model's structure, e.g. selection of the ANN learning algorithm, is performed by generating two- and three-dimensional performance curves; similar curves are then generated to compare the performance of the different optimized models. ANN training begins with initial random weights for each feature and, after each iteration, a learning algorithm adjusts these weights to reach the highest level of accuracy. Figure 2(a-d) shows differences in the performance of four standard learning algorithms applied to training the ANN on this dataset. More examples and detailed information about installing PROPER are available in the user manual, which can be downloaded from the distribution directory at SourceForge.