Fig. 3
From: A close look at protein function prediction evaluation protocols

Label distribution comparison between CV, NA and NP. First, we computed the probability (number of annotated proteins / number of all proteins) of GO category i in the training and test sets for all three setups, denoted by \(p_{i}^{\text {tr}}\) and \(p_{i}^{\text {tst}}\), respectively; in the CV setup, the calculation was performed once for each of the five folds and the results were averaged across folds. The discrepancy for category i is then defined as \(|p_{i}^{\text {tr}} - p_{i}^{\text {tst}}| / (p_{i}^{\text {tr}} + p_{i}^{\text {tst}})\). The average discrepancy is shown in the top left panel. The p-values from paired t-tests for CV vs NA and CV vs NP are less than \(10^{-4}\) in all three subontologies for both species. The individual signed discrepancy values (computed without the absolute value) are shown in the other three panels, sorted by magnitude for each setup.
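The discrepancy defined in this caption can be illustrated with a minimal Python sketch. The function names, the toy protein and GO identifiers, and the convention of returning 0 when a category is absent from both sets are assumptions made for illustration; they are not taken from the article.

```python
from typing import Dict, List, Set


def category_probabilities(
    annotations: Dict[str, Set[str]], categories: List[str]
) -> Dict[str, float]:
    """Fraction of proteins annotated with each category.

    `annotations` maps a protein ID to its set of GO categories.
    """
    n_proteins = len(annotations)
    if n_proteins == 0:
        raise ValueError("annotation set is empty")
    return {
        c: sum(1 for terms in annotations.values() if c in terms) / n_proteins
        for c in categories
    }


def signed_discrepancy(p_tr: float, p_tst: float) -> float:
    """(p_tr - p_tst) / (p_tr + p_tst), without the absolute value."""
    total = p_tr + p_tst
    if total == 0:
        # Assumption: a category unseen in both sets has zero discrepancy.
        return 0.0
    return (p_tr - p_tst) / total


def discrepancy(p_tr: float, p_tst: float) -> float:
    """|p_tr - p_tst| / (p_tr + p_tst), as defined in the caption."""
    return abs(signed_discrepancy(p_tr, p_tst))


# Toy data (hypothetical protein IDs and category names).
train = {"P1": {"GO:A", "GO:B"}, "P2": {"GO:A"}, "P3": set(), "P4": {"GO:B"}}
test = {"P5": {"GO:A"}, "P6": set()}
categories = ["GO:A", "GO:B"]

p_tr = category_probabilities(train, categories)
p_tst = category_probabilities(test, categories)

per_category = {c: discrepancy(p_tr[c], p_tst[c]) for c in categories}
average_discrepancy = sum(per_category.values()) / len(per_category)
```

In this toy example, GO:A has the same probability (0.5) in both sets, so its discrepancy is 0, while GO:B appears only in the training set, so its discrepancy is 1; the average is 0.5. For the CV setup, the same calculation would be repeated per fold and the results averaged.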