GRN Test Results

Evaluation of submissions


Participant      Submission Format  S   D   I   M   SER   Recall  Precision  F1    Relax SER  Relax Recall  Relax Precision  Relax F1
U. of Ljubljana  a2                 8   50  6   30  0.73  0.34    0.68       0.45  0.64       0.43          0.86             0.58
K.U.Leuven       sif                15  53  5   20  0.83  0.23    0.50       0.31  0.66       0.40          0.88             0.55
TEES-2.1         a2                 9   59  8   20  0.86  0.23    0.54       0.32  0.76       0.33          0.78             0.46
IRISA-TexMex     a2                 27  25  28  36  0.91  0.41    0.40       0.40  0.60       0.72          0.69             0.70
EVEX             sif                10  67  4   11  0.92  0.13    0.44       0.19  0.81       0.24          0.84             0.37

Legend
  • S: substitutions; the predicted arc type is different from the reference arc type
  • D: deletions; there is no predicted arc corresponding to the reference arc (false negative)
  • I: insertions; there is no reference arc corresponding to the predicted arc (false positive)
  • M: matches; the predicted and reference arcs have the same type
  • SER = (S + D + I) / N, where N is the number of arcs in the reference
  • Recall = M / N
  • Precision = M / P, where P is the number of predicted arcs (P = S + I + M)
  • F1: harmonic mean of Precision and Recall
This is the main evaluation. The primary ranking criterion is the strict SER.
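
The formulas in the legend can be checked against any row of the table. The following is a minimal sketch, not the official scorer. It assumes that N = S + D + M, that is, that every reference arc ends up as a match, a substitution or a deletion; the legend does not state this, but it is consistent with the published counts. The function name `strict_scores` is hypothetical.

```python
def strict_scores(s, d, i, m):
    """Strict scores computed from the error counts, following the legend."""
    n = s + d + m  # assumption: reference arcs are matched, substituted or deleted
    p = s + i + m  # number of predicted arcs, as given in the legend
    ser = (s + d + i) / n
    recall = m / n
    precision = m / p if p else 0.0
    # Harmonic mean of Precision and Recall; defined as 0 when there is no match.
    f1 = 2 * precision * recall / (precision + recall) if m else 0.0
    return {"SER": ser, "Recall": recall, "Precision": precision, "F1": f1}


# U. of Ljubljana row of the table above: S=8, D=50, I=6, M=30
scores = strict_scores(8, 50, 6, 30)
print({name: round(value, 2) for name, value in scores.items()})
```

Rounded to two decimals, the result reproduces the SER, Recall, Precision and F1 columns of that row (0.73, 0.34, 0.68, 0.45).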

Evaluation algorithm
The evaluation algorithm operates on one pair of genes at a time. For each pair, it first maximizes the number of Matches, then the number of Substitutions. For instance, consider a pair of genes for which the reference contains one Inhibition arc and one Transcription arc. The error counts for the following predictions are:
  • Inhibition and Regulation: 1 Match, 1 Substitution
  • Activation and Binding: 2 Substitutions
  • Inhibition, Requirement and Binding: 1 Match, 1 Substitution, 1 Insertion
  • Regulation: 1 Substitution, 1 Deletion
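
The four cases above can be reproduced with a short sketch of the per-pair counting. This is one possible reading of "maximize Matches, then maximize Substitutions", not the official implementation: arcs with identical types are paired first, leftover reference and predicted arcs are then paired as substitutions, and whatever remains is a deletion or an insertion. The function name `count_pair` is hypothetical.

```python
from collections import Counter


def count_pair(reference, predicted):
    """Error counts for one gene pair, given the arc types on each side."""
    ref, pred = Counter(reference), Counter(predicted)
    # Matches: arcs whose type appears on both sides (multiset intersection).
    matches = sum((ref & pred).values())
    ref_left = sum(ref.values()) - matches
    pred_left = sum(pred.values()) - matches
    # Substitutions: pair up as many leftover arcs as possible.
    substitutions = min(ref_left, pred_left)
    return {
        "M": matches,
        "S": substitutions,
        "D": ref_left - substitutions,   # unpaired reference arcs
        "I": pred_left - substitutions,  # unpaired predicted arcs
    }


reference = ["Inhibition", "Transcription"]
for predicted in (
    ["Inhibition", "Regulation"],
    ["Activation", "Binding"],
    ["Inhibition", "Requirement", "Binding"],
    ["Regulation"],
):
    print(predicted, count_pair(reference, predicted))
```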
Relaxed scores
The relaxed scores are computed in the same way, except that Substitutions are counted as Matches. This is an attempt to score the predictions regardless of arc type. However, it does not account for redundant arcs. For interpretation purposes, the network shape evaluation below is more accurate.
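
As an illustration, the relaxed counts can be derived from the strict ones by moving S into M. The sketch below assumes, as the published counts suggest, that the reference size is N = S + D + M and is therefore unchanged by the relaxation. The function name `relaxed_counts` is hypothetical.

```python
def relaxed_counts(s, d, i, m):
    """Count every Substitution as a Match; Deletions and Insertions are unchanged."""
    return 0, d, i, m + s


# IRISA-TexMex row of the first table: S=27, D=25, I=28, M=36
s, d, i, m = relaxed_counts(27, 25, 28, 36)
n = s + d + m  # assumption: number of reference arcs (here 88)
p = s + i + m  # number of predicted arcs
relaxed_ser = (s + d + i) / n
relaxed_recall = m / n
relaxed_precision = m / p
print(round(relaxed_ser, 2), round(relaxed_recall, 2), round(relaxed_precision, 2))
```

Rounded to two decimals, these values agree with the relaxed columns of that row (0.60, 0.72, 0.69).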

Network shape evaluation

Participant      S   D   I   M   SER   Recall  Precision  F1    Relax SER  Relax Recall  Relax Precision  Relax F1
IRISA-TexMex     0   19  22  62  0.51  0.77    0.74       0.75  0.51       0.77          0.74             0.75
U. of Ljubljana  0   44  5   37  0.60  0.46    0.88       0.60  0.60       0.46          0.88             0.60
K.U.Leuven       0   47  5   34  0.64  0.42    0.87       0.57  0.64       0.42          0.87             0.57
TEES-2.1         0   52  8   29  0.74  0.36    0.78       0.49  0.74       0.36          0.78             0.49
EVEX             0   60  4   21  0.79  0.26    0.84       0.40  0.79       0.26          0.84             0.40
Note: the team ranking differs from that of the main evaluation.

This evaluation was carried out as if all arcs in the reference and in the prediction were of type Regulation, and redundant arcs were removed. As expected, Substitutions is always zero and the relaxed scores are identical to the strict scores.
This evaluation measures the accuracy of the prediction regardless of arc type, and does so more accurately than the relaxed scores of the previous evaluation. A good score means that the prediction accurately reproduces the shape of the network. The gap between this evaluation and the previous one indicates how (in)accurate the predicted arc types are.
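
The preprocessing behind this evaluation can be sketched as follows. The representation of an arc as an (agent, target, type) triple, the gene names and the function name `network_shape` are assumptions made for illustration only.

```python
def network_shape(arcs):
    """Relabel every arc as Regulation; the set drops the duplicates this creates."""
    return {(agent, target, "Regulation") for agent, target, _arc_type in arcs}


# Hypothetical arcs: two arcs between the same pair of genes collapse into one.
arcs = [
    ("geneA", "geneB", "Inhibition"),
    ("geneA", "geneB", "Binding"),
    ("geneB", "geneC", "Activation"),
]
shape = network_shape(arcs)
print(sorted(shape))
```

Because only one arc type remains after relabeling, no Substitution can occur, which is why the S column is zero throughout.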

Valued network evaluation

Participant      S   D   I   M   SER   Recall  Precision  F1    Relax SER  Relax Recall  Relax Precision  Relax F1
U. of Ljubljana  10  45  6   27  0.74  0.33    0.63       0.43  0.62       0.45          0.86             0.59
K.U.Leuven       16  47  5   19  0.83  0.23    0.48       0.31  0.63       0.43          0.88             0.57
TEES-2.1         8   53  8   21  0.84  0.26    0.57       0.35  0.74       0.35          0.78             0.49
IRISA-TexMex     27  20  24  35  0.87  0.43    0.41       0.42  0.54       0.76          0.72             0.74
EVEX             10  61  4   11  0.91  0.13    0.44       0.21  0.79       0.26          0.84             0.39

In a similar way to the previous evaluation, arcs of type Binding and Transcription have been turned into Regulation, and redundant arcs have been removed. The network therefore contains only Regulation, Activation, Inhibition and Requirement arcs. In effect, interactions of the mechanism axis have been removed, leaving only interactions of the effect axis. Some Systems Biology applications need only this kind of information. The gap between this evaluation and the first one indicates how (in)accurate the predicted arcs of type Binding and Transcription are.
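
A minimal sketch of this relabeling, under the same illustrative assumptions (arcs as (agent, target, type) triples, hypothetical gene names and function name):

```python
MECHANISM_TYPES = {"Binding", "Transcription"}


def valued_network(arcs):
    """Turn mechanism-axis arcs into Regulation; the set drops redundant arcs."""
    return {
        (agent, target, "Regulation" if arc_type in MECHANISM_TYPES else arc_type)
        for agent, target, arc_type in arcs
    }


# Hypothetical arcs: the Binding arc becomes Regulation and merges with the
# existing Regulation arc, while the Inhibition arc is kept as is.
arcs = [
    ("geneA", "geneB", "Binding"),
    ("geneA", "geneB", "Regulation"),
    ("geneA", "geneB", "Inhibition"),
]
valued = valued_network(arcs)
print(sorted(valued))
```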