
The evaluation process

New to this year's competition, participants in the evaluation will receive performance reports and will be able to submit improvements to their solvers during the entire competition period. A leader board will be continuously updated. The criteria for the evaluation of the different tasks are as follows:

  • PR: The score for the partition function is evaluated only on models for which we were able to obtain exact answers (given extensive time and memory resources). Denote the exact partition function by Z* and the approximate one by Z_s. The score will be |log(Z*/Z_s)|.
  • MPE: The performance of the most probable explanation estimate will be computed relative to the performance of the other competitors, a simple asynchronous belief propagation (BP) baseline, and a default result. Thus we will also evaluate this task on models where the MPE cannot be computed exactly. The default result is the assignment that maximizes only the single-variable factors (taking the first value for any variable that has no such factor). The score will be calculated as follows. Denote the energy of a solution x by E(x) = - ∑_a log f_a(X_a = x_a), so that a lower energy corresponds to a more probable assignment. We denote the best result by x* = arg min_{s ∈ S} E(x_s), where S denotes the set of all solvers. We denote the standard BP result by x_bp and the default result by x_def. Solvers' scores will be relative to the BP or the default solution. The score will be:

  • MAR: The score for the marginals, evaluated only on models for which we were able to obtain exact answers, will be calculated as follows. Denote the exact marginal probability that the i-th variable takes the value x_i by P*(X_i = x_i). In the same way, the solver's marginal will be denoted by P_s(X_i = x_i). The score will be:

  • BEL: The score for the learning task will be calculated only for models whose exact marginals we could compute in a relatively short time. The marginals of the learned network will be computed exactly and compared with the true marginals of the model. The score will be calculated as in the MAR task.
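
The quantities that the criteria above define in full (the PR score, the energy of an assignment, and the default assignment) can be sketched in a few lines of Python. This is only an illustration, not the organizers' evaluation code: the function names `pr_score`, `energy` and `default_assignment`, and the representation of a factor as a `(scope, table)` pair, are assumptions made for this example. It also assumes that "the first value" of a variable is value 0, and that the partition functions are supplied as natural logarithms.

```python
import math


def pr_score(log_z_exact, log_z_approx):
    """PR score |log(Z*/Z_s)|, computed from log Z* and log Z_s."""
    return abs(log_z_exact - log_z_approx)


def energy(factors, assignment):
    """Energy of an assignment: minus the sum of the log factor values.

    `factors` is a list of (scope, table) pairs (a hypothetical layout):
    scope is a tuple of variable indices, and table maps a tuple of values
    for those variables to a non-negative factor value.
    """
    total = 0.0
    for scope, table in factors:
        value = table[tuple(assignment[v] for v in scope)]
        if value == 0.0:
            return math.inf  # a zero factor gives infinite energy
        total -= math.log(value)
    return total


def default_assignment(num_vars, factors):
    """Maximize the single-variable factors only; use value 0 otherwise."""
    unary = {}  # variable -> {value: product of its single-variable factors}
    for scope, table in factors:
        if len(scope) == 1:
            scores = unary.setdefault(scope[0], {})
            for (val,), f in table.items():
                scores[val] = scores.get(val, 1.0) * f
    return [
        max(unary[v], key=unary[v].get) if v in unary else 0
        for v in range(num_vars)
    ]


# Toy model: two binary variables, one unary factor and one pairwise factor.
toy_factors = [
    ((0,), {(0,): 0.2, (1,): 0.8}),
    ((0, 1), {(0, 0): 0.5, (0, 1): 0.1, (1, 0): 0.3, (1, 1): 0.6}),
]
x_def = default_assignment(2, toy_factors)
e_def = energy(toy_factors, x_def)
```

On the toy model, variable 0 has a single-variable factor that prefers value 1, and variable 1 has none, so it falls back to value 0. Working from log Z rather than Z in `pr_score` is a design choice made here to avoid overflow on large models.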






2010 (C) The Hebrew University of Jerusalem, All rights reserved