Evaluation Files Format
This document define the files format of:
We use the simple text file format specified below to describe problem instances (Markov networks). The format is a generalization of the Ergo file format initially developed by Noetic Systems Inc. for their Ergo software. We use the .uai suffix for the challenge benchmark network files.
A file in the UAI format consists of the following two parts, in that order:
<Preamble> <Function tables>
The contents of each section (denoted <...> above) are described in the following:
Our description of the format will follow a simple Markov network with three variables and two functions. A sample preamble for such a network is:
MARKOV 3 2 2 3 2 2 0 1 2 1 2
The preamble starts with one line denoting the type of network. Generally, this can be either BAYES (if the network is a Bayesian network) or MARKOV (in case of a Markov network). However, note that this year all networks will be given in a Markov networks (i.e. Bayesian networks will be moralized).
The second line contains the number of variables. The next line specifies the cardinalities of each variable, one at a time, separated by a whitespace (note that this implies an order on the variables which will be used throughout the file). The fourth line contains only one integer, denoting the number of cliques in the problem. Then, one clique per line, the scope of each clique is given as follows: The first integer in each line specifies the number of variables in the clique, followed by the actual indexes of the variables. The order of this list is not restricted. Note that the ordering of variables within a factor will follow the order provided here.
Referring to the example above, the first line denotes the Markov network, the second line tells us the problem consists of three variables, let's refer to them as X, Y, and Z. Their cardinalities are 2, 2, and 3 respectively (from the third line). Line four specifies that there are 2 cliques. The first clique is X,Y, while the second clique is Y,Z. Note that variables are indexed starting with 0.
In this section each factor is specified by giving its full table (i.e, specifying value for each assignment). The order of the factor is identical to the one in which they were introduced in the preamble, the first variable have the role of the 'most significant' digit. For each factor table, first the number of entries is given (this should be equal to the product of the domain sizes of the variables in the scope). Then, one by one, separated by whitespace, the values for each assignment to the variables in the function's scope are enumerated. Tuples are implicitly assumed in ascending order, with the last variable in the scope as the 'least significant'. To illustrate, we continue with our Markov network example from above, let's assume the following conditional probability tables:
The correspoding function tables in the file would then look like this:
2 0.436 0.564 4 0.128 0.872 0.920 0.080 6 0.210 0.333 0.457 0.811 0.000 0.189
(Note that line breaks and empty lines are effectively just a whitespace, exactly like plain spaces " ". They are used here to improve readability.)
To sum up, a problem file consists of 2 sections: the preamble and the full the function tables, the names and the labels. For our Markov network example above, the full file will look like:
MARKOV 3 2 2 3 3 1 0 2 0 1 2 1 2 2 0.436 0.564 4 0.128 0.872 0.920 0.080 6 0.210 0.333 0.457 0.811 0.000 0.189
Evidence is specified in a separate file. This file has the same name as the original network file but with an added .evid suffix. For instance, problem.uai will have evidence in problem.uai.evid. The file starts with a line specifying the number of evidences samples. The evidence in each sample, will be written in a new line. Each line will begin with the number of observed variables in the sample, followed by pairs of variable and its observed value. The indexes correspond to the ones implied by the original problem file. If, for our above example, we want to provide a single sample where the variable Y has been observed as having its first value and Z with its second value, the file example.uai.evid would contain the following:
1 2 1 0 2 1
Note that although the format allows for multiple evidence, in the current competition we will either have no evidence or a single evidence line.
The first line must contain only the task solved: PR|MPE|MAR|BEL.
The rest of the file will contain the solution for the task.
Solvers can write more then one solution by writing -BEGIN- at the head of the new solution.
We will only consider the last solution in the file.
PR <SOLUTION> -BEGIN- <SOLUTION>
The first line in each solution will contain the number of evidence samples. This will be the number of lines (not include this line) in the solution part. Hence each line from here will contain the solution with a different sample of evidence - ordered as in the evidence file. If there is no evidence (the first line of the evidence file is 0), the output should include the results for the empty evidence scenario. This is regarded as a single-evidence case - one with the empty evidence.
Solvers that can bound their estimation are encouraged to specify if their solution is lower or upper bound. Doing so by adding at the end of the solution the letters L(for lower bound) or U (for upper bound).
The line format is as follows (according to the task):
Here is a complete example for a solution for the MPE task. The evidence file contains one evidence samples.
MPE 1 4 0 2 0 4 -BEGIN- 1 4 0 2 0 4
If a solver does not produce a solution by the given time, it would be considered as having failed on the instance. This will be treated as equivalent to a naive solution (e.g. bit-wise singleton clique maximum for a MAP problem).
Some example problems and outputs are available here.
| Contact admin |
2010 (C) The Hebrew University of Jerusalem, All rights reserved