Programming Exercises

PGM 2004/05 - Programming Ex1

Hidden Markov Models (HMMs)

Submission: 24/11 at Ross closing time

Don't forget to read the exercise guidelines

In this exercise you will build a system that, given the structure and parameters of an HMM, finds the likelihood of an observation and the most probable explanation for it. Your system will also be able to learn or adapt the parameters of an HMM given training sequences/instances.
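
Throughout, A, B and Q denote the HMM transition matrix, emission matrix and initial state distribution. The exact representation is given with the course material; purely as a working convention for the sketches below (an assumption, not part of the handout), a toy model might look like this in MATLAB:

    % Hypothetical toy HMM with N = 2 hidden states and M = 3 observation symbols.
    % Assumed conventions (check the course files for the actual format):
    %   A(i,j) = P(x(t+1) = j | x(t) = i)   -- N x N, rows sum to 1
    %   B(i,k) = P(y(t)   = k | x(t) = i)   -- N x M, rows sum to 1
    %   Q(i)   = P(x(1)   = i)              -- 1 x N, sums to 1
    A = [0.9 0.1;
         0.2 0.8];
    B = [0.7 0.2 0.1;
         0.1 0.3 0.6];
    Q = [0.5 0.5];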

Part 1: Tasks

Implement the functions described in Part 2 and use them to
  1. Use the HMM defined by A, B and Q to
    1. Generate 100 samples of length 20 (see the sampling sketch after this list)
    2. Use EM to estimate the parameters of the model starting with the true model (don't forget to accumulate counts over all instances and not instance by instance - this is a direct consequence of the independence of instances assumption).
    3. Plot the log-likelihood of the data (sum of log-likelihood of each instance) as a function of iteration.
    4. How do your estimated parameters compare to the true ones?
      What happens when you increase the number of samples?
      What happens when you increase the length of each sample?
    5. Repeat parameter estimation starting with 10 random starting points.
      What can you say about EM as a learning algorithm?

  2. For each of the samples you generated in the previous section, use the estimated parameters to perform "decoding" in two ways: (a) find the MAP assignment using the Viterbi algorithm, (b) for each time step separately, find the state that maximizes the posterior probability. For each of these decoding methods calculate the "bit error rate" (P(x(t) ~= x*(t))), that is, the probability over all instances and time slices that the predicted state does not match the true state. Compare this to the "word error rate" (P(x ~= x*)), the probability that any time slice in an instance is predicted with an error (see the error-rate sketch after the note below). Which method performs better under each error measure? What happens when you increase the length of the samples? Note: although some numbers may appear small (on the order of 10e-10), all examples have been checked and there should be no problem with scaling. That said, you should do as many calculations as possible in log space to make your code scalable to other problems.
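
As a concrete starting point for task 1.1, here is a minimal sampling sketch under the toy conventions assumed above; SampleHMM and SampleDiscrete are hypothetical helper names, not part of the required interface in Part 2:

    function [X, Y] = SampleHMM(A, B, Q, T)
    % Draw one hidden state sequence X and one observation sequence Y of
    % length T from the HMM (A, B, Q), under the conventions assumed above.
    X = zeros(1, T);
    Y = zeros(1, T);
    X(1) = SampleDiscrete(Q);                  % initial state from Q
    Y(1) = SampleDiscrete(B(X(1), :));         % emission from that state
    for t = 2:T
        X(t) = SampleDiscrete(A(X(t-1), :));   % transition
        Y(t) = SampleDiscrete(B(X(t), :));     % emission
    end
    end

    function k = SampleDiscrete(p)
    % Draw one index k with probability p(k).
    k = find(rand <= cumsum(p), 1, 'first');
    end

Calling SampleHMM 100 times with T = 20 produces the data set for task 1.1; keep the hidden sequences X as well, since they are needed to measure the error rates in task 2.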

Important: Don't forget to discuss all of the experiments in the EVALUATION file. You are expected to show insight and understanding of what is going on and not merely report results.
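
For the comparison in task 2, the two error measures can be computed directly from the matrices of true and decoded state sequences. A short sketch, with Xtrue and Xhat as hypothetical variable names (one instance per row, one column per time slice):

    % Bit and word error rates for task 2 (hypothetical variable names):
    %   Xtrue - numInstances x T matrix of true hidden states (kept from sampling)
    %   Xhat  - numInstances x T matrix of decoded states (Viterbi or per-slice MAP)
    errors = (Xhat ~= Xtrue);                 % logical matrix of per-slice mistakes
    bitErrorRate  = mean(errors(:));          % P( x(t) ~= x*(t) ) over all slices
    wordErrorRate = mean(any(errors, 2));     % P( x ~= x* ), any slice wrong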

Part 2: Code

For this section you will use:

You should implement the following:

  1. L = Likelihood(A,B,Q,Y)
     This function should return the likelihood of the observation sequence Y using the HMM model defined by A, B and Q (see the forward-recursion sketch after this list).

  2. [P,X] = MAP(A,B,Q,Y)
     This function should return the most probable a-posteriori assignment given the instance of observables Y as a vector X with the MAP state assignment (numbered from 1 to N) for each time-slice. It should also return the probability of the MAP assignment in P.

  3. [NewA,NewB,NewQ] = TrainHMM(A,B,Q,I)
     This function should start with the parameters in A, B and Q and return the three re-estimated parameter matrices after EM training based on I (a matrix of instance vectors); an EM skeleton is sketched further below.
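
All three functions can be built on the forward-backward recursions. As an illustration of the log-space style recommended in the scaling note of Part 1, a possible forward pass for Likelihood is sketched below; this is a sketch under the assumed conventions A(i,j) = P(x(t+1)=j | x(t)=i), B(i,k) = P(y(t)=k | x(t)=i), Q(i) = P(x(1)=i), not a reference implementation:

    function L = Likelihood(A, B, Q, Y)
    % Likelihood of the observation sequence Y under the HMM (A, B, Q).
    % The forward recursion is carried out in log space for numerical safety;
    % the final exponentiation relies on the handout's note that the values
    % do not underflow for these examples.
    T = numel(Y);
    logA = log(A);
    logAlpha = log(Q(:)) + log(B(:, Y(1)));            % log alpha_1(i)
    for t = 2:T
        % log alpha_t(j) = logsumexp_i[ log alpha_{t-1}(i) + log A(i,j) ] + log B(j, y_t)
        logAlpha = LogSumExp(bsxfun(@plus, logAlpha, logA), 1)' + log(B(:, Y(t)));
    end
    L = exp(LogSumExp(logAlpha, 1));                   % P(Y)
    end

    function s = LogSumExp(M, dim)
    % Numerically stable log(sum(exp(M), dim)).
    m = max(M, [], dim);
    s = m + log(sum(exp(bsxfun(@minus, M, m)), dim));
    end

The same LogSumExp helper keeps the Viterbi recursion and the EM posteriors numerically stable if you choose to work in log space there as well.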

Also, submit any other code/functions you used for this exercise.
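
To make the accumulate-counts-over-all-instances point from Part 1 concrete, here is a skeleton of how one EM (Baum-Welch) pass in TrainHMM might be organised. ForwardBackward is a hypothetical helper (not part of the required interface) assumed to return the posteriors gamma(i,t) = P(x(t)=i | Y) and xi(i,j,t) = P(x(t)=i, x(t+1)=j | Y):

    function [NewA, NewB, NewQ] = TrainHMM(A, B, Q, I)
    % Skeleton of ONE EM (Baum-Welch) pass: an E-step over every instance that
    % accumulates expected counts, followed by a single M-step.  In the real
    % function this pass would be repeated until convergence.
    numInstances = size(I, 1);             % one observation sequence per row of I
    [N, M] = size(B);
    countA = zeros(N, N);                  % expected transition counts
    countB = zeros(N, M);                  % expected emission counts
    countQ = zeros(1, N);                  % expected initial-state counts

    % E-step: accumulate expected counts over ALL instances before normalizing.
    for n = 1:numInstances
        Y = I(n, :);
        [gamma, xi] = ForwardBackward(A, B, Q, Y);   % hypothetical helper (see above)
        countQ = countQ + gamma(:, 1)';
        countA = countA + sum(xi, 3);
        for k = 1:M
            countB(:, k) = countB(:, k) + sum(gamma(:, Y == k), 2);
        end
    end

    % M-step: normalize the accumulated counts into the new parameters.
    NewA = bsxfun(@rdivide, countA, sum(countA, 2));
    NewB = bsxfun(@rdivide, countB, sum(countB, 2));
    NewQ = countQ / sum(countQ);
    end

For the plot in task 1.3, record the sum of the per-instance log-likelihoods after each such pass and repeat until it stops improving.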

Submission

You should tar only your .m files along with the EVALUATION file and figure files (in EPS or PS format) and submit it through the submission link on the course home page. Do not forget to register for the course first through the register link. DO NOT gzip your tar file.

Please make sure the .m file names match the function names and are EXACTLY as specified (this part will be checked automatically). Also, make sure your login appears at the top of each file in a MATLAB comment.