Inference in Graphical Model
When we talk about inference in the model, means we get a graphical model at hand and want to know some kind of probability.
Three different kind of inference problem:
1. Marginal Query:
2. Conditional Query:
3. MAP Query:
For 1,2 we will use same algorithm, for 3 we need a different algorithm.
When we perform inference, we need to decide is exact inference or approximate inference. Variable elimination, belief propagation for exact inference, MCMC and variational inference for approximate inference. We will talk about them in detail.
Variable Elimination
Variable elimination works in the following way, we need to sum out variable
One important question would be this: we get some observation variable
Variable elimination can be used to solve marginal probability and conditional probability. But not MAP inference, MAP inference will be solved by another method: max-product. The formula would be
Variable elimination works by eliminating one variable at one step. When we want to eliminate one variable, we need to combine all the factors realted to variable into one new factor without target variable. From this step, we know the complexity of variable elimination: The largest intermediate factor during the elimination process.
Belief Propagation
Another algorithm for inference is belief propagation. Variable elimination is suitable for BN & MN and for any graph structure. But Belief propagation called exact inference only for some kind of graph structure, i.e. tree structure. When belief propagation works on graph with loop, it’s called loopy belief propagation and the result is not exact, depend on specific structure and other details. So we will first introduce how to run belief propagation on tree structure graph.
Belief propagation is also one kind of message passing algorithm for inference, sometimes also called sum-product algorithm.
If the graphical model are not exactly tree form, we may need to convert this graphical model into Junction Tree or Clique Tree or Join Tree, a form we can run inference easier.
Get Clique Tree
Since we need to run BP on a clique tree, we need to know how to get the clique tree from directed or un-directed graphical model. We get two different ways to perform this task.
1. Run variable elimination algorithm and generate the clique tree based on the intermediate clique generated.
2. Triangularize the graph and construct clique tree accordingly.
Clique Tree, Property and other stuffs
When we talk about clique tree, junction tree, join tree and factor graph, we should know that this is not a new type of graphical model: Directed and Undirected graphical model is. But all these are different representation of graphical model, it’s used for inference. But we might want to know why we want to use clique tree, what’s the property of it and what makes it a clique tree.
We know the potential function is the building block for probability density function in the graphical model. There are a related concept: Cluster Graph. Cluster graph is a graph whose vertex is Cluster, a Cluster is composed of multiple random variables and has following properties:
1. Family preservation
For each potential function
One important property make a cluster graph a clique tree: RIP, Running Intersection Property. RIP means if two cluster
Generate Clique Tree
In fact, we can construct cluster graph and make it has RIP, then we can use it for inference. But we can get it in a easier way.
1. Run Variable Elinination
each intermediate clique as a cluster.
2. Triangularization the graph
3. Use Bethe cluster graph (or Factor graph, i think this is another name)
In this cluster graph, each potential function and random variable is a cluster. If one random variable belong to a specific potential function, then an edge will connect two clusters.
####Run Belief Propagation
We need to define message when running belief propagation. Message is a probability density function over a set of random variables. When we get a clique tree, we can define message between different tree node.
We define the sepset (separator set) as the intersection between tow different tree node. Then a message is in fact a marginal probability distribution over the sepset.
Since the cluster is a set of random variables
where
From now, the belief propagation (sum-product) algorithm is clear:
1. Initialize the cluster potential and message.
2. calculate the message, which is just get all the information from neighbors and send own belief on the sepset.
Comments
Variable elimination and belief propagation is very important and fundamental methods for graphical model inference, but not very helpful. Because most of the graphical model are not tree form.
with StackEdit.