Latent Dirichlet Allocation
Mostly based on Murphy's book, with unified notation.
Generative model
\[\begin{aligned}
\boldsymbol{\pi}_{i}|\alpha &\sim \mathrm{Dir}(\alpha\mathbf{1}_{K})\\
q_{il}|\boldsymbol{\pi}_{i} &\sim \mathrm{Cat}(\boldsymbol{\pi}_{i})\\
\boldsymbol{b}_{k}|\gamma &\sim \mathrm{Dir}(\gamma \mathbf{1}_{V})\\
y_{il}|q_{il}=k,\boldsymbol{b}_{k} &\sim \mathrm{Cat}(\boldsymbol{b}_{k})
\end{aligned}
\]
- \(K\): total number of topics
- \(\boldsymbol{\pi}_{i}\): topic distribution of document \(i\)
- \(q_{il}\): topic of \(l\)th word in document \(i\)
- \(V\): vocabulary size (total number of distinct words)
- \(y_{il}\): \(l\)th word in document \(i\)
- \(\boldsymbol{b}_{k}\): \(k\)th topic's word distribution
- \(L_{i}\): length of document \(i\)
- \(N\): total number of documents
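The generative model can be read as a sampling procedure. Below is a minimal NumPy sketch of it; the function name `generate_corpus`, the argument `lengths` (the list of \(L_{i}\)), and the toy values are illustrative assumptions, not part of Murphy's presentation.

```python
import numpy as np

def generate_corpus(N, K, V, lengths, alpha, gamma, seed=0):
    """Sample a synthetic corpus from the LDA generative model (illustrative)."""
    rng = np.random.default_rng(seed)
    # b_k ~ Dir(gamma * 1_V): one word distribution per topic, shape (K, V)
    b = rng.dirichlet(np.full(V, gamma), size=K)
    # pi_i ~ Dir(alpha * 1_K): one topic distribution per document, shape (N, K)
    pi = rng.dirichlet(np.full(K, alpha), size=N)
    q, y = [], []
    for i in range(N):
        # q_il ~ Cat(pi_i): topic of each word in document i
        q_i = rng.choice(K, size=lengths[i], p=pi[i])
        # y_il | q_il = k ~ Cat(b_k): word drawn from its topic's distribution
        y_i = np.array([rng.choice(V, p=b[k]) for k in q_i], dtype=int)
        q.append(q_i)
        y.append(y_i)
    return pi, b, q, y

# Toy sizes chosen only for illustration.
pi, b, q, y = generate_corpus(N=5, K=3, V=20, lengths=[8, 10, 12, 9, 11],
                              alpha=0.5, gamma=0.5)
```

Each `y[i]` is an array of word ids in \(\{0,\ldots,V-1\}\) and each `q[i]` holds the matching topic ids, so the indices are zero-based rather than one-based as in the formulas.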
Gibbs sampling
\[\begin{aligned}
p(q_{il}=k|\cdot)&\propto \exp[\log\pi_{ik}+\log b_{k,y_{il}}]\\
p(\boldsymbol{\pi}_{i}|\cdot)&=\mathrm{Dir}\Big(\alpha+\sum_{l}\mathbf{I}(q_{il}=1),\ldots,\alpha+\sum_{l}\mathbf{I}(q_{il}=K)\Big)\\
p(\boldsymbol{b}_{k}|\cdot)&=\mathrm{Dir}\Big(\gamma+\sum_{i}\sum_{l}\mathbf{I}(y_{il}=1,q_{il}=k),\ldots,\gamma+\sum_{i}\sum_{l}\mathbf{I}(y_{il}=V,q_{il}=k)\Big)
\end{aligned}
\]
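The Dirichlet distribution is the conjugate prior of the categorical distribution, so the conditionals of \(\boldsymbol{\pi}_{i}\) and \(\boldsymbol{b}_{k}\) are again Dirichlet, with the observed counts added to the prior parameters.
The first conditional follows from lines 2 and 4 of the generative model: it is proportional to \(p(q_{il}=k|\boldsymbol{\pi}_{i})\,p(y_{il}|q_{il}=k,\boldsymbol{b}_{k})\).
The second follows from lines 1 and 2 of the generative model.
The third follows from lines 3 and 4 of the generative model.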
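One sweep over the three conditionals can be sketched as follows. This is a minimal NumPy illustration, not a tuned implementation; the function name `gibbs_sweep`, the update order (each document's \(q_{il}\) and \(\boldsymbol{\pi}_{i}\) first, then all \(\boldsymbol{b}_{k}\)), and the toy data are assumptions made for the example.

```python
import numpy as np

def gibbs_sweep(y, q, pi, b, alpha, gamma, rng):
    """One sweep of the uncollapsed Gibbs sampler; updates q, pi, b in place."""
    N, K = pi.shape
    V = b.shape[1]
    word_topic = np.zeros((K, V))  # counts of (topic k, word v) assignments
    for i in range(N):
        # 1) q_il | . : weights exp(log pi_ik + log b_{k, y_il}), shape (K, L_i)
        log_w = np.log(pi[i])[:, None] + np.log(b[:, y[i]])
        log_w -= log_w.max(axis=0)      # stabilise before exponentiating
        w = np.exp(log_w)
        w /= w.sum(axis=0)              # normalise each word's column
        for l in range(len(y[i])):
            q[i][l] = rng.choice(K, p=w[:, l])
        # 2) pi_i | . ~ Dir(alpha + number of words of doc i in each topic)
        pi[i] = rng.dirichlet(alpha + np.bincount(q[i], minlength=K))
        np.add.at(word_topic, (q[i], y[i]), 1)
    # 3) b_k | . ~ Dir(gamma + number of times each word is assigned to topic k)
    for k in range(K):
        b[k] = rng.dirichlet(gamma + word_topic[k])

# Toy corpus (word ids in 0..V-1), chosen only for illustration.
rng = np.random.default_rng(0)
y = [np.array([0, 1, 2, 1]), np.array([3, 4, 3, 5, 4])]
K, V, alpha, gamma = 2, 6, 0.5, 0.5
q = [rng.integers(K, size=len(d)) for d in y]
pi = rng.dirichlet(np.full(K, alpha), size=len(y))
b = rng.dirichlet(np.full(V, gamma), size=K)
for _ in range(50):
    gibbs_sweep(y, q, pi, b, alpha, gamma, rng)
```

Working in log space mirrors the \(\exp[\log\cdot+\log\cdot]\) form of the first conditional and avoids underflow when the probabilities are small.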
Collapsed Gibbs sampling
\[\begin{aligned}
p(q_{il}=k|\cdot)&\propto \frac{c^{-}_{vk}+\gamma}{c^{-}_{k}+V\gamma} \frac{c^{-}_{ik}+\alpha}{L_{i}+K\alpha}\\
\end{aligned}
\]
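Here \(v=y_{il}\), \(c_{vk}\) is the number of times word \(v\) is assigned to topic \(k\) across the corpus, \(c_{k}=\sum_{v}c_{vk}\), \(c_{ik}\) is the number of words in document \(i\) assigned to topic \(k\), and the superscript \(-\) means that the current word \((i,l)\) is excluded from the counts. The formula follows from the 3 conditional distributions in the Gibbs sampling section: integrating out \(\boldsymbol{\pi}_{i}\) and \(\boldsymbol{b}_{k}\) replaces \(\pi_{ik}\) and \(b_{kv}\) in the first conditional with their posterior means, which are given by a property of the Dirichlet distribution, i.e. for \(X\sim \mathrm{Dir}(\alpha_{1},\ldots,\alpha_{K})\),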
\[E[X_{i}]=\frac{\alpha_{i}}{\sum_{k=1}^{K}\alpha_{k}}
\]
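The collapsed update only needs the count arrays, which the following NumPy sketch maintains incrementally. The function name `collapsed_gibbs`, the random initialisation, and the toy corpus are assumptions made for illustration. The factor \(1/(L_{i}+K\alpha)\) does not depend on \(k\), so the sketch drops it before normalising.

```python
import numpy as np

def collapsed_gibbs(y, K, V, alpha, gamma, n_sweeps=100, seed=0):
    """Collapsed Gibbs sampling for LDA on a list of word-id arrays (illustrative)."""
    rng = np.random.default_rng(seed)
    N = len(y)
    q = [rng.integers(K, size=len(d)) for d in y]  # random initial topics
    c_vk = np.zeros((V, K))   # word v assigned to topic k, over the corpus
    c_ik = np.zeros((N, K))   # words of document i assigned to topic k
    for i in range(N):
        np.add.at(c_vk, (y[i], q[i]), 1)
        c_ik[i] = np.bincount(q[i], minlength=K)
    c_k = c_vk.sum(axis=0)    # total words assigned to topic k
    for _ in range(n_sweeps):
        for i in range(N):
            for l in range(len(y[i])):
                v, k_old = y[i][l], q[i][l]
                # Remove the current word to obtain the "minus" counts.
                c_vk[v, k_old] -= 1
                c_ik[i, k_old] -= 1
                c_k[k_old] -= 1
                # Unnormalised conditional; the k-independent denominator is dropped.
                w = (c_vk[v] + gamma) / (c_k + V * gamma) * (c_ik[i] + alpha)
                k_new = rng.choice(K, p=w / w.sum())
                # Add the word back under its newly sampled topic.
                q[i][l] = k_new
                c_vk[v, k_new] += 1
                c_ik[i, k_new] += 1
                c_k[k_new] += 1
    return q, c_vk, c_ik

# Toy corpus (word ids in 0..V-1), chosen only for illustration.
y = [np.array([0, 1, 2, 1]), np.array([3, 4, 3, 5, 4])]
K, V, alpha, gamma = 2, 6, 0.5, 0.5
q, c_vk, c_ik = collapsed_gibbs(y, K, V, alpha, gamma, n_sweeps=50)

# Point estimates from the final counts, using the Dirichlet mean formula.
pi_hat = (c_ik + alpha) / (c_ik.sum(axis=1, keepdims=True) + K * alpha)
b_hat = (c_vk + gamma) / (c_vk.sum(axis=0) + V * gamma)   # shape (V, K)
```

Since \(\boldsymbol{\pi}_{i}\) and \(\boldsymbol{b}_{k}\) are never sampled, estimates of them are recovered afterwards from the counts, as in the last two lines.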